PowerShell, kind of set intersection built-in?

Question:

For a game where I need to find anagrams from a bunch of loose letters, I ended up implementing a permutation algorithm to find all possible anagrams and, if needed, filter them by known letter positions (-match is great, by the way). But for longer words this proved very error-prone, as skimming a large list of gibberish doesn’t really reveal the proper words hidden within.

So I thought that if I had a large list of English words (should be obtainable somewhere) I could just intersect my list of permutations with the list of proper words and get (hopefully) all real words from the permutation list.

Since many operators in PS work differently with collections, I thought I could just apply a single operator to the two lists and get the intersection back. Unfortunately it’s not that easy. Another option I have thought of would be to iterate over one list and do a -contains for each item.
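The iterate-and-test idea can be sketched roughly as follows. The sample contents of $permutations and $wordlist are made up for illustration; in the question they would come from the permutation algorithm and from gc wordlist.txt.

```powershell
# Hypothetical sample data standing in for the real lists.
$permutations = 'tac', 'act', 'cat', 'tca'
$wordlist     = 'act', 'cat', 'dog'

# Keep each permutation that also appears in the word list.
# -contains scans $wordlist from the start for every item,
# so the total cost grows with (permutations x words).
$realWords = @($permutations | Where-Object { $wordlist -contains $_ })

$realWords
```

Note that -contains compares strings case-insensitively by default.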

This would probably work, but I think it is also very slow (especially when $wordlist is the result of gc wordlist.txt). Or I could build a gigantic regular expression.
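One way such a regular expression could be built is sketched below. This is only an assumption about what the pattern might look like: every word is escaped, the words are joined with |, and the whole alternation is anchored so that only complete words match. The sample data is hypothetical.

```powershell
# Hypothetical sample data standing in for the real lists.
$permutations = 'tac', 'act', 'cat', 'tca'
$wordlist     = 'act', 'cat', 'dog'

# Escape each word so regex metacharacters are taken literally,
# then join everything into one anchored alternation.
$escaped = $wordlist | ForEach-Object { [regex]::Escape($_) }
$pattern = '^(?:' + ($escaped -join '|') + ')$'

# With a collection on the left-hand side, -match returns the
# elements that match instead of a single boolean.
$realWords = @($permutations -match $pattern)

$realWords
```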

But that would probably not be very fast either. I could maybe also use findstr with the gigantic regex above, but that just feels wrong.

Are there any built-in solutions I could use that are better than my attempts so far? Otherwise I’d probably put the word list into a hashtable and use the iterative -contains approach, which should be fast enough then.
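The hashtable fallback might look something like this. It is a sketch with hypothetical sample data, and it uses the hashtable's ContainsKey method for the lookup rather than the -contains operator, since the words are stored as keys.

```powershell
# Hypothetical sample data standing in for the real lists.
$permutations = 'tac', 'act', 'cat', 'tca'
$wordlist     = 'act', 'cat', 'dog'

# Build the lookup table once; the words are the keys and the
# values are unused placeholders.
$lookup = @{}
foreach ($word in $wordlist) { $lookup[$word] = $true }

# Each membership test is now a hash lookup instead of a linear scan.
$realWords = @($permutations | Where-Object { $lookup.ContainsKey($_) })

$realWords
```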

Answer:


(borrowing New-HashSet from Josh Einstein)
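A minimal sketch of the HashSet approach follows. It creates the set with New-Object directly instead of the New-HashSet helper mentioned above, uses made-up sample data, and assumes PowerShell 3.0 or later so that the HashSet type is available without loading an assembly first.

```powershell
# Hypothetical sample data standing in for the real lists.
$permutations = 'tac', 'act', 'cat', 'tca'
$wordlist     = 'act', 'cat', 'dog'

# Create a HashSet[string] and fill it with the permutations.
# Add() returns a boolean, which is discarded here.
$set = New-Object 'System.Collections.Generic.HashSet[string]'
foreach ($p in $permutations) { [void]$set.Add($p) }

# IntersectWith keeps only the elements that also occur in the
# argument. It modifies $set in place and returns nothing.
$set.IntersectWith([string[]]$wordlist)

$set
```

With the default comparer, HashSet[string] compares strings case-sensitively, unlike the -contains operator.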

Warning: those methods on HashSet are in-place algorithms that modify the original collection. If you want a functional-style transform on immutable objects, you’ll need to bring LINQ to the party.
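A non-mutating variant using LINQ could be sketched like this. It assumes PowerShell 3.0 or later, where the generic type argument of the static method is inferred from the typed arrays and no reflection is needed. The sample data is hypothetical.

```powershell
# Hypothetical sample data, typed as string[] so that the generic
# type argument of Intersect can be inferred.
[string[]]$permutations = 'tac', 'act', 'cat', 'tca'
[string[]]$wordlist     = 'act', 'cat', 'dog'

# Enumerable.Intersect returns a lazy sequence of the distinct
# elements of the first argument that also occur in the second.
# Wrapping it in @() enumerates it into an array. Neither input
# array is modified.
$realWords = @([System.Linq.Enumerable]::Intersect($permutations, $wordlist))

$realWords
```

Like HashSet, Intersect uses a case-sensitive comparison for strings unless a comparer is supplied.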

Clearly, someone needs to wrap this static-generic-reflection crap into a friendly cmdlet! Don’t worry, I’m working on it…

Source:

Powershell, kind of set intersection built-in? Licensed under CC BY-SA.
