Hi, I’m developing an ANTLR4 grammar for PowerShell and am looking for the language specification for PowerShell 5 or later.
At the moment the grammar is based on Microsoft’s PowerShell 3 specification document (https://www.microsoft.com/en-us/download/details.aspx?id=36389).
If no such specification exists, I would be most thankful for:
A) Any specification, grammar, parsing rules, or list of language changes from version 3 onward.
B) A set of PowerShell 5 or 6 samples, so I can derive the rules from them.
The PowerShell specification was never updated after V3, so your best resource is the actual PowerShell source (which is arguably a better resource anyway, given that there is just one implementation of PowerShell).
Most of the grammar in the language specification is extracted from the PowerShell source code. You can extract the grammar from the source by searching for comments starting with
//G, e.g. https://github.com/PowerShell/PowerShell/blob/037e12eddc850393d2538a26abad4a67e048a989/src/System.Management.Automation/engine/parser/Parser.cs#L766
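If it helps, here is a minimal sketch of that extraction in Python. It assumes the grammar comments are whole-line comments of the form `//G <text>`; the function name `extract_grammar` and the sample lines in `SAMPLE` are made up for illustration and are not copied from Parser.cs.

```python
import re
from typing import Iterable, List

# Whole-line comments that start with "//G" followed by whitespace (or nothing).
# This shape is an assumption based on the description above.
G_COMMENT = re.compile(r"^\s*//G(\s.*)?$")


def extract_grammar(lines: Iterable[str]) -> List[str]:
    """Return the text after each //G marker, keeping its indentation."""
    grammar = []
    for line in lines:
        match = G_COMMENT.match(line)
        if match:
            grammar.append((match.group(1) or "").rstrip())
    return grammar


# Hypothetical input, only to show the behaviour; not real Parser.cs content.
SAMPLE = """\
        //G  statement-list:
        //G      statement
        // an ordinary comment, which is skipped
        private StatementAst StatementRule()
"""

for rule_line in extract_grammar(SAMPLE.splitlines()):
    print(rule_line)
```

To run it against the real file, pass the lines of a local copy of Parser.cs to `extract_grammar` instead of `SAMPLE`. The indentation inside the comments is kept because it appears to separate rule names from their alternatives.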
Note that this grammar does not cover how to tokenize. Tokenization is context sensitive and cannot be captured cleanly in any sort of BNF. The language specification was a best effort, but it is probably insufficient to build a 100% compatible PowerShell parser.
Tokenization of PowerShell is surprisingly hard in some places; hopefully having access to the source helps.
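To make the context sensitivity concrete: as far as I understand the documented parsing modes, `2+2` on its own is an expression, while in `Write-Output 2+2` the same characters are a single bare-word argument. The toy Python sketch below only illustrates that one idea. It is not how the PowerShell tokenizer works, and the `tokenize` function and its `mode` parameter are hypothetical names.

```python
import re
from typing import List

# Toy token pattern for expression mode: integers and four arithmetic operators.
EXPRESSION_TOKEN = re.compile(r"\d+|[-+*/]")


def tokenize(text: str, mode: str) -> List[str]:
    """Split the same characters differently depending on the parsing mode."""
    if mode == "expression":
        # Operators are tokens of their own: "2+2" becomes three tokens.
        return EXPRESSION_TOKEN.findall(text)
    if mode == "argument":
        # Bare words run until whitespace: "2+2" stays a single token.
        return text.split()
    raise ValueError(f"unknown mode: {mode}")


print(tokenize("2+2", "expression"))
print(tokenize("2+2", "argument"))
```

The point for a grammar author is that the lexer needs to know which mode the parser is in before it can decide where a token ends, and a plain BNF has no way to say that.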