class Grammar is Match {}
Every type declared with grammar, and not explicitly stating its superclass, becomes a subclass of Grammar.
grammar Identifier {
    token TOP       { <initial> <rest>* }
    token initial   { <+myletter +[_]> }
    token rest      { <+myletter +mynumber +[_]> }
    token myletter  { <[A..Za..z]> }
    token mynumber  { <[0..9]> }
}

say Identifier.isa(Grammar);               # OUTPUT: «True»
my $match = Identifier.parse('W4anD0eR96');
say ~$match;                               # OUTPUT: «W4anD0eR96»
More documentation on grammars is available.
Methods
method parse
method parse($target, :$rule = 'TOP', Capture() :$args = \(), Mu :$actions = Mu, *%opt)
Parses the $target, which will be coerced to Str if it isn't one, using $rule as the starting rule. Additional $args will be passed to the starting rule if provided.
grammar RepeatChar {
    token start($character) { $character+ }
}

say RepeatChar.parse('aaaaaa', :rule('start'), :args(\('a')));
say RepeatChar.parse('bbbbbb', :rule('start'), :args(\('b')));

# OUTPUT:
# 「aaaaaa」
# 「bbbbbb」
If the actions named argument is provided, it will be used as an actions object. For each successful regex match, a method with the same name as the regex is called on the actions object, if such a method exists, with the match object as the sole positional argument.
my $actions = class { method TOP($/) { say "7" } };
grammar { token TOP { a { say "42" } b } }.parse('ab', :$actions);

# OUTPUT:
# 42
# 7
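Action methods are often used to attach a computed value to each match instead of printing. The following is a minimal sketch of that pattern, assuming the built-in make and made routines covered in the general grammar documentation; the Sum grammar and SumActions class are hypothetical names chosen for illustration.

```raku
# Hypothetical grammar: two integers joined by a plus sign.
grammar Sum {
    token TOP    { <number> '+' <number> }
    token number { \d+ }
}

# Hypothetical actions class; each method is named after the rule it handles.
class SumActions {
    # Called after every successful <number> match: attach its numeric value.
    method number($/) { make +$/ }
    # Called after TOP matches: add the values attached to the two numbers.
    method TOP($/)    { make $<number>[0].made + $<number>[1].made }
}

my $sum = Sum.parse('2+40', :actions(SumActions)).made;
say $sum;   # OUTPUT: «42»
```

Here the class itself is passed as the actions object; an instance would work the same way if the actions needed to keep state.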
Additional named arguments are used as options for matching, so you can, for example, pass :pos(4) to start parsing at the fifth character (:pos is zero-based). All matching adverbs are accepted, but not all of them take effect. Some adverbs, such as :s and :i, apply at compile time; passing those to .parse has no effect, because the regexes have already been compiled. Adverbs that affect runtime behavior, such as :pos and :continue, do take effect.
say RepeatChar.parse('bbbbbb', :rule('start'), :args(\('b')), :pos(4)).Str; # OUTPUT: «bb»
Method parse only succeeds if the cursor has arrived at the end of the target string when the match is over. Use method subparse if you want to be able to stop in the middle.
The top regex in the grammar will be allowed to backtrack.
Returns a Match on success, and Nil on failure.
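Because a failed parse yields Nil, the result can be tested for definedness before it is used. A minimal sketch of that check; the Digits grammar is a hypothetical example and not part of the class.

```raku
# Hypothetical grammar that accepts a non-empty run of digits.
grammar Digits { token TOP { \d+ } }

for '2024', '20x4' -> $input {
    # `with` runs its block only when the value is defined,
    # so the Nil returned for '20x4' falls through to `else`.
    with Digits.parse($input) -> $match {
        say "matched: $match";
    }
    else {
        say "no match for $input";
    }
}

# OUTPUT:
# matched: 2024
# no match for 20x4
```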
method subparse
method subparse($target, :$rule = 'TOP', Capture() :$args = \(), Mu :$actions = Mu, *%opt)
Does exactly the same as method parse, except that the cursor doesn't have to reach the end of the string for the match to succeed. That is, the grammar doesn't have to match the whole string.
Note that unlike method parse, subparse always returns a Match, which will be a failed match (and thus falsy) if the grammar failed to match.
grammar RepeatChar {
    token start($character) { $character+ }
}

say RepeatChar.subparse('bbbabb', :rule('start'), :args(\('b')));
say RepeatChar.parse(   'bbbabb', :rule('start'), :args(\('b')));
say RepeatChar.subparse('bbbabb', :rule('start'), :args(\('a')));
say RepeatChar.subparse('bbbabb', :rule('start'), :args(\('a')), :pos(3));

# OUTPUT:
# 「bbb」
# Nil
# #<failed match>
# 「a」
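Since subparse may stop before the end of the input, it can be useful to check the returned match for success and then ask where it ended. A small sketch, assuming the standard Match methods to and postmatch; the LeadingDigits grammar is a hypothetical name used only here.

```raku
# Hypothetical grammar that accepts a non-empty run of digits.
grammar LeadingDigits { token TOP { \d+ } }

my $m = LeadingDigits.subparse('123abc');
if $m {                 # a failed match would be falsy
    say $m.Str;         # the matched prefix:         123
    say $m.to;          # offset just past the match: 3
    say $m.postmatch;   # the unparsed remainder:     abc
}
```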
method parsefile
method parsefile(Str(Cool) $filename, :$enc, *%opts)
Reads the file $filename, decoding it with the encoding given in $enc, and parses its contents. All other named arguments are passed on to method parse.
grammar Identifiers {
    token TOP        { [<identifier><.ws>]+ }
    token identifier { <initial> <rest>* }
    token initial    { <+myletter +[_]> }
    token rest       { <+myletter +mynumber +[_]> }
    token myletter   { <[A..Za..z]> }
    token mynumber   { <[0..9]> }
}

say Identifiers.parsefile('users.txt', :enc('UTF-8'))
    .Str.trim.subst(/\n/, ',', :g);

# users.txt :
# TimToady
# lizmat
# jnthn
# moritz
# zoffixznet
# MasterDuke17

# OUTPUT: «TimToady,lizmat,jnthn,moritz,zoffixznet,MasterDuke17»
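Going by the description above, parsefile behaves much like reading the file yourself and handing the text to parse. The sketch below compares the two under that assumption; the Words grammar and the scratch file name are hypothetical, and the file is created in the temporary directory only for the demonstration.

```raku
# Hypothetical grammar: one or more newline-terminated words.
grammar Words { token TOP { [ \w+ \n ]+ } }

# Hypothetical scratch file, written only for this demonstration.
my $file = $*TMPDIR.add('grammar-parsefile-demo.txt');
$file.spurt("alpha\nbeta\n");

# Let the grammar read the file itself ...
my $via-parsefile = Words.parsefile($file.Str, :enc('utf8'));
# ... or read it explicitly and parse the resulting string.
my $via-parse     = Words.parse($file.slurp(:enc('utf8')));

say $via-parsefile.Str eq $via-parse.Str;   # OUTPUT: «True»

$file.unlink;   # remove the scratch file again
```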