In Regexes
See primary documentation in context for Regex interpolation
Instead of using a literal pattern for a regex match, you can use a variable that holds the pattern. The variable is then 'interpolated' into the regex: its appearance in the regex is replaced with the pattern that it holds. The advantage of interpolation is that the pattern need not be hardcoded in the source of your Raku program, but can instead be generated at runtime.
There are four different ways of interpolating a variable into a regex as a pattern, which may be summarized as follows:
Syntax | Description |
---|---|
$variable | Interpolates stringified contents of variable literally. |
$(code) | Runs Raku code inside the regex, and interpolates the stringified return value literally. |
<$variable> | Interpolates stringified contents of variable as a regex. |
<{code}> | Runs Raku code inside the regex, and interpolates the stringified return value as a regex. |
Instead of the $ sigil, you may use the @ sigil for array interpolation. See below for how this works.
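As a quick orientation, the following minimal sketch contrasts the literal and the regex forms from the table. The variable name $pat and the sample strings are assumptions chosen for illustration; they do not appear in the examples that follow.

```raku
my $pat = 'a.c';                      # hypothetical pattern string

say ('a.c' ~~ / $pat /).so;           # True:  matched literally, '.' is a plain dot
say ('abc' ~~ / $pat /).so;           # False: the literal text 'a.c' does not occur
say ('a.c' ~~ / $( $pat ) /).so;      # True:  code result interpolated literally
say ('abc' ~~ / <$pat> /).so;         # True:  '.' now acts as the regex wildcard
say ('abc' ~~ / <{ $pat }> /).so;     # True:  code result interpolated as a regex
```

The same string therefore matches different input depending only on the interpolation syntax used.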
Let's start with the first two syntactical forms: $variable and $(code). These forms will interpolate the stringified value of the variable or the stringified return value of the code literally, provided that the respective value isn't a Regex object. If the value is a Regex, it will not be stringified, but instead be interpolated as such. 'Literally' means strictly literally, that is: as if the respective stringified value were quoted with a basic Q string Q[...]. Consequently, the stringified value will not itself undergo any further interpolation.
For $variable this means the following:
    my $string   = 'Is this a regex or a string: 123\w+False$pattern1 ?';
    my $pattern1 = 'string';
    my $pattern2 = '\w+';
    my $number   = 123;
    my $regex    = /\w+/;

    say $string.match: / 'string' /;    # [1] OUTPUT: «「string」»
    say $string.match: / $pattern1 /;   # [2] OUTPUT: «「string」»
    say $string.match: / $pattern2 /;   # [3] OUTPUT: «「\w+」»
    say $string.match: / $regex /;      # [4] OUTPUT: «「Is」»
    say $string.match: / $number /;     # [5] OUTPUT: «「123」»
In this example, statements [1] and [2] are equivalent and illustrate a plain case of regex interpolation. Since unescaped, unquoted alphabetic characters in a regex match literally, the single quotes in the regex of statement [1] are functionally redundant; they are included only to emphasize the correspondence between the first two statements. Statement [3] shows unambiguously that the string pattern held by $pattern2 is interpreted literally, and not as a regex. Had it been interpreted as a regex, it would have matched the first word of $string, i.e. 「Is」, as can be seen in statement [4]. Statement [5] shows how the stringified number is used as a match pattern.
This code exemplifies the use of the $(code) syntax:
    my $string   = 'Is this a regex or a string: 123\w+False$pattern1 ?';
    my $pattern1 = 'string';
    my $pattern3 = 'gnirts';
    my $pattern4 = '$pattern1';
    my $bool     = True;
    my sub f1    { return Q[$pattern1] };

    say $string.match: / $pattern3.flip /;                # [6] OUTPUT: «Nil»
    say $string.match: / "$pattern3.flip()" /;            # [7] OUTPUT: «「string」»
    say $string.match: / $($pattern3.flip) /;             # [8] OUTPUT: «「string」»
    say $string.match: / $([~] $pattern3.comb.reverse) /; # [9] OUTPUT: «「string」»
    say $string.match: / $(!$bool) /;                     # [10] OUTPUT: «「False」»

    say $string.match: / $pattern4 /;                     # [11] OUTPUT: «「$pattern1」»
    say $string.match: / $(f1) /;                         # [12] OUTPUT: «「$pattern1」»
Statement [6] probably does not work as intended. To the human reader, the dot . may seem to represent the method call operator, but since a dot is not a valid character in an ordinary identifier, and given the regex context, the compiler parses it as the regex wildcard . that matches any character. The apparent ambiguity can be resolved in various ways, for instance by using straightforward string interpolation inside the regex as in statement [7] (note that including the call operator () is key here), or by using the second syntax form from the above table as in statement [8], in which case the match pattern string first emerges as the return value of the flip method call. Since general Raku code may be run from within the parentheses of $( ), the same effect can also be achieved with a bit more effort, as in statement [9]. Statement [10] illustrates how the stringified version of the code's return value (the Boolean value False) is matched literally.
Finally, statements [11] and [12] show how the value of $pattern4 and the return value of f1 are not subject to a further round of interpolation. Hence, in general, after possible stringification, $variable and $(code) provide for a strictly literal match of the variable or return value.
Now consider the second two syntactical forms from the table above: <$variable> and <{code}>. These forms will stringify the value of the variable or the return value of the code and interpolate it as a regex. If the respective value is a Regex, it is interpolated as such:
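    my $string   = 'Is this a regex or a string: 123\w+$x ?';
    my $pattern1 = '\w+';
    my $number   = 123;
    my sub f1    { return /s\w+/ };

    say $string.match: / <$pattern1> /;   # OUTPUT: «「Is」»
    say $string.match: / <$number> /;     # OUTPUT: «「123」»
    say $string.match: / <{ f1 }> /;      # OUTPUT: «「string」»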
Importantly, 'to interpolate as a regex' means to interpolate/insert into the target regex without protective quoting. Consequently, if the value of the variable $variable1 is itself of the form $variable2, evaluation of <$variable1> or <{ $variable1 }> inside a target regex /.../ will cause the target regex to assume the form /$variable2/. As described above, the evaluation of this regex will then trigger further interpolation of $variable2:
    my $string    = Q[Mindless \w+ $variable1 $variable2];
    my $variable1 = Q[\w+];
    my $variable2 = Q[$variable1];
    my sub f1     { return Q[$variable2] };

    # /<{ f1 }>/ ==> /$variable2/ ==> / '$variable1' /
    say $string.match: / <{ f1 }> /;     # OUTPUT: «「$variable1」»

    # /<$variable2>/ ==> /$variable1/ ==> / '\w+' /
    say $string.match: /<$variable2>/;   # OUTPUT: «「\w+」»

    # /<$variable1>/ ==> /\w+/
    say $string.match: /<$variable1>/;   # OUTPUT: «「Mindless」»
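Interpolation as a regex is what makes runtime-generated patterns practical. The following is a minimal sketch under stated assumptions: the word list @animals, the variable $alternation, and the input string are hypothetical and serve only to show a pattern string being assembled at runtime and then interpolated with <$variable>.

```raku
my @animals     = <cat dog bird>;       # hypothetical data, known only at runtime
my $alternation = @animals.join('|');   # builds the string 'cat|dog|bird'

# Interpolated as a regex, the '|' characters act as alternation.
say ('hotdog' ~~ / <$alternation> /).Str;   # prints: dog

# Interpolated literally, the whole string 'cat|dog|bird' must occur verbatim.
say ('hotdog' ~~ / $alternation /).so;      # prints: False
```

Because the interpolated string is inserted without protective quoting, any regex metacharacters it contains take effect; use the literal forms when the string should be matched verbatim.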
When an array variable is interpolated into a regex, the regex engine handles it like a | alternative of the regex elements (see the documentation on embedded lists, above). The interpolation rules for individual elements are the same as for scalars, so strings and numbers match literally, and Regex objects match as regexes. Just as with ordinary | interpolation, the longest match succeeds:
    my @a = '2', 23, rx/a.+/;
    say ('b235' ~~ / b @a /).Str;   # OUTPUT: «b23»
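The literal treatment of string elements matters most when they contain regex metacharacters. A minimal sketch, assuming a hypothetical operator list @ops that is not part of the original example:

```raku
my @ops = '+', '++', '+=';   # hypothetical list; every element is a plain string

# Each element matches literally, so '+' is not treated as a quantifier.
# Where several elements match, the longest alternative wins.
say ('a++' ~~ / a @ops /).Str;   # prints: a++
say ('a+=' ~~ / a @ops /).Str;   # prints: a+=
```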
If you have an expression that evaluates to a list, but you do not want to assign it to an @-sigiled variable first, you can interpolate it with @(code). In this example, both regexes are equivalent:
    my %h = a => 1, b => 2;
    my @a = %h.keys;
    say S:g/@(%h.keys)/%h{$/}/ given 'abc';   # OUTPUT: «12c»
    say S:g/@a/%h{$/}/ given 'abc';           # OUTPUT: «12c»
The use of hashes in regexes is reserved.