In Regexes

See primary documentation in context for Unicode properties

The character classes mentioned so far are mostly for convenience; another approach is to use Unicode character properties. These take the form <:property>, where property can be a short or long Unicode General Category name. Properties are written in pair (colon-pair) syntax, so a property that takes a value is written as <:Script<Latin>>.

To look up a character's Unicode property, call uniprop; to match against a property, smartmatch against a regex:

"a".uniprop('Script');                 # OUTPUT: «Latin␤» 
"a" ~~ / <:Script<Latin>> /;           # OUTPUT: «「a」␤» 
"a".uniprop('Block');                  # OUTPUT: «Basic Latin␤» 
"a" ~~ / <:Block('Basic Latin')> /;    # OUTPUT: «「a」␤»

These are the Unicode general categories used for matching:

Short   Long
L       Letter
LC      Cased_Letter
Lu      Uppercase_Letter
Ll      Lowercase_Letter
Lt      Titlecase_Letter
Lm      Modifier_Letter
Lo      Other_Letter
M       Mark
Mn      Nonspacing_Mark
Mc      Spacing_Mark
Me      Enclosing_Mark
N       Number
Nd      Decimal_Number or digit
Nl      Letter_Number
No      Other_Number
P       Punctuation or punct
Pc      Connector_Punctuation
Pd      Dash_Punctuation
Ps      Open_Punctuation
Pe      Close_Punctuation
Pi      Initial_Punctuation
Pf      Final_Punctuation
Po      Other_Punctuation
S       Symbol
Sm      Math_Symbol
Sc      Currency_Symbol
Sk      Modifier_Symbol
So      Other_Symbol
Z       Separator
Zs      Space_Separator
Zl      Line_Separator
Zp      Paragraph_Separator
C       Other
Cc      Control or cntrl
Cf      Format
Cs      Surrogate
Co      Private_Use
Cn      Unassigned

For example, <:Lu> matches a single uppercase letter.

To negate a property, write <:!property>. So, <:!Lu> matches a single character that is not an uppercase letter.
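
As a small illustration of the category names and their negation, the sketch below checks a few one-character strings. The sample strings are chosen here for demonstration only; `so` simply collapses the match result to a Bool.

```raku
# Short and long category names are interchangeable.
say so 'A' ~~ / <:Lu> /;                 # OUTPUT: «True␤»
say so 'A' ~~ / <:Uppercase_Letter> /;   # OUTPUT: «True␤»

# Negation: a single character that is not an uppercase letter.
say so 'a' ~~ / <:!Lu> /;                # OUTPUT: «True␤»
say so 'A' ~~ / <:!Lu> /;                # OUTPUT: «False␤»

# comb with a regex collects every non-overlapping match.
say 'Hello World'.comb(/ <:Lu> /);       # OUTPUT: «(H W)␤»
```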

Categories can be used together, with an infix operator:

Operator   Meaning
+          set union
-          set difference

To match either a lowercase letter or a number, write <:Ll+:N> or <:Ll+:Number> or <+ :Lowercase_Letter + :Number>.
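
The two operators can be tried out with a short sketch like the following. The input strings are made up for illustration, and a + quantifier is added after each class so that the match covers a whole run of characters rather than just one.

```raku
# Union: a run of lowercase letters or numbers; stops at the first uppercase letter.
say 'abc123XYZ' ~~ / <:Ll + :N>+ /;   # OUTPUT: «「abc123」␤»

# Difference: any letter except an uppercase one.
say 'abcDEF' ~~ / <:L - :Lu>+ /;      # OUTPUT: «「abc」␤»
```

Whitespace around the operators is optional inside the angle brackets, so <:Ll+:N> and <:Ll + :N> are equivalent.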

It's also possible to group categories and sets of categories with parentheses; for example:

say $0 if 'raku9' ~~ /\w+(<:Ll+:N>)/ # OUTPUT: «「9」␤»