New re feature in 3.6: local flags and example use

Terry Reedy tjreedy at udel.edu
Sun Jan 1 21:25:25 EST 2017


The re module defines several flags that affect the compilation of a 
pattern string.  For instance, re.I (also spelled re.IGNORECASE) results 
in case-insensitive matching of the whole pattern.
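
A minimal sketch of a global flag in action.  The pattern 'spam' and the 
test string are made up for illustration; they are not from IDLE.

```python
import re

# Without a flag, matching is case sensitive.
strict = re.compile(r"spam")
# re.I applies to the entire pattern.
relaxed = re.compile(r"spam", re.I)

strict_hit = strict.fullmatch("SPAM")     # no match expected
relaxed_hit = relaxed.fullmatch("SPAM")   # match expected

# The short and long spellings name the same flag.
same_flag = (re.I == re.IGNORECASE)
```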

But what if you want part of a pattern to be case sensitive and part 
not?  For instance, the IDLE colorizer needs to match keywords and 
builtin names with their exact case: 'if' is a keyword; 'If', 'iF', and 
'IF' are not.  However, case does not matter for string prefixes: 'fr', 
'fR', 'Fr', and 'FR' are all the same to the tokenizer.  So the string 
prefix part of the colorizer pattern was recently upgraded to

r"(\br|R|u|U|f|F|fr|Fr|fR|FR|rf|rF|Rf|RF|b|B|br|Br|bR|BR|rb|rB|Rb|RB)?"
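
To see what spelling out every case variant buys, here is a small check 
of that pattern.  The list of sample prefixes and the use of fullmatch 
are choices made for this sketch, not how IDLE applies the pattern.

```python
import re

# The explicit pattern quoted above: each case variant is listed by hand.
explicit = re.compile(
    r"(\br|R|u|U|f|F|fr|Fr|fR|FR|rf|rF|Rf|RF|b|B|br|Br|bR|BR|rb|rB|Rb|RB)?"
)

# A few sample prefixes in assorted cases (chosen for illustration).
samples = ["r", "U", "fr", "fR", "Fr", "FR", "rB", "Rb"]
all_listed_match = all(explicit.fullmatch(s) is not None for s in samples)

# A spelling that is not in the list is rejected.
unlisted_hit = explicit.fullmatch("xr")
```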

Python 3.6 added syntax for 'local flags'.  From the re documentation:
'''
(?imsx-imsx:...)

     (Zero or more letters from the set 'i', 'm', 's', 'x', optionally 
followed by '-' followed by one or more letters from the same set.) The 
letters set or removes the corresponding flags: re.I (ignore case), re.M 
(multi-line), re.S (dot matches all), and re.X (verbose), for the part 
of the expression. (The flags are described in Module Contents.)
'''
Here is the replacement for the above, courtesy of Serhiy Storchaka.

r"(?i:\br|u|f|fr|rf|b|br|rb)?"
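
A rough sketch (Python 3.6 or later) checking that the two prefix 
patterns accept the same strings, followed by a mixed-case-sensitivity 
example.  The candidate list and the 'mixed' and 'inverse' patterns are 
invented for this illustration and are not part of IDLE.

```python
import re
from itertools import product

# The long explicit pattern and its local-flag replacement.
explicit = re.compile(
    r"(\br|R|u|U|f|F|fr|Fr|fR|FR|rf|rF|Rf|RF|b|B|br|Br|bR|BR|rb|rB|Rb|RB)?"
)
local = re.compile(r"(?i:\br|u|f|fr|rf|b|br|rb)?")

# Candidates: the empty string plus every 1- and 2-letter combination
# of the prefix letters in both cases.
letters = "rRuUfFbB"
candidates = [""] + list(letters)
candidates += ["".join(pair) for pair in product(letters, repeat=2)]

# Both patterns should accept or reject each candidate alike.
agree = all(
    (explicit.fullmatch(c) is None) == (local.fullmatch(c) is None)
    for c in candidates
)

# Hypothetical mixed pattern: the keyword 'if' is case sensitive,
# while the string prefix inside (?i:...) is not.
mixed = re.compile(r"if (?i:fr|rf)'")
keyword_exact = mixed.fullmatch("if Fr'")   # match expected
keyword_wrong = mixed.fullmatch("IF fr'")   # no match expected

# The reverse also works: a global re.I with the flag removed locally.
inverse = re.compile(r"(?-i:if) (fr|rf)'", re.I)
inverse_exact = inverse.fullmatch("if FR'")  # match expected
inverse_wrong = inverse.fullmatch("If fr'")  # no match expected
```

The local form keeps re.I away from the rest of a larger pattern, so 
only the group that needs it is matched without regard to case.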

-- 
Terry Jan Reedy


