Possible to insert variables into regular expressions?

Steven Bethard steven.bethard at gmail.com
Fri Dec 10 20:57:03 CET 2004


Terry Hancock wrote:
> And hey, you could probably use a regex to modify a regex, if you were
> really twisted. ;-)
> 
> Sorry.  I really shouldn't have said that. Somebody's going to do it now. :-P

Sure, but only 'cause you asked so nicely. =)

 >>> import re
 >>> def internationalize(expr,
...                      letter_matcher=re.compile(r'\[A-(?:Za-)?z\]')):
...     return letter_matcher.sub(r'[^\W_\d]', expr)
...
 >>> def compare(expr, text):
...     def item_str(matcher):
...         return ' '.join(matcher.findall(text))
...     print 'reg: ', item_str(re.compile(expr))
...     print 'intl:', item_str(re.compile(internationalize(expr),
...                                        re.UNICODE))
...
 >>> compare(r'\d+\s+([A-z]+)', '1 viola. 2 voilà')
reg:  viola voil
intl: viola voilà
 >>> compare(r'\d+\s+([A-Za-z]+)', '1 viola. 2 voilà')
reg:  viola voil
intl: viola voilà

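This code rewrites the ASCII letter classes [A-z] and [A-Za-z] in a 
regexp as [^\W_\d], which, when compiled with re.UNICODE, matches any 
letter rather than only the ASCII ones.  (As a side effect it also stops 
[A-z] from matching the punctuation that sits between 'Z' and 'a' in 
ASCII.)  Note that without the conversion, characters like 'à' are not 
found.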
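
For anyone trying this on Python 3, here is a rough sketch of the same 
idea. It assumes Python 3.7 or later, where str patterns are 
Unicode-aware by default (so re.UNICODE is not needed) and where a 
replacement string containing \W or \d is rejected as a bad escape, 
hence the function replacement. The names LETTER_CLASS and find_words 
are made up for this illustration and are not part of the code above.

```python
import re

# Matches the literal character classes "[A-z]" and "[A-Za-z]" in a pattern.
LETTER_CLASS = re.compile(r'\[A-(?:Za-)?z\]')


def internationalize(expr):
    """Swap ASCII-only letter classes for a Unicode-aware letter class."""
    # [^\W_\d] means "word characters minus underscore and digits",
    # which leaves letters.  A function is used as the replacement so
    # the backslashes are inserted literally instead of being parsed
    # as escapes in a replacement template.
    return LETTER_CLASS.sub(lambda m: r'[^\W_\d]', expr)


def find_words(expr, text):
    """Return every captured group that expr finds in text."""
    return re.findall(expr, text)


text = '1 viola. 2 voil\u00e0'  # "voilà", written with a precomposed à
pattern = r'\d+\s+([A-Za-z]+)'

print('reg: ', ' '.join(find_words(pattern, text)))
print('intl:', ' '.join(find_words(internationalize(pattern), text)))
```

The first line printed should stop at "voil" and the second should 
include the full "voilà", mirroring the session above.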

Steve


