Regex for String Literals
Tim Peters
tim.one at comcast.net
Mon Sep 2 17:46:56 EDT 2002
[Stefan Franke]
> Does someone know a regular expression that matches all
> kinds of Python string literals (along with their finer points
> WRT line breaks, unicode..)?
tokenize.py (in the std library) strives to match the Python compiler's
tokenization exactly. You'll find a suitable collection of hairy regexps
there, but, if you can, find a way to *use* tokenize.py directly. With the
generator interface, that is less mind-bending than it used to be (you can
iterate over a token stream instead of fighting with stateful callback
functions).
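
A rough sketch of what iterating over the token stream might look like, in
case it helps. This assumes a current Python 3, where
tokenize.generate_tokens() takes a readline callable that returns str and
yields named tuples; the helper name string_literals and the sample source
are made up for illustration and are not part of tokenize.py. Note that on
Python 3.12 and later, f-strings come through as separate FSTRING_* tokens
rather than STRING, so this sketch would not report them.

```python
import io
import tokenize


def string_literals(source):
    """Yield (text, start, end) for each STRING token in `source`.

    `start` and `end` are (row, column) pairs as reported by tokenize;
    rows are 1-based and columns are 0-based.
    """
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type == tokenize.STRING:
            yield tok.string, tok.start, tok.end


# Hypothetical sample: a plain literal, a raw triple-quoted literal that
# spans a line break, a u-prefixed literal, an adjacent literal, and a
# comment containing quotes (which must not be reported).
sample = (
    'a = "plain"\n'
    "b = r'''raw\n"
    "triple'''\n"
    'c = u"uni" \'adjacent\'  # "not a string"\n'
)

if __name__ == "__main__":
    for text, start, end in string_literals(sample):
        print(start, end, repr(text))
```

The tokenizer deals with prefixes, triple quotes and embedded line breaks
itself, so no hand-written regexp is needed. Adjacent literals arrive as
separate tokens; joining them is left to the caller.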