regular expression: perl ==> python
Nick Craig-Wood
nick at craig-wood.com
Thu Dec 23 01:48:52 EST 2004
Fredrik Lundh <fredrik at pythonware.com> wrote:
> that's not a very efficient way to match multiple patterns, though. a
> much better way is to combine the patterns into a single one, and use
> the "lastindex" attribute to figure out which one that matched.
lastindex is useful, yes.
> see
>
> http://effbot.org/zone/xml-scanner.htm
>
> for more on this topic.
I take your point. However, I don't find the code below very readable.
Turning 5 small regexps into 1 big one, plus a game of count the
brackets, doesn't strike me as a huge win...
import re

xml = re.compile(r"""
    <([/?!]?\w+)          # 1. tags
    |&(\#?\w+);           # 2. entities
    |([^<>&'\"=\s]+)      # 3. text strings (no special characters)
    |(\s+)                # 4. whitespace
    |(.)                  # 5. special characters
""", re.VERBOSE)
It's probably faster though, so I give in gracelessly ;-)
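
For reference, here is a minimal sketch of how the combined pattern might be
used with "lastindex" to tell which alternative matched. The KINDS table and
the tokenize() helper are hypothetical names made up for illustration; they
are not taken from the effbot article. It relies on each alternative holding
exactly one group, so that lastindex is the number of the alternative.

```python
import re

# Same combined pattern as above: one capturing group per alternative.
xml = re.compile(r"""
    <([/?!]?\w+)          # 1. tags
    |&(\#?\w+);           # 2. entities
    |([^<>&'\"=\s]+)      # 3. text strings (no special characters)
    |(\s+)                # 4. whitespace
    |(.)                  # 5. special characters
""", re.VERBOSE)

# Hypothetical labels for groups 1-5 (an assumption, for illustration only).
KINDS = {1: "tag", 2: "entity", 3: "text", 4: "whitespace", 5: "special"}


def tokenize(text):
    """Return (kind, value) pairs, using lastindex to pick the matching group."""
    tokens = []
    for m in xml.finditer(text):
        # Exactly one group takes part in each match; lastindex is its number.
        tokens.append((KINDS[m.lastindex], m.group(m.lastindex)))
    return tokens


if __name__ == "__main__":
    for kind, value in tokenize("<a>hi &amp; bye</a>"):
        print(kind, repr(value))
```

If counting brackets is the main objection, named groups together with the
match object's "lastgroup" attribute would be one way to avoid it, probably
at some cost in verbosity.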
--
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick