[Python-Dev] Re: pre-PEP [corrected]: Complete, Structured Regular Expression Group Matching

Fredrik Lundh fredrik at pythonware.com
Tue Aug 10 08:12:31 CEST 2004


Mike Coleman wrote:

> If I may wax anthropomorphic, the 're.match' function says to me as a programmer
>
>    *You* know what structure this RE represents, and *I* know what
>    structure it represents, too, because I had to figure it out to
>    do the match.

that only shows that you don't understand how regular expressions work.

a regular expression defines a set of strings, and the RE engine is designed to
tell you if a string you have is a member of this set.  the engine's not a parser,
and it has a very vague notion of "structure" (groups are implemented by
"marks", which are registers that keep track of where the engine last passed
a given position; changing that to "lists of all possible matches" would require
a major rewrite).
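
to illustrate the "marks" point, here is a minimal sketch (the pattern and the input string are made up for this example): a group inside a repetition only reports the last position the engine passed, even though the overall match covers every repetition.

```python
import re

# the group is repeated three times while matching "a1b2c3"
m = re.match(r"([a-z]\d)+", "a1b2c3")

whole = m.group(0)   # the full match covers all three pairs
last = m.group(1)    # the group's marks were overwritten on each pass
where = m.span(1)    # only the final pair's positions survive

# to collect every pair, apply the inner pattern repeatedly instead
pieces = re.findall(r"[a-z]\d", "a1b2c3")

print(whole, last, where, pieces)
```

the engine visited all three pairs, but only the final one can be recovered from the match object; getting at the earlier ones means running the inner pattern again, as findall does here.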

you're probably better off using the scanner mechanism:

    http://effbot.org/zone/xml-scanner.htm

or the sre.Scanner class (see the source code).  the scanner concept could
use some documentation, and it would be nice to be able to switch patterns
during parsing.  (time for a scanner PEP, anyone?)
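
a rough sketch of the scanner idea, assuming a Python version where the class is reachable as re.Scanner (it is undocumented, so details may differ between versions).  the token names and the input string are made up for this example.  each lexicon entry pairs a pattern with a callback that receives the scanner and the matched text; an action of None skips the match.

```python
import re

# lexicon: (pattern, action) pairs, tried in order at each position
scanner = re.Scanner([
    (r"[0-9]+", lambda s, tok: ("INT", int(tok))),
    (r"[a-z_]+", lambda s, tok: ("NAME", tok)),
    (r"[=+\-*/]", lambda s, tok: ("OP", tok)),
    (r"\s+", None),  # whitespace: matched but not reported
])

# scan() returns the list of action results plus any unscanned tail
tokens, remainder = scanner.scan("x = 40 + 2")

print(tokens)
print(repr(remainder))
```

the remainder is whatever text is left once no pattern matches at the current position, so an empty string means the whole input was tokenized.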

</F> 




