[Python-Dev] Discordance in documentation...

Jeff Epler jepler at unpythonic.net
Thu Sep 4 23:13:41 EDT 2003


On Thu, Sep 04, 2003 at 10:04:48PM +0200, gminick wrote:
> ...or is this just me?
> 
> Let's take a look, Reference Lib, 4.2.1 Regular Expression Syntax says:
> 
>    "|"
>            A|B, where A and B can be arbitrary REs, creates a regular
>            expression that will match either A or B.
>            [...]
>            REs separated by "|" are tried from left to right, and the 
>            first one that allows the complete pattern to match is considered 
>            the accepted branch. This means that if A matches, B will never 
>            be tested, even if it would produce a longer overall match. [...]
> 
> And now a little test:
[snipped]

Here's where the "tried from left to right" part makes a difference.
Consider the following stupid RE:
	"(.)|(a)"

>>> import re
>>> r = re.compile("(.)|(a)")
>>> m = r.search("a")
>>> m
<_sre.SRE_Match object at 0x81c9458>
>>> m.group(1)
'a'
>>> m.group(2)
>>> # None

No input will ever make the second group match, because anything "a"
could match is matched by "." first.  But if you write this:
	"(a)|(.)"
then the string "a" is guaranteed to match the first group, while the
string "b" matches the second group.
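
A minimal sketch of the reordered pattern, for comparison with the
session above.  The variable names are only for illustration, and the
"a|ab" pattern at the end is an added example (not from the original
test) of the documentation's point that a later branch is not tried
even if it would give a longer match.

```python
import re

# Same two alternatives as before, in the opposite order.
r = re.compile("(a)|(.)")

m_a = r.search("a")   # "a" is taken by the first branch
m_b = r.search("b")   # "b" fails the first branch, so the second is tried

print(m_a.group(1), m_a.group(2))   # a None
print(m_b.group(1), m_b.group(2))   # None b

# Branch order also decides the length of the match.
short = re.search("a|ab", "ab").group()   # first branch wins: "a"
longer = re.search("ab|a", "ab").group()  # first branch wins: "ab"
print(short, longer)
```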

If this thread is to be continued, it should move to python-list at python.org.

Jeff




