[issue12789] re.Scanner don't support more then 2 groups on regex

Matthew Barnett report at bugs.python.org
Sat Aug 20 18:57:59 CEST 2011


Matthew Barnett <python at mrabarnett.plus.com> added the comment:

Even if this bug is fixed, it still won't work as you expect, and this is why.

The Scanner constructor accepts a list of 2-tuples. The first item of each tuple is a regex and the second is a function. For example:

    re.Scanner([(r"\d+", number), (r"\w+", word)])
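
For a complete, runnable version of that call, here is a minimal sketch. The bodies of "number" and "word" are hypothetical; the only thing assumed about them is that Scanner calls each one with the scanner object and the matched text. The whitespace rule with None as its action is an addition for the example, so that the text between tokens is skipped.

```python
import re

# Hypothetical callbacks: each receives the scanner and the matched text.
def number(scanner, token):
    return ("NUMBER", int(token))

def word(scanner, token):
    return ("WORD", token)

scanner = re.Scanner([
    (r"\d+", number),
    (r"\w+", word),
    (r"\s+", None),   # action None: matched text is dropped
])

# scan() returns the list of callback results and any unmatched tail.
tokens, remainder = scanner.scan("42 apples")
print(tokens, repr(remainder))
```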

Scanner then builds a single regex, using the given regexes as alternatives, each wrapped in a capture group:

    r"(\d+)|(\w+)"

When matching, it checks which group captured and uses that to decide which function to call. For example, if group 1 matched, it calls "number"; if group 2 matched, it calls "word".
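
The dispatch described above can be modelled with the plain re module. This is only a sketch of the idea, not Scanner's actual implementation: the helper "scan" and the one-argument callbacks are made up for the illustration, and the group number is read from the match object's lastindex attribute.

```python
import re

# Hypothetical one-argument callbacks for this model.
def number(text):
    return ("NUMBER", int(text))

def word(text):
    return ("WORD", text)

lexicon = [(r"\d+", number), (r"\w+", word)]

# Wrap each regex in a capture group and join them as alternatives:
# the result is r"(\d+)|(\w+)".
combined = re.compile("|".join("(%s)" % pattern for pattern, _ in lexicon))

def scan(text):
    tokens = []
    pos = 0
    while pos < len(text):
        m = combined.match(text, pos)
        if m is None or m.end() == pos:
            break
        # Group N matched, so call the function of the Nth lexicon entry.
        action = lexicon[m.lastindex - 1][1]
        tokens.append(action(m.group()))
        pos = m.end()
    return tokens, text[pos:]

result = scan("42abc")
print(result)
```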

When you introduce capture groups into the regexes, it gets confused. If your regex matches, it'll see that groups 1 and 2 match, so it'll try to call the second function, but there isn't one...
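
To see how the numbering shifts, the sketch below uses plain re on a hand-written combined pattern. The decimal rule is a hypothetical example, and the combined pattern is the wrapping described in this comment rather than what Scanner necessarily builds internally. The last part shows the usual way around the problem, which is to use non-capturing groups, (?:...), inside the regexes handed to Scanner.

```python
import re

# A rule with its own capture groups, wrapped and joined as described.
shifted = re.compile(r"((\d+)\.(\d+))|(\w+)")
# The word rule is the 2nd entry, but its group is now number 4.
shifted_group = shifted.match("abc").lastindex

# Non-capturing groups do not take part in the numbering.
aligned = re.compile(r"((?:\d+)\.(?:\d+))|(\w+)")
aligned_group = aligned.match("abc").lastindex

# Hypothetical callbacks for the Scanner version.
def decimal(scanner, token):
    return ("DECIMAL", token)

def word(scanner, token):
    return ("WORD", token)

scanner = re.Scanner([
    (r"(?:\d+)\.(?:\d+)", decimal),
    (r"\w+", word),
])
tokens, remainder = scanner.scan("3.14abc")
print(shifted_group, aligned_group, tokens, repr(remainder))
```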

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12789>
_______________________________________

