How do I get to *all* of the groups of an re search?
hst at empolis.co.uk
Fri Jan 10 16:01:49 CET 2003
Cameron Laird wrote:
> In article <sl51f-6dj.ln1 at news.lairds.org>,
> Kyler Laird <Kyler at news.Lairds.org> wrote:
> > http://www.python.org/doc/current/lib/re-syntax.html
> > (...)
> > Matches whatever regular expression is inside the
> > parentheses, and indicates the start and end of a
> > group; the contents of a group can be retrieved
> > after a match has been performed, [...]
> >Sounds good, so I tried it.
> > import re
> > text = 'foo foo1 foo2 bar bar1 bar2 bar3'
> > test_re = re.compile('([a-z]+)( \\1[0-9]+)+')
> > print test_re.findall(text)
> >I expected the matches to be something like
> > [('foo', [' foo1', ' foo2']), ('bar', [' bar1', ' bar2', ' bar3'])]
> >but it's just this.
> > [('foo', ' foo2'), ('bar', ' bar3')]
> >How do I get to the other groups that were matched? (Is this
> >an FAQ? I don't know where to start looking.)
> Oh, it's matching all the groups. Does the code below help
> explain why?
> I'm clumsy with REs--I don't immediately see how to achieve
> your desired result. I can quickly observe that
> import re
> text = 'foo foo1 foo2 bar bar1 bar2 bar3'
> test_re = re.compile('([a-z]+)(( \\1[0-9]+)+)')
> print test_re.findall(text)
> [('foo', ' foo1 foo2', ' foo2'), ('bar', ' bar1 bar2 bar3', ' bar3')]
> One of us will probably get an RE that properly listifies these
> within the next day ...
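You can't return a variable number of groups from a regex. A match always has exactly as many groups as there are capturing groups in the pattern, and a capturing group that repeats keeps only its last repetition, which is why you only saw ' foo2' and ' bar3'. However, you can capture the whole repeated run in one group and split it afterwards: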
import re
t = 'foo foo1 foo2 bar bar1 bar2 bar3 singleton'
e = re.compile('([a-z]+)((?: +\\1[0-9]+)*)')
# findall yields (stem, tail) tuples; split the tail into its words
print([[x] + y.split() for x, y in e.findall(t)])
[['foo', 'foo1', 'foo2'], ['bar', 'bar1', 'bar2', 'bar3'], ['singleton']]
which seems pretty close to what you want.
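
For anyone who wants the same idea in reusable form, here is a minimal sketch using finditer. The names FAMILY_RE and group_families are made up for illustration; the pattern is the one above, and the last line simply re-runs the original poster's pattern to show that a repeated capturing group reports only its final repetition.

```python
import re

# Hypothetical names; the pattern is the one from the message above.
# Group 2 wraps the whole repeated run, so nothing is lost to repetition.
FAMILY_RE = re.compile(r'([a-z]+)((?: +\1[0-9]+)*)')

def group_families(text):
    """Return [stem, numbered variants...] for each run found in text."""
    families = []
    for match in FAMILY_RE.finditer(text):
        stem, tail = match.group(1), match.group(2)
        # tail is one string such as ' foo1 foo2'; split it on whitespace
        families.append([stem] + tail.split())
    return families

text = 'foo foo1 foo2 bar bar1 bar2 bar3 singleton'
print(group_families(text))
# [['foo', 'foo1', 'foo2'], ['bar', 'bar1', 'bar2', 'bar3'], ['singleton']]

# The original pattern: the repeated group keeps only its last repetition.
print(re.findall(r'([a-z]+)( \1[0-9]+)+', text))
# [('foo', ' foo2'), ('bar', ' bar3')]
```

The split step is doing the work the regex cannot: the pattern only finds where each run starts and ends, and ordinary string handling turns the run into a list.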