Different number of matches from re.findall and re.split
jlconlin at gmail.com
Mon Jan 11 10:49:28 EST 2010
On Jan 11, 8:44 am, Iain King <iaink... at gmail.com> wrote:
> On Jan 11, 3:35 pm, Jeremy <jlcon... at gmail.com> wrote:
> > Hello all,
> > I am using re.split to separate some text into logical structures.
> > The trouble is that re.split doesn't find everything while re.findall
> > does; i.e.:
> > >>> found = re.findall('^ 1', line, re.MULTILINE)
> > >>> len(found)
> > 6439
> > >>> tables = re.split('^ 1', line, re.MULTILINE)
> > >>> len(tables)
> > 1
> > Can someone explain why these two commands are giving different
> > results? I thought I should have the same number of matches (or maybe
> > different by 1, but not 6000!)
> > Thanks,
> > Jeremy
> re.split doesn't take re.MULTILINE as a flag: it doesn't take any
> flags. It does take a maxsplit parameter, to which you are passing the
> value of re.MULTILINE (which happens to be 8 in my implementation).
> Without the MULTILINE flag, ^ matches only at the very start of the
> string, not at the start of each line, so your re.split never finds
> the other line starts.
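
A minimal sketch of the mismatch described above, for anyone who wants to reproduce it. The sample text and the variable names are made up for illustration; they are not Jeremy's data. The first split mirrors the original call by handing the value of re.MULTILINE to maxsplit. The flags= keyword in the second split is accepted by re.split only in later Python versions (2.7 and 3.1 onward, as far as the standard library documentation goes), so treat that line as an assumption about your interpreter. A compiled pattern carries its own flags and sidesteps the question.

```python
import re

# Hypothetical sample: a header line followed by three lines starting with " 1".
text = "header\n 1 alpha\n 1 beta\n 1 gamma\n"
pattern = r"^ 1"

# findall's third parameter is flags, so MULTILINE takes effect here.
found = re.findall(pattern, text, re.MULTILINE)

# split's third parameter is maxsplit. This mirrors the original call:
# the flag's integer value is used as a split limit and ^ stays anchored
# to the start of the whole string, where this text has no " 1".
as_maxsplit = re.split(pattern, text, maxsplit=int(re.MULTILINE))

# Passing the flag by keyword (assumes a Python version whose re.split
# accepts flags=).
with_flag = re.split(pattern, text, flags=re.MULTILINE)

# A compiled pattern keeps its flags, whatever split's signature is.
compiled = re.compile(pattern, re.MULTILINE).split(text)

print(len(found))        # 3
print(len(as_maxsplit))  # 1
print(len(with_flag))    # 4
print(len(compiled))     # 4
```

With the flag actually applied, the split yields one more piece than findall has matches, which is the "different by 1" Jeremy expected.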
Yep. Thanks for pointing that out. I guess I just assumed that
re.split was similar to re.search/match/findall in what it accepted as
function parameters. I guess I'll have to use a \n instead of a ^ for