Regular expression bug?
Ron Garret
rNOSPAMon at flownet.com
Thu Feb 19 16:08:19 EST 2009
In article <mailman.273.1235071607.11746.python-list at python.org>,
Albert Hopkins <marduk at letterboxes.org> wrote:
> On Thu, 2009-02-19 at 10:55 -0800, Ron Garret wrote:
> > I'm trying to split a CamelCase string into its constituent components.
> > This kind of works:
> >
> > >>> re.split('[a-z][A-Z]', 'fooBarBaz')
> > ['fo', 'a', 'az']
> >
> > but it consumes the boundary characters. To fix this I tried using
> > lookahead and lookbehind patterns instead, but it doesn't work:
>
> That's how re.split works, same as str.split...
I think one could make the argument that 'foo'.split('') ought to return
['f','o','o'].
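
For the record, a minimal sketch of what actually happens with an empty separator. str.split refuses it outright, and list() gives the per-character result I had in mind. The helper name split_chars is made up for illustration:

```python
def split_chars(text):
    # Hypothetical helper: list() on a string yields one item per character,
    # which is what an "empty separator" split would arguably produce.
    return list(text)

try:
    'foo'.split('')
except ValueError as exc:
    # str.split does not accept an empty separator
    print('str.split rejects it:', exc)

print(split_chars('foo'))
```
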
>
> > >>> re.split('((?<=[a-z])(?=[A-Z]))', 'fooBarBaz')
> > ['fooBarBaz']
> >
> > However, it does seem to work with findall:
> >
> > >>> re.findall('(?<=[a-z])(?=[A-Z])', 'fooBarBaz')
> > ['', '']
>
>
> Wow!
>
> To tell you the truth, I can't even read that...
It's a regexp. Of course you can't read it. ;-)
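
In the meantime, here is a sketch of one possible workaround that avoids re.split altogether, so it does not depend on how split treats empty matches. It uses re.sub to drop a marker at each lowercase-to-uppercase boundary and then splits on the marker. The function name split_camel_case is my own, and it assumes the input contains no whitespace:

```python
import re

def split_camel_case(name):
    # Insert a space at every zero-width boundary between a lowercase
    # letter and an uppercase letter, using the same lookbehind/lookahead
    # pattern as above. Assumes `name` has no whitespace of its own.
    marked = re.sub(r'(?<=[a-z])(?=[A-Z])', ' ', name)
    # Split on the inserted spaces.
    return marked.split()

print(split_camel_case('fooBarBaz'))
```

This only handles the simple lower-then-upper case; runs of capitals or digits would need a different pattern.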
rg