Regular expression bug?

Fri Feb 20 14:11:11 EST 2009

A more elegant way:

>>> import re
>>> a = 'fooBarBaz'
>>> [x for x in re.split('([A-Z]+[a-z]+)', a) if x]
['foo', 'Bar', 'Baz']
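
If this is needed in more than one place, the same expression could be wrapped in a small helper. This is only a sketch: the name split_camel is made up for illustration, and it assumes plain ASCII letters as in the example string.

```python
import re

def split_camel(s):
    # Hypothetical helper (name chosen for illustration only).
    # The capturing group makes re.split() keep the capitalized words;
    # the "if x" filter drops the empty strings it leaves between them.
    return [x for x in re.split('([A-Z]+[a-z]+)', s) if x]

print(split_camel('fooBarBaz'))  # ['foo', 'Bar', 'Baz']
```

Note that a string with no capitalized word comes back as a single-element list, and an empty string gives an empty list.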

R.

On Feb 20, 2:03 pm, Lie Ryan <lie.1... at gmail.com> wrote:
> On Thu, 19 Feb 2009 13:03:59 -0800, Ron Garret wrote:
> > In article <gnkdal$bcq$0... at news.t-online.com>,
> >  Peter Otten <__pete... at web.de> wrote:
>
> >> Ron Garret wrote:
>
> >> > I'm trying to split a CamelCase string into its constituent
> >> > components.
>
> >> How about
>
> >> >>> re.compile("[A-Za-z][a-z]*").findall("fooBarBaz")
> >> ['foo', 'Bar', 'Baz']
>
> > That's very clever.  Thanks!
>
> >> > (BTW, I tried looking at the source code for the re module, but I
> >> > could not find the relevant code.  re.split calls
> >> > sre_compile.compile().split, but the string 'split' does not appear
> >> > in sre_compile.py.  So where does this method come from?)
>
> >> It's coded in C. The source is Modules/_sre.c.
>
> > Ah.  Thanks!
>
> > rg
>
> This re.split() doesn't consume any characters:
>
> >>> re.split('([A-Z][a-z]*)', 'fooBarBaz')
>
> ['foo', 'Bar', '', 'Baz', '']
>
> it does what the OP wants, albeit with extra blank strings.
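
The blank strings in that result seem to come from how re.split() treats adjacent matches: with a capturing group it returns the text between matches interleaved with the matches themselves, and here there is nothing between 'Bar' and 'Baz' or after 'Baz'. A minimal sketch of this, using the same pattern and string as above (the variable names are just for illustration):

```python
import re

s = 'fooBarBaz'
pattern = re.compile('([A-Z][a-z]*)')

# With a capturing group, split() returns the pieces between matches
# interleaved with the matched text itself.
parts = pattern.split(s)
print(parts)  # ['foo', 'Bar', '', 'Baz', '']

# 'Bar' and 'Baz' are adjacent, so the piece between them is '';
# 'Baz' ends the string, so the piece after it is '' as well.
words = [p for p in parts if p]
print(words)  # ['foo', 'Bar', 'Baz']
```

Filtering on truthiness is enough here because the only unwanted entries are empty strings.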