Nothing to repeat
Ian
hobson42 at gmail.com
Sun Jan 9 12:49:10 EST 2011
On 09/01/2011 16:49, Tom Anderson wrote:
> Hello everyone, long time no see,
>
> This is probably not a Python problem, but rather a regular
> expressions problem.
>
> I want, for the sake of argument, to match strings comprising any
> number of occurrences of 'spa', each interspersed by any number of
> occurrences of 'm'. 'Any number' includes zero, so the whole
> pattern should match the empty string.
>
> Here's the conversation Python and I had about it:
>
> Python 2.6.4 (r264:75706, Jun 4 2010, 18:20:16)
> [GCC 4.4.4 20100503 (Red Hat 4.4.4-2)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import re
> >>> re.compile("(spa|m*)*")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python2.6/re.py", line 190, in compile
>     return _compile(pattern, flags)
>   File "/usr/lib/python2.6/re.py", line 245, in _compile
>     raise error, v # invalid expression
> sre_constants.error: nothing to repeat
>
> What's going on here? Why is there nothing to repeat? Is the problem
> having one *'d term inside another?
>
> Now, I could actually rewrite this particular pattern as '(spa|m)*'.
> But what I neglected to mention above is that I'm actually generating
> patterns from structures of objects (representations of XML DTDs, as
> it happens), and as it stands, patterns like this are a possibility.
>
> Any thoughts on what I should do? Do I have to bite the bullet and
> apply some cleverness in my pattern generation to avoid situations
> like this?
>
> Thanks,
> tom
>
I think you want to anchor your pattern, or anything will match. Python
patterns take no '/' delimiters, and 'm' has to be optional, so perhaps
re.compile('^(spa|m)*$')
is what you need.
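
Here is a minimal sketch of the anchoring idea. It assumes the simplified alternation '(spa|m)*' that Tom mentions, so there is no starred term nested inside another star. The names SPAM and is_spam are made up for illustration, and the group is written as non-capturing only because the captured text is not needed.

```python
import re

# Anchored pattern: zero or more of 'spa' or 'm', spanning the whole string.
# Note that '$' also matches just before a trailing newline.
SPAM = re.compile(r'^(?:spa|m)*$')


def is_spam(text):
    """Return True if text consists only of 'spa' and 'm' pieces (or is empty)."""
    return SPAM.match(text) is not None


if __name__ == '__main__':
    # Try a few strings that should and should not match.
    for candidate in ['', 'spa', 'spam', 'mspammmspa', 'spax', 'sp']:
        print(repr(candidate), is_spam(candidate))
```

Without the '^' and '$' anchors the pattern can match the empty string at the start of any input, so match() would succeed for every string.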
Regards
Ian