Nothing to repeat

Tom Anderson twic at urchin.earth.li
Sun Jan 9 11:49:35 EST 2011


Hello everyone, long time no see,

This is probably not a Python problem, but rather a regular expressions 
problem.

I want, for the sake of argument, to match strings comprising any number 
of occurrences of 'spa', interspersed with any number of occurrences of 
'm'. 'Any number' includes zero, so the whole pattern should match the 
empty string.

Here's the conversation Python and i had about it:

Python 2.6.4 (r264:75706, Jun  4 2010, 18:20:16)
[GCC 4.4.4 20100503 (Red Hat 4.4.4-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.compile("(spa|m*)*")
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/usr/lib/python2.6/re.py", line 190, in compile
     return _compile(pattern, flags)
   File "/usr/lib/python2.6/re.py", line 245, in _compile
     raise error, v # invalid expression
sre_constants.error: nothing to repeat

What's going on here? Why is there nothing to repeat? Is the problem 
having one *'d term inside another?

Now, i could actually rewrite this particular pattern as '(spa|m)*'. But 
what i neglected to mention above is that i'm actually generating patterns 
from structures of objects (representations of XML DTDs, as it happens), 
and as it stands, patterns like this are a possibility.
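
For illustration, here is a minimal sketch (not part of the session above) of how the flattened pattern could be checked against a few strings. The helper name `matches`, the non-capturing group, and the `\Z` anchor are assumptions added for the example.

```python
import re

# The flattened form of the pattern, with a non-capturing group.
# \Z anchors the end so the whole string must match, not just a prefix.
flat = re.compile(r"(?:spa|m)*\Z")

def matches(s):
    """Return True if s consists only of 'spa' and 'm' pieces (hypothetical helper)."""
    return flat.match(s) is not None

for s in ["", "spa", "m", "spammmspa", "sp", "eggs"]:
    print(repr(s), matches(s))
```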

Any thoughts on what i should do? Do i have to bite the bullet and apply 
some cleverness in my pattern generation to avoid situations like this?
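
One possible shape for that cleverness, as a sketch only: it relies on the identities (a|b*)* = (a|b)* and (a*)* = a* to drop inner stars before rendering. The tuple-based tree, and the names `strip_star`, `simplify` and `render`, are hypothetical stand-ins for whatever the real DTD objects look like. It does not flatten nested alternations or handle other constructs that can match the empty string.

```python
import re

# Hypothetical mini-tree for generated patterns, as tagged tuples:
#   ("lit", "spa"), ("alt", [node, ...]), ("star", node)

def strip_star(node):
    """Remove any outer stars: x* becomes x."""
    while node[0] == "star":
        node = node[1]
    return node

def simplify(node):
    """Rewrite (a|b*)* as (a|b)* and (a*)* as a*, recursively."""
    kind = node[0]
    if kind == "lit":
        return node
    if kind == "alt":
        return ("alt", [simplify(child) for child in node[1]])
    # kind == "star": the body needs no star of its own
    inner = strip_star(simplify(node[1]))
    if inner[0] == "alt":
        # A starred branch directly under a starred alternation is redundant.
        inner = ("alt", [strip_star(child) for child in inner[1]])
    return ("star", inner)

def render(node):
    """Turn a tree into a regular expression string."""
    kind = node[0]
    if kind == "lit":
        return "(?:%s)" % re.escape(node[1])
    if kind == "alt":
        return "(?:%s)" % "|".join(render(child) for child in node[1])
    return "%s*" % render(node[1])

# The tree corresponding to (spa|m*)*
tree = ("star", ("alt", [("lit", "spa"), ("star", ("lit", "m"))]))
pattern = render(simplify(tree))
compiled = re.compile(pattern + r"\Z")
```

Here `pattern` comes out without the inner star, so the compiled expression no longer repeats a term that can itself match the empty string.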

Thanks,
tom

-- 
If it ain't broke, open it up and see what makes it so bloody special.


