When not to use an RE -- an example

Sat Apr 19 19:44:22 EDT 2003

sjmachin at lexicon.net (John Machin) writes:

> I needed a check for strings consisting of repeated characters -- like
> when users type "ZZZZZZZ" instead of "UNKNOWN" into a database field.
> After implementing the obvious overlapping-substring comparison, I got
> to thinking how this could be done with REs. The following resulted:
> 
> import re
> repeats1 = re.compile(r"^(?:(.)(?=\1))+\1\Z", re.DOTALL).match
> def repeats2(s):
>    return len(s) > 1 and s[1:] == s[:-1]
> for testvalue, expected in zip(
>    ['','x','xx','xxx','xxxxxx','xy','xxy','xyy','\n\n\n','aaa\n'],
>    [0,  0,  1,   1,    1,       0,   0,    0,    1,       0     ]):
>    print repr(testvalue), not not repeats1(testvalue), repeats2(testvalue), expected
> 
> Pergly/phugly, eh? Note the effort required to ensure the newline
> cases worked.

What's wrong with:

>>> import re
>>> matchRepetition = re.compile(r'(.)\1+\Z', re.DOTALL).match
>>> map(bool, map(matchRepetition,
...  ['','x','xx','xxx','xxxxxx','xy','xxy','xyy','\n\n\n','aaa\n']))
[0, 0, 1, 1, 1, 0, 0, 0, 1, 0]

?

At least it would seem to pass your test cases.
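
For anyone who wants to compare the three variants side by side, here is a rough sketch that runs the lookahead RE, the shorter backreference RE, and the slice comparison over the same test values. It is written for Python 3 (so print is a function, unlike the snippets above), and the names repeats_lookahead, repeats_backref, repeats_slice, CASES, and check_all are made up for this illustration; they do not appear in the original posts.

```python
import re

# Hypothetical names for illustration; patterns are copied from the thread.
# re.DOTALL lets "." match a newline, which the "\n\n\n" case relies on.
repeats_lookahead = re.compile(r"^(?:(.)(?=\1))+\1\Z", re.DOTALL).match
repeats_backref = re.compile(r"(.)\1+\Z", re.DOTALL).match


def repeats_slice(s):
    """Non-RE check: true when s has 2+ characters and all are the same."""
    return len(s) > 1 and s[1:] == s[:-1]


# Test values and expected results from the original post.
CASES = [
    ("", False), ("x", False), ("xx", True), ("xxx", True),
    ("xxxxxx", True), ("xy", False), ("xxy", False), ("xyy", False),
    ("\n\n\n", True), ("aaa\n", False),
]


def check_all():
    """Return (value, (lookahead, backref, slice), expected) per test case."""
    results = []
    for value, expected in CASES:
        got = (
            bool(repeats_lookahead(value)),
            bool(repeats_backref(value)),
            repeats_slice(value),
        )
        results.append((value, got, expected))
    return results


for value, got, expected in check_all():
    print(repr(value), got, expected)
```

The "aaa\n" case is the one to watch: \Z only matches at the very end of the string, whereas $ would also match just before a trailing newline.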

'as