Regexp syntax change in 1.6?

Gareth McCaughan Gareth.McCaughan at
Sat Sep 9 00:03:56 CEST 2000

Adam Sampson wrote:

> Under Python 1.5.2, I had a script containing the following line:
> m = re.match(r"[a-z0-9]*://[^/]+/.*\.([^.#\?/]*)([#\?]?.*)?", url)
> (Bonus points for guessing what it does; answer down the bottom.)
> Under 1.6, this fails with:
> sre_constants.error: nothing to repeat 
> I can narrow it down to:
> >>> import re
> >>> m = re.match(r"(x?)?", url)
> sre_constants.error: nothing to repeat 
> whereas:
> >>> m = re.match(r"(x?.)?", url)
> works fine. Is this correct behaviour for SRE, or am I just being stupid?
> "(x?)?" looks like a perfectly reasonable Perl-style regexp to me (and Perl
> too)...

Well, (x?)? should be equivalent to (x)? or (x?), so
perhaps it's reasonable to issue a warning. An
outright error seems rather harsh.
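
A rough way to check that equivalence, as a sketch only: it assumes a
current Python 3 interpreter, not the 1.5.2 or 1.6 engines discussed in
this thread, and the helper name whole_matches is made up for
illustration. It compares only the text matched overall, not what ends
up in the group.

```python
import re

# The three spellings claimed to be equivalent above.
PATTERNS = [r"(x?)?", r"(x)?", r"(x?)"]

def whole_matches(text):
    """Return the overall match (group 0) of each pattern against text."""
    # Every pattern here can match the empty string, so re.match never returns None.
    return [re.match(p, text).group(0) for p in PATTERNS]

for sample in ["x", "", "xy"]:
    print(repr(sample), whole_matches(sample))
```

If the three entries in each printed list agree, the patterns consume
the same text for that input.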

For your actual case, the closing

    ([xyz]?.*)?

(contents of charset changed for clarity) could be replaced by

    ([xyz]?.*)

without any loss. (If ([xyz]?.*)? matches then either
([xyz]?.*) matches or an empty string does; but an
empty string also matches ([xyz]?.*). The only scope
for a difference is in whether the corresponding
match group gets '' or None; but it turns out that
in Python 1.5.2 it gets '' anyway, just as it does
with the "simplified" RE that I suggest.)
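
The argument above can be tried out with a small sketch. It assumes a
current Python 3 interpreter (not 1.5.2 or 1.6) and uses the original
charset [#\?] rather than [xyz]; the names OPTIONAL, SIMPLIFIED and
compare are invented for this example. It checks that both forms match
the same text and shows what the simplified form puts in its group.

```python
import re

# Closing part of the original pattern, and the simplified form suggested above.
OPTIONAL = re.compile(r"([#\?]?.*)?")
SIMPLIFIED = re.compile(r"([#\?]?.*)")

def compare(tail):
    """Return (same_overall_match, simplified_group) for a URL tail."""
    a = OPTIONAL.match(tail)
    b = SIMPLIFIED.match(tail)
    # Both patterns can match the empty string, so neither result is None.
    return a.group(0) == b.group(0), b.group(1)

for tail in ["", "#top", "?q=1"]:
    print(repr(tail), compare(tail))
```

For an empty tail the simplified pattern's group is '' rather than
None, which is the case the parenthetical above is concerned with.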

Gareth McCaughan  Gareth.McCaughan at
sig under construction
