Bug? concatenate a number to a backreference: re.sub(r'(zzz:)xxx', r'\1'+str(4444), somevar)
Peter Otten
__peter__ at web.de
Fri Oct 23 07:54:39 EDT 2009
abdulet wrote:
> Well, is this normal? I want to concatenate a number to a
> backreference in a regular expression. I'm working on a multiprocess
> script, so at first I thought it was an error in the multiprocess
> logic, but what a surprise! After some time debugging I came to this
> conclusion:
>
> import re
> aa = "zzz:xxx"
> re.sub(r'(zzz:).*',r'\1'+str(3333),aa)
> '[33'
If you perform the addition you get r"\13333". How should the regular
expression engine interpret that? As the backreference to group 1, 13, ...
or 13333? It picks something completely different, "[33", because "\133" is
the octal escape sequence for "[":
>>> chr(0o133)
'['
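The behaviour is easy to reproduce in isolation. The following is only a small sketch with made-up variable names; it builds the same replacement template as in the quoted session and shows that its first three digits are consumed as an octal escape:

```python
import re

text = "zzz:xxx"
template = r"\1" + str(3333)   # the five-character escape \13333

# \133 is read as an octal escape (0o133 == 91 == "["),
# and the remaining "33" is copied as literal text.
broken = re.sub(r"(zzz:).*", template, text)

print(repr(broken))     # '[33'
print(chr(0o133))       # [
```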
You can avoid the ambiguity with
extra = str(number)
extra = re.escape(extra)
re.sub(expr, r"\g<1>" + extra, text)
The re.escape() step is not necessary here. Note that re.escape() is meant
for patterns: when extra is an arbitrary string used in a replacement,
escape only the backslashes, e.g. extra.replace("\\", r"\\").
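A minimal sketch of the \g<1> form applied to the example from the question, together with one possible alternative: passing a function as the replacement, so that extra is never parsed as a template at all. The helper name append_after_group is hypothetical and only serves the illustration.

```python
import re

def append_after_group(pattern, extra, text):
    """Replace each match with its first group followed by extra, taken literally."""
    # The return value of a replacement function is inserted as-is;
    # backslashes and digits in it get no special treatment.
    return re.sub(pattern, lambda m: m.group(1) + extra, text)

text = "zzz:xxx"

# \g<1> ends the group reference explicitly, so the digits that follow are literal.
with_g = re.sub(r"(zzz:).*", r"\g<1>" + str(3333), text)

# Same result through the function form, with no escaping needed.
with_func = append_after_group(r"(zzz:).*", "3333", text)

# Even a string that looks like a backreference is appended unchanged.
with_backslash = append_after_group(r"(zzz:).*", r"\1.5", text)

print(with_g)          # zzz:3333
print(with_func)       # zzz:3333
print(with_backslash)  # zzz:\1.5
```

The function form costs one call per match, which only matters for very large inputs.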
Peter