[Tutor] backreferences - \0
Steven D'Aprano
steve at pearwood.info
Sun Jun 6 10:26:18 CEST 2010
On Sun, 6 Jun 2010 05:32:26 pm Payal wrote:
> Hi,
> A newbie re query.
>
> >>> import re
> >>> s='one'
> >>> re.sub('(one)','only \\0',s)
>
> 'only \x00'
>
> >>> re.sub('(one)','only \0',s)
>
> 'only \x00'
>
> I expected the output to be 'only one' with \0 behaving like "&" in
> sed. What is wrong with my syntax?
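Two things. Firstly, the Python regex engine numbers group
backreferences from 1, not 0, so you need \1 and not \0. (In a
replacement string, the whole match, like "&" in sed, is written \g<0>.)
Secondly, in your second attempt you neglected to escape the escape, so
Python interpreted the string "only \0" as "only " plus the ASCII null
character, which has no special meaning to the regex engine. Your first
attempt did escape the backslash, but the regex engine itself reads \0
in a replacement as an octal escape, so you got the same null character.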
Fixing both those problems, you can either escape the escapes, which is
tedious for large regexes:
>>> re.sub('(one)', 'only \\1', s)
'only one'
or better, use the raw string syntax so Python doesn't interpret
backslashes specially:
>>> re.sub('(one)', r'only \1', s)
'only one'
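If what you really want is the behaviour of "&" in sed, here is a small
sketch comparing the options. The variable names are just mine for
illustration, and I'm assuming the same s = 'one' as in your session:

```python
import re

s = 'one'

# \1 refers to the first parenthesised group; the raw string keeps
# the backslash intact so the regex engine sees it.
with_group = re.sub('(one)', r'only \1', s)

# \g<0> refers to the whole match, so the pattern does not need a
# capturing group at all. This is the closest analogue to sed's "&".
whole_match = re.sub('one', r'only \g<0>', s)

# Without the raw string, Python converts \0 into a null character
# before the regex engine ever sees the replacement.
null_char = re.sub('(one)', 'only \0', s)

print(repr(with_group))   # 'only one'
print(repr(whole_match))  # 'only one'
print(repr(null_char))    # 'only \x00'
```

The \g<0> form is handy when you don't otherwise need a group in the
pattern.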
--
Steven D'Aprano