[Tutor] backreferences - \0

Steven D'Aprano steve at pearwood.info
Sun Jun 6 10:26:18 CEST 2010


On Sun, 6 Jun 2010 05:32:26 pm Payal wrote:
> Hi,
> A newbie re query.
>
> >>> import re
> >>> s='one'
> >>> re.sub('(one)','only \\0',s)
> 'only \x00'
> >>> re.sub('(one)','only \0',s)
> 'only \x00'
>
> I expected the output to be 'only one' with \0 behaving like "&" in
> sed. What is wrong with my syntax?

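Two things. Firstly, the Python regex engine numbers groups from 1, not 
0, so to refer to your group you need \1 and not \0. (The whole match, 
which is what "&" gives you in sed, is spelled \g<0> in a replacement 
string.)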

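Secondly, in your second example you neglected to escape the escape, so 
Python interpreted the string "only \0" as "only " plus the ASCII null 
character before the regex engine ever saw it. In your first example the 
backslash did reach the regex engine, but there \0 is read as an octal 
escape, which again gives the null character rather than a 
backreference.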
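
If it helps to see what Python does with the backslash before re.sub is 
even called, here is a small sketch. The variable names are only for 
illustration:

```python
# How Python itself reads the three spellings of the replacement string.
plain = 'only \0'      # \0 becomes a single NUL character
escaped = 'only \\0'   # a backslash followed by the digit 0
raw = r'only \0'       # the same two characters, written as a raw string

print(len(plain), len(escaped), len(raw))   # 6 7 7
print(plain == 'only \x00')                 # True
print(escaped == raw)                       # True
```

Only the escaped and raw spellings deliver a backslash to the regex 
engine at all.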

Fixing both those problems, you can either escape the escapes, which is 
tedious for large regexes:

>>> re.sub('(one)', 'only \\1', s)
'only one'

or better, use the raw string syntax so Python doesn't interpret 
backslashes specially:

>>> re.sub('(one)', r'only \1', s)
'only one'
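
If what you really want is the sed-style "&", meaning the whole match 
rather than a numbered group, a minimal sketch along these lines should 
work. It uses \g<0> and, as an alternative, a function as the 
replacement; the variable names are just for illustration:

```python
import re

s = 'one'

# \1 refers to the first parenthesised group.
with_group = re.sub('(one)', r'only \1', s)

# \g<0> refers to the whole match, so no group is needed.
# This is the closest equivalent to "&" in sed.
whole_match = re.sub('one', r'only \g<0>', s)

# A function replacement receives the match object;
# group(0) is the whole match.
via_function = re.sub('one', lambda m: 'only ' + m.group(0), s)

print(with_group)     # only one
print(whole_match)    # only one
print(via_function)   # only one
```

The \g<0> form saves you from adding parentheses to the pattern just to 
get at the matched text.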



-- 
Steven D'Aprano

