Using re.sub with %s
MRAB
python at mrabarnett.plus.com
Wed Aug 18 14:50:20 EDT 2010
Thomas Jollans wrote:
> On Wednesday 18 August 2010, it occurred to Brandon Harris to exclaim:
>> Having trouble using %s with re.sub
>>
>> test = '/my/word/whats/wrong'
>> re.sub('(/)word(/)', r'\1\%s\2'%'1000', test)
>>
>> return is /my/@0/whats/wrong
>>
>
> This has nothing to do with %, of course:
>
> >>> re.sub('(/)word(/)', r'\1\%d\2'%1000, test)
> '/my/@0/whats/wrong'
> >>> re.sub('(/)word(/)', r'\1\1000\2', test)
> '/my/@0/whats/wrong'
>
> let's see if we can get rid of that zero:
>
> >>> re.sub('(/)word(/)', r'\1\100\2', test)
> '/my/@/whats/wrong'
>
> so '\100' appears to be getting replaced with '@'. Why?
>
> >>> '\100'
> '@'
>
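> This is an octal escape: both Python string literals and re replacement
> templates read a backslash followed by three octal digits as a character
> code. The template above is a raw string, so here it is re doing the
> conversion.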
>
> >>> chr(int('100', 8))
> '@'
>
> How to avoid this? Well, if you wanted the literal backslash, you'll need to
> escape it properly:
>
> >>> print(re.sub('(/)word(/)', r'\1\\1000\2', test))
> /my/\1000/whats/wrong
>
> If you didn't want the backslash, then why on earth did you put it there? You
> have to be careful with backslashes, they bite ;-)
>
> Anyway, you can simply do the formatting after the substitution (as long
> as the input contains no other % characters).
>
> >>> re.sub('(/)word(/)', r'\1%d\2', test) % 1000
> '/my/1000/whats/wrong'
>
> Or work with match objects to construct the resulting string by hand.
>
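A minimal sketch of the match-object approach mentioned at the end of the
quoted message, assuming the same test string as above. The function name
replace_word and the variable value are made up for illustration. When
re.sub is given a function instead of a template string, it calls the
function with each match object and uses the returned string as-is, so no
backslash or octal processing is applied to it:

```python
import re

test = '/my/word/whats/wrong'
value = 1000  # hypothetical value to splice in

def replace_word(match):
    # Build the result from the captured groups directly; the returned
    # string is inserted verbatim, with no template escape processing.
    return '%s%d%s' % (match.group(1), value, match.group(2))

result = re.sub('(/)word(/)', replace_word, test)
print(result)  # /my/1000/whats/wrong
```

Unlike formatting the result afterwards, this is not affected by any %
characters that happen to be in the input string.
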
You can stop group references which are followed by digits from turning
into octal escapes in the replacement template by using \g<n> instead:
>>> print(r'\1%s' % '00')
\100
>>> print(r'\g<1>%s' % '00')
\g<1>00
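
As a sketch, here is the same idea applied to the original substitution,
assuming the goal was '/my/1000/whats/wrong'. The variable names value and
template are illustrative only:

```python
import re

test = '/my/word/whats/wrong'
value = '1000'  # hypothetical value to splice in

# \g<1> ends the group reference explicitly, so the digits that follow
# are literal text rather than the start of an octal escape.
template = r'\g<1>%s\g<2>' % value
result = re.sub('(/)word(/)', template, test)
print(result)  # /my/1000/whats/wrong
```

Note that this only protects the group references; if the inserted value
could itself contain backslashes, it would still be read as part of the
template.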