Using re.sub with %s

MRAB python at mrabarnett.plus.com
Wed Aug 18 14:50:20 EDT 2010


Thomas Jollans wrote:
> On Wednesday 18 August 2010, it occurred to Brandon Harris to exclaim:
>> Having trouble using %s with re.sub
>>
>> test = '/my/word/whats/wrong'
>> re.sub('(/)word(/)', r'\1\%s\2'%'1000', test)
>>
>> return is /my/@0/whats/wrong
>>
> 
> This has nothing to do with %, of course:
> 
> >>> re.sub('(/)word(/)', r'\1\%d\2'%1000, test)
> '/my/@0/whats/wrong'
> >>> re.sub('(/)word(/)', r'\1\1000\2', test)
> '/my/@0/whats/wrong'
> 
> Let's see if we can get rid of that zero:
> 
> >>> re.sub('(/)word(/)', r'\1\100\2', test)
> '/my/@/whats/wrong'
> 
> So '\100' appears to be getting replaced with '@'. Why?
> 
> >>> '\100'
> '@'
> 
> This is an octal escape. The re module reads \100 in a replacement
> template in much the same way Python reads it in a non-raw string literal.
> 
> >>> chr(int('100', 8))
> '@'
> 
> How to avoid this? Well, if you wanted a literal backslash, you need to
> escape it properly:
> 
> >>> print(re.sub('(/)word(/)', r'\1\\1000\2', test))
> /my/\1000/whats/wrong
> 
> If you didn't want the backslash, then why on earth did you put it there?
> You have to be careful with backslashes; they bite ;-)
> 
> Anyway, you can simply do the formatting after the substitution
> (provided the result contains no other % characters):
> 
> >>> re.sub('(/)word(/)', r'\1%d\2', test) % 1000
> '/my/1000/whats/wrong'
> 
> Or work with match objects to construct the resulting string by hand.
> 
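The match-object route can be sketched with a replacement function. The
string it returns is inserted literally, so neither backslashes nor percent
signs in the data need escaping. The helper name replace_word and its
value parameter are made up here for illustration:

```python
import re

def replace_word(text, value):
    """Replace '/word/' in text with '/<value>/' (hypothetical helper)."""
    # The callable receives each match object; its return value is used
    # as-is, without template processing of backslashes or group references.
    return re.sub('(/)word(/)',
                  lambda m: m.group(1) + str(value) + m.group(2),
                  text)

print(replace_word('/my/word/whats/wrong', 1000))   # /my/1000/whats/wrong
print(replace_word('/my/word/50%/wrong', r'\100'))  # /my/\100/50%/wrong
```
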
To stop a group reference that is followed by digits from being read as an
octal escape in the replacement template, write it as \g<n> instead:

 >>> print(r'\1%s' % '00')
\100
 >>> print(r'\g<1>%s' % '00')
\g<1>00
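
Applied to the original example, that might look like the sketch below.
The names value and template are illustrative and not from the thread:

```python
import re

test = '/my/word/whats/wrong'
value = '1000'  # illustrative name; the text spliced into the replacement

# \g<1> ends the group reference explicitly, so the digits that follow
# cannot be read as part of an octal escape such as \100.
template = r'\g<1>%s\g<2>' % value
result = re.sub('(/)word(/)', template, test)
print(result)  # /my/1000/whats/wrong
```

This still treats value as template text, so a backslash inside it would
be interpreted by re.sub.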


