Raw string substitution problem

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Wed Dec 16 14:23:59 EST 2009


On Wed, 16 Dec 2009 14:51:08 -0300, Peter Otten <__peter__ at web.de>
wrote:

> Ed Keith wrote:
>
>> --- On Wed, 12/16/09, Gabriel Genellina <gagsl-py2 at yahoo.com.ar> wrote:
>>
>>> Ed Keith <e_d_k at yahoo.com> wrote:
>>>
>>> > I am having a problem when substituting a raw string.
>>> > When I do the following:
>>> >
>>> > re.sub('abc', r'a\nb\nc', '123abcdefg')
>>> >
>>> > I get
>>> >
>>> > """
>>> > 123a
>>> > b
>>> > cdefg
>>> > """
>>> >
>>> > what I want is
>>> >
>>> > r'123a\nb\ncdefg'
>>>
>>> So you'll have to double your backslashes:
>>>
>>> py> re.sub('abc', r'a\\nb\\nc', '123abcdefg')
>>> '123a\\nb\\ncdefg'
>>>
>> That is going to be a nontrivial exercise. I have control over the
>> pattern, but the texts to be substituted and substituted into will be
>> read from user-supplied files. I need to reproduce the exact text that
>> is read from the file.
>
> There is a helper function re.escape() that you can use to sanitize the
> substitution:
>
>>>> print re.sub('abc', re.escape(r'a\nb\nc'), '123abcdefg')
> 123a\nb\ncdefg

Unfortunately re.escape does much more than that: it also escapes every
other non-alphanumeric character, such as the dots here:

py> print re.sub('abc', re.escape(r'a.b.c'), '123abcdefg')
123a\.b\.cdefg

I think the string_escape encoding is what the OP needs:

py> print re.sub('abc', r'a\n(b.c)\nd'.encode("string_escape"), '123abcdefg')
123a\n(b.c)\nddefg
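
The string_escape codec exists only in Python 2, so here is a minimal
sketch of two alternatives that do not depend on it. The helper names
sub_literal and escape_replacement are made up for illustration; they are
not part of the re module. The first passes a function as the replacement,
which re.sub inserts as-is without processing backslashes; the second
doubles the backslashes, as suggested earlier in the thread, but does it
programmatically so it also works for text read from a file.

```python
import re

def sub_literal(pattern, replacement, text):
    """Replace matches of `pattern` in `text` with `replacement`, verbatim."""
    # When the replacement is a callable, re.sub uses its return value
    # directly: no backslash escapes or group references are expanded.
    return re.sub(pattern, lambda match: replacement, text)

def escape_replacement(replacement):
    """Double every backslash so re.sub's template expansion restores the text."""
    return replacement.replace('\\', '\\\\')

# Same data as in the example above.
result_callable = sub_literal('abc', r'a\n(b.c)\nd', '123abcdefg')
result_escaped = re.sub('abc', escape_replacement(r'a\n(b.c)\nd'), '123abcdefg')
```

Both results should equal r'123a\n(b.c)\nddefg', the same text that the
string_escape version prints. The callable form is probably the simpler
choice when the replacement never needs group references.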

-- 
Gabriel Genellina



