Raw string substitution problem
Gabriel Genellina
gagsl-py2 at yahoo.com.ar
Wed Dec 16 14:23:59 EST 2009
On Wed, 16 Dec 2009 14:51:08 -0300, Peter Otten <__peter__ at web.de>
wrote:
> Ed Keith wrote:
>
>> --- On Wed, 12/16/09, Gabriel Genellina <gagsl-py2 at yahoo.com.ar> wrote:
>>
>>> Ed Keith <e_d_k at yahoo.com> wrote:
>>>
>>> > I am having a problem when substituting a raw string.
>>> > When I do the following:
>>> >
>>> > re.sub('abc', r'a\nb\nc', '123abcdefg')
>>> >
>>> > I get
>>> >
>>> > """
>>> > 123a
>>> > b
>>> > cdefg
>>> > """
>>> >
>>> > what I want is
>>> >
>>> > r'123a\nb\ncdefg'
>>>
>>> So you'll have to double your backslashes:
>>>
>>> py> re.sub('abc', r'a\\nb\\nc', '123abcdefg')
>>> '123a\\nb\\ncdefg'
>>>
>> That is going to be a nontrivial exercise. I have control over the
>> pattern, but the texts to be substituted and substituted into will be
>> read from user-supplied files. I need to reproduce the exact text that
>> is read from the file.
>
> There is a helper function re.escape() that you can use to sanitize the
> substitution:
>
> >>> print re.sub('abc', re.escape(r'a\nb\nc'), '123abcdefg')
> 123a\nb\ncdefg

Unfortunately re.escape does much more than that: it escapes every
non-alphanumeric character, not just backslashes:

py> print re.sub('abc', re.escape(r'a.b.c'), '123abcdefg')
123a\.b\.cdefg

I think the string_escape encoding is what the OP needs:

py> print re.sub('abc', r'a\n(b.c)\nd'.encode("string_escape"), '123abcdefg')
123a\n(b.c)\nddefg
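
Note that string_escape is a Python 2 codec and is not available in
Python 3. A version-independent alternative is to pass a function as the
replacement: re.sub inserts whatever the function returns verbatim, with
no backslash processing at all. Doubling only the backslashes works too.
The following is a minimal sketch, assuming Python 3; sub_literal and
sub_doubled are hypothetical helper names, not part of the re module:

```python
import re

def sub_literal(pattern, replacement, text):
    """Replace each match of pattern in text with replacement, verbatim."""
    # A callable repl is not scanned for backslash escapes or group
    # references; its return value is inserted as-is.
    return re.sub(pattern, lambda match: replacement, text)

def sub_doubled(pattern, replacement, text):
    """Same result, obtained by doubling only the backslashes."""
    # re.sub turns each doubled backslash in the template back into one.
    return re.sub(pattern, replacement.replace("\\", "\\\\"), text)

print(sub_literal('abc', r'a\n(b.c)\nd', '123abcdefg'))
print(sub_doubled('abc', r'a\n(b.c)\nd', '123abcdefg'))
```

Both calls print 123a\n(b.c)\nddefg, and neither touches characters other
than the backslash, so text read from a user-supplied file comes through
unchanged.
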
--
Gabriel Genellina