Raw string substitution problem
Alan G Isaac
alan.isaac at gmail.com
Thu Dec 17 10:08:00 EST 2009
> On Wed, 16 Dec 2009 11:09:32 -0300, Ed Keith <e_d_k at yahoo.com> wrote:
>
>> I am having a problem when substituting a raw string. When I do the
>> following:
>>
>> re.sub('abc', r'a\nb\nc', '123abcdefg')
>>
>> I get
>>
>> """
>> 123a
>> b
>> cdefg
>> """
>>
>> what I want is
>>
>> r'123a\nb\ncdefg'
On 12/16/2009 9:35 AM, Gabriel Genellina wrote:
> From http://docs.python.org/library/re.html#re.sub
>
> re.sub(pattern, repl, string[, count])
>
> ...repl can be a string or a function; if
> it is a string, any backslash escapes in
> it are processed. That is, \n is converted
> to a single newline character, \r is
> converted to a carriage return, and so forth.
>
> So you'll have to double your backslashes:
I'm not persuaded that the docs are clear. Consider:

>>> 'ab\\ncd' == r'ab\ncd'
True

Naturally enough. So I think the right answer is:
1. this is a documentation bug (i.e., the documentation
fails to specify unexpected behavior for raw strings), or
2. this is a bug (i.e., raw strings are not handled correctly
when used as replacements)
I vote for 2.
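For reference, here is a minimal sketch that reproduces the behavior under discussion, using only the strings from the original post. The variable names (repl, result, doubled) are mine, chosen for illustration.

```python
import re

# The replacement from the original post: a, backslash, n, b, backslash, n, c.
repl = r'a\nb\nc'
assert repl == 'a\\nb\\nc'  # same seven-character string, spelled two ways

# Passed as a string, the replacement has its backslash escapes processed,
# so each backslash-n pair in it becomes a real newline.
result = re.sub('abc', repl, '123abcdefg')
print(repr(result))   # '123a\nb\ncdefg'  (two real newlines)

# Doubling the backslashes, as Gabriel suggests, yields literal backslashes.
doubled = re.sub('abc', r'a\\nb\\nc', '123abcdefg')
print(repr(doubled))  # '123a\\nb\\ncdefg'  (equal to r'123a\nb\ncdefg')
```

The second call produces the output Ed asked for, but only because the replacement was written with four backslash characters in total rather than two.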
Peter's use of a function highlights just how odd this is:
getting the raw string via a function produces a different
result than providing it directly. If this is really the
way things ought to be, I'd appreciate a clear explanation
of why.
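Peter's message is not quoted here, so the following is only a sketch of the kind of function-based replacement I mean, not necessarily his exact code. The names repl, direct and via_function are hypothetical.

```python
import re

repl = r'a\nb\nc'

# Passed directly, the string has its backslash escapes processed,
# so each backslash-n pair becomes a newline.
direct = re.sub('abc', repl, '123abcdefg')

# Returned from a function, the very same string is inserted as-is,
# with no escape processing.
via_function = re.sub('abc', lambda match: repl, '123abcdefg')

print(direct == via_function)             # False
print(via_function == r'123a\nb\ncdefg')  # True
```

Same string object, two different results, depending only on whether it arrives as the repl argument itself or as the return value of a callable.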
Alan Isaac