Raw string substitution problem
D'Arcy J.M. Cain
darcy at druid.net
Thu Dec 17 12:19:52 EST 2009
On Thu, 17 Dec 2009 11:51:26 -0500
Alan G Isaac <alan.isaac at gmail.com> wrote:
> >>> re.sub('abc', r'a\nb\n.c\a','123abcdefg') == re.sub('abc', 'a\\nb\\n.c\\a',' 123abcdefg') == re.sub('abc', 'a\nb\n.c\a','123abcdefg')
> True
Was this a straight cut and paste or did you make a manual change? Is
that leading space in the middle one a copying error? I get False for
what you actually have there for obvious reasons.
> >>> r'a\nb\n.c\a' == 'a\\nb\\n.c\\a' == 'a\nb\n.c\a'
> False
>
> Why are the first two strings being treated as if they are the last one?
They aren't. The last string is different.
>>> for x in (r'a\nb\n.c\a', 'a\\nb\\n.c\\a', 'a\nb\n.c\a'): print repr(x)
...
'a\\nb\\n.c\\a'
'a\\nb\\n.c\\a'
'a\nb\n.c\x07'
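The sub() calls still agree because re.sub does its own backslash
processing on the replacement string, after Python has finished with the
literal. A rough sketch of that, written for Python 3 (the variable names
are only for illustration):

```python
import re

raw = r'a\nb\n.c\a'        # Python leaves the backslashes alone
doubled = 'a\\nb\\n.c\\a'  # same characters as raw
cooked = 'a\nb\n.c\a'      # Python already produced newline and bell chars

# As plain strings, only the first two are equal.
assert raw == doubled
assert raw != cooked

# re.sub treats \n and \a in the replacement as escapes of its own, so the
# first two end up producing the same text as the third.
results = [re.sub('abc', repl, '123abcdefg') for repl in (raw, doubled, cooked)]
print([repr(r) for r in results])
```

So the comparison of the strings is False while the comparison of the
sub() results is True.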
> That is, why isn't '\\' being processed in the obvious way?
> This still seems wrong. Why isn't it?
What do you think is wrong? What would the "obvious" way of handling
'\\' be?
>
> More simply, consider::
>
> >>> re.sub('abc', '\\', '123abcdefg')
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File "C:\Python26\lib\re.py", line 151, in sub
> return _compile(pattern, 0).sub(repl, string, count)
> File "C:\Python26\lib\re.py", line 273, in _subx
> template = _compile_repl(template, pattern)
> File "C:\Python26\lib\re.py", line 260, in _compile_repl
> raise error, v # invalid expression
> sre_constants.error: bogus escape (end of line)
>
> Why is this the proper handling of what one might think would be an
> obvious substitution?
Is this what you want? What you have is a replacement string consisting
of a single backslash with nothing after it to escape (it hits the end
of the string) so it barfs.
>>> re.sub('abc', r'\\', '123abcdefg')
'123\\defg'
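If the replacement text should never be interpreted at all, another option
is to pass a function instead of a string, since re.sub uses a function's
return value as-is. A minimal sketch for Python 3; literal_sub is a
made-up helper name, not something in the re module:

```python
import re

def literal_sub(pattern, text, subject):
    """Replace matches of pattern with text, taken literally."""
    # A callable replacement skips the template escape processing.
    return re.sub(pattern, lambda match: text, subject)

# One real backslash, written two different ways.
one_backslash = literal_sub('abc', '\\', '123abcdefg')
via_template = re.sub('abc', r'\\', '123abcdefg')
print(repr(one_backslash), repr(via_template))
```

Both calls insert a single backslash; the function form just avoids having
to double it for the re module.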
--
D'Arcy J.M. Cain <darcy at druid.net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.