Raw string substitution problem
Rhodri James
rhodri at wildebst.demon.co.uk
Thu Dec 17 19:59:12 EST 2009
On Thu, 17 Dec 2009 20:18:12 -0000, Alan G Isaac <alan.isaac at gmail.com>
wrote:
> So is the bottom line the following?
> A string replacement is not just "converted"
> as described in the documentation, essentially
> it is compiled?
That depends entirely on what you mean.
> But that cannot quite be right. E.g., \b will be a back
> space not a word boundary. So then the question arises
> again, why isn't '\\' a backslash? Just because?
> Why does it not get the "obvious" conversion?
'\\' *is* a backslash. That string, containing a single backslash, is then
processed by the re module, which sees the backslash, tries to interpret it
as the start of an escape, finds nothing after it, fails and barfs.
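A minimal sketch of that failure, assuming the replacement string is being
handed to re.sub() (the helper name try_sub and the sample text 'abc' are made
up for illustration):

```python
import re

def try_sub(repl):
    """Return the result of re.sub(), or the re.error the replacement raised."""
    try:
        return re.sub('a', repl, 'abc')
    except re.error as exc:
        return exc

# The literal '\\' is ONE character by the time re sees it: a lone backslash.
lone = try_sub('\\')

# The literal '\\\\' is TWO characters: backslash, backslash. re reads that
# pair as an escaped backslash and emits a single backslash.
doubled = try_sub('\\\\')

print(repr(lone))
print(repr(doubled))
```

The first call comes back as an error object, because re finds an escape with
nothing after it; the second comes back as a backslash followed by 'bc'.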
"re.compile('a\\nc')" passes a sequence of four characters to re.compile:
'a', '\', 'n' and 'c'. re.compile() then does its own interpretation:
'a' passes through as is, '\' flags an escape which combined with 'n'
produces the newline character (0x0a), and 'c' passes through as is.
"re.compile('a\nc')" by contrast passes a sequence of three characters to
re.compile: 'a', 0x0a and 'c'. re.compile() does its own interpretation,
which happens not to change any of the characters, resulting in the same
regular expression as before.
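The two cases above can be checked directly. This is only a sketch; the
variable names and the target string are chosen for illustration:

```python
import re

escaped = 'a\\nc'   # four characters: a, backslash, n, c
literal = 'a\nc'    # three characters: a, newline (0x0a), c

# The Python compiler has already finished with the literals at this point;
# len() shows what re.compile() will actually receive.
print(len(escaped), len(literal))

# A raw string literal is just another way to spell the four-character form.
same_spelling = (r'a\nc' == escaped)

# Both patterns match the three-character string a, newline, c.
target = 'a\nc'
m1 = re.compile(escaped).match(target)
m2 = re.compile(literal).match(target)
print(m1 is not None, m2 is not None)
```

The pattern strings differ in length, but after re has done its own escape
processing both match the same text.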
Your problem is that you are conflating the compile-time processing of
string literals with the run-time processing of strings specific to re.
--
Rhodri James *-* Wildebeeste Herder to the Masses