When does the escape character work within raw strings?
steve at REMOVE-THIS-cybersource.com.au
Fri May 22 17:29:16 CEST 2009
On Fri, 22 May 2009 07:47:49 -0700, walterbyrd wrote:
> On May 21, 9:44 pm, "Rhodri James" <rho... at wildebst.demon.co.uk> wrote:
>> Escaping the delimiting quote is the *one* time backslashes have a
>> special meaning in raw string literals.
> If that were true, then wouldn't r'\b' be treated as two characters?
>> This calls re.sub with a pattern string object that contains two
>> characters, a backslash followed by an 'n'. This combination *does*
>> have a special meaning to the sub function, which does its own
>> translation of the pattern into a single newline character.
> So when do I know when a raw string is treated as a raw string, and when
> it's not?
You have misunderstood. All strings are strings, but there are different
ways to build a string. Raw strings are not different from ordinary
strings, they're just a different way to *build* an ordinary string.
Here are four ways to make the same string, a backslash followed by a
lowercase b:
"\\b" # use an ordinary string, and escape the backslash
chr(92)+"b" # use the chr() function
"\x5cb" # use a hex escape
r"\b" # use a raw string, no escaping needed
The results you get from all of those (and many, many more!) are equal
strings, indistinguishable once built. They're just written differently
as source code.
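A quick way to check this yourself, as a minimal sketch (the names
candidates and all_equal are only for illustration): build the string all
four ways and compare lengths and character codes.

```python
# Four spellings of the same two-character string: a backslash, then "b".
candidates = [
    "\\b",          # ordinary literal with the backslash escaped
    chr(92) + "b",  # built from the character code for backslash
    "\x5cb",        # hex escape for backslash, followed by "b"
    r"\b",          # raw literal, no escaping needed
]

# Show each string's repr, length, and character codes.
for s in candidates:
    print(repr(s), len(s), [ord(c) for c in s])

# Every spelling compares equal to the first one.
all_equal = all(s == candidates[0] for s in candidates)
print(all_equal)
```

Each line of output should show a length of 2 and the codes 92 and 98,
whichever way the literal was written.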
Now, in regular expressions, the RE engine expects to see special codes
inside the string that have special meanings. For example, backslash
followed by lowercase b has a special meaning. So to create a string
containing that regex, you can use any of the above (or any of the
others). The RE engine doesn't know, and can't know, how you generated
the regex. All it sees is a string containing a backslash followed by
a lowercase b.
But if you forget that Python uses backslash escapes in strings, and just
write "\b", then the compiler creates the string chr(8) (backspace), which has
no special meaning to the RE engine.
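Here is a small sketch of both cases. The sample text and the variable
names are made up for illustration. The first three patterns all hand the
RE engine a backslash followed by b, so they behave identically. The last
one has already been turned into backspace characters by the compiler.

```python
import re

text = "a cat, a concat"

# Each pattern reaches the RE engine as backslash + "b" around "cat",
# so the engine sees the same thing no matter how the literal was written.
patterns = [
    r"\bcat\b",                           # raw string
    "\\bcat\\b",                          # escaped backslashes
    chr(92) + "bcat" + chr(92) + "b",     # built with chr()
]
matches = [re.findall(p, text) for p in patterns]
print(matches)

# Forgetting the escaping: Python converts "\b" to chr(8) (backspace)
# before the RE engine ever sees the pattern.
wrong = "\bcat\b"
print(wrong == chr(8) + "cat" + chr(8))
print(re.findall(wrong, text))
```

The three equivalent patterns each find the standalone word "cat", while
the unescaped version looks for literal backspace characters and finds
nothing in this text.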