[Python-Dev] \u and \U escapes in raw unicode string literals

M.-A. Lemburg mal at egenix.com
Fri May 11 13:27:26 CEST 2007

On 2007-05-11 13:05, Thomas Heller wrote:
> M.-A. Lemburg wrote:
>> On 2007-05-11 07:52, Martin v. Löwis wrote:
>>>> This is what prompted my question, actually: in Py3k, in the
>>>> str/unicode unification branch, r"\u1234" changes meaning: before the
>>>> unification, this was an 8-bit string, where the \u was not special,
>>>> but now it is a unicode string, where \u *is* special.
>>> That is true for non-raw strings also: the meaning of "\u1234" also
>>> changes.
>>> However, traditionally, there was *no* escaping mechanism in raw strings
>>> in Python, and I feel that this is a good principle, because it is
>>> easy to learn (if you leave out the detail that \ can't be the last
>>> character in a raw string - which should get fixed also, IMO). So I
>>> think in Py3k, "\u1234" should continue to be a string with 6
>>> characters. Otherwise, people will complain that
>>> os.stat(r"c:\windows\system32\user32.dll") fails. Telling them to write
>>> os.stat(r"c:\windows\system32\u005Cuser32.dll") will just cause puzzled
>>> faces.
>> Using double backslashes won't cause that reaction:
>> os.stat("c:\\windows\\system32\\user32.dll")
> Sure.  But I want to use raw strings for Windows path names; it's much easier
> to type.

But think of the price to pay if we disable the use of Unicode
escapes in raw strings. And all of this just because of one
special case: a file name that starts with a "u" and needs to
be referenced literally in a Python application together with
the path leading up to it.
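
For illustration, here is a small sketch of the two readings of a raw
literal being weighed in this thread. It assumes a current Python 3
interpreter, where raw literals leave \u untouched. The standard
raw_unicode_escape codec is used only to emulate a raw literal that
does interpret \u, and the variable names are made up for the example.

```python
# Assumption: run on a current Python 3 interpreter (not stated in the thread).
raw = r"\u1234"      # raw literal: backslash, 'u' and four digits
cooked = "\u1234"    # ordinary literal: the single character U+1234

# Emulate "raw string with Unicode escapes" via the raw_unicode_escape codec.
emulated = raw.encode("ascii").decode("raw_unicode_escape")

# Under that reading, the Windows path from the thread trips over "\user32",
# because "\u" must be followed by four hex digits.
path = r"c:\windows\system32\user32.dll"
try:
    path.encode("ascii").decode("raw_unicode_escape")
    path_ok = True
except UnicodeDecodeError:
    path_ok = False

print(len(raw), len(cooked), emulated == cooked, path_ok)
```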

BTW, there's an easy work-around for this special case:

os.stat(os.path.join(r"c:\windows\system32", "user32.dll"))
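
A minimal sketch of the work-around above. It uses ntpath (the Windows
flavour of os.path) instead of os.path so that it joins with
backslashes on any platform, and it skips the os.stat() call because
the file only exists on Windows. The variable names are illustrative.

```python
import ntpath  # Windows implementation of os.path, importable everywhere

# Keep the directory as a raw string and pass the "u..." file name as a
# separate component, so no backslash ever appears directly before the "u".
directory = r"c:\windows\system32"
full_path = ntpath.join(directory, "user32.dll")

print(full_path)
```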

>> Also note that Windows is smart enough nowadays to parse
>> the good old Unix forward slash:
>> os.stat("c:/windows/system32/user32.dll")
> In my opinion this is a Windows bug and not a feature.  Especially because there
> are Windows API functions (the shell functions, IIRC) that do NOT accept
> forward slashes.
> Would you say that *nix is dumb because it doesn't parse "\\usr\\include"?

Sorry, I wasn't trying to imply that Windows is/was a dumb system.

I think it's nice that you can use forward slashes on Windows -
makes writing code that works in both worlds (Unix and Windows)
a lot easier.
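
As a hedged illustration of the forward-slash point, the sketch below
checks that both spellings name the same Windows path. It relies on
ntpath and on pathlib.PureWindowsPath, the latter being a standard
library module that was added after this thread. Neither touches the
file system, so it says nothing about which Windows APIs accept
forward slashes.

```python
import ntpath
from pathlib import PureWindowsPath

forward = "c:/windows/system32/user32.dll"
backward = "c:\\windows\\system32\\user32.dll"

# PureWindowsPath treats "/" and "\" as equivalent separators.
same_path = PureWindowsPath(forward) == PureWindowsPath(backward)

# ntpath.normpath rewrites forward slashes to backslashes.
normalized = ntpath.normpath(forward)

print(same_path, normalized == backward)
```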

>>> Windows path names are one of the two primary applications of raw
>>> strings (the other being regexes).

Marc-Andre Lemburg

