[Python-Dev] \u and \U escapes in raw unicode string literals
M.-A. Lemburg
mal at egenix.com
Fri May 11 13:27:26 CEST 2007
On 2007-05-11 13:05, Thomas Heller wrote:
> M.-A. Lemburg schrieb:
>> On 2007-05-11 07:52, Martin v. Löwis wrote:
>>>> This is what prompted my question, actually: in Py3k, in the
>>>> str/unicode unification branch, r"\u1234" changes meaning: before the
>>>> unification, this was an 8-bit string, where the \u was not special,
>>>> but now it is a unicode string, where \u *is* special.
>>> That is true for non-raw strings also: the meaning of "\u1234" also
>>> changes.
>>>
>>> However, traditionally, there was *no* escaping mechanism in raw strings
>>> in Python, and I feel that this is a good principle, because it is
>>> easy to learn (if you leave out the detail that \ can't be the last
>>> character in a raw string - which should get fixed also, IMO). So I
>>> think in Py3k, r"\u1234" should continue to be a string with 6
>>> characters. Otherwise, people will complain that
>>> os.stat(r"c:\windows\system32\user32.dll") fails. Telling them to write
>>> os.stat(r"c:\windows\system32\u005Cuser32.dll") will just cause puzzled
>>> faces.
>> Using double backslashes won't cause that reaction:
>>
>> os.stat("c:\\windows\\system32\\user32.dll")
>
> Sure. But I want to use raw strings for Windows path names; it's much easier
> to type.

But think of the price to pay if we disable use of Unicode
escapes in raw strings. And all of this just because of the
one special case: having a file name that starts with a U
and needs to be referenced literally in a Python application
together with a path leading up to it.

BTW, there's an easy work-around for this special case:

os.stat(os.path.join(r"c:\windows\system32", "user32.dll"))

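
To illustrate the work-around, here is a minimal sketch. It assumes the standard-library ntpath module (the Windows flavour of os.path) so that it behaves the same on any platform, and it leaves out the os.stat() call so that it does not depend on the file existing. The helper name system_dll_path is hypothetical, chosen only for this example. Because the file name lives in its own literal, no backslash is ever written directly in front of the "u" in the source:

```python
import ntpath  # Windows flavour of os.path; importable on every platform


def system_dll_path(name, directory=r"c:\windows\system32"):
    """Join a raw-string directory and a plain file name (hypothetical helper)."""
    # The separator is supplied by join(), so the source never contains
    # a backslash immediately followed by the "u" of the file name.
    return ntpath.join(directory, name)


path = system_dll_path("user32.dll")
print(path)
```

The joined result is the same string as the double-backslash spelling "c:\\windows\\system32\\user32.dll".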
>> Also note that Windows is smart enough nowadays to parse
>> the good old Unix forward slash:
>>
>> os.stat("c:/windows/system32/user32.dll")
>
> In my opinion this is a Windows bug and not a feature. Especially because there
> are Windows api functions (the shell functions, IIRC) that do NOT accept
> forward slashes.
>
> Would you say that *nix is dumb because it doesn't parse "\\usr\\include"?
Sorry, I wasn't trying to imply that Windows is/was a dumb system.
I think it's nice that you can use forward slashes on Windows -
makes writing code that works in both worlds (Unix and Windows)
a lot easier.
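
As a rough sketch of why the forward-slash spelling is convenient: the standard-library ntpath module treats both separators as equivalent when normalizing a path. The variable names below are only illustrative, and this shows Python's path handling alone; it says nothing about the Windows API functions that reject forward slashes.

```python
import ntpath  # Windows path rules, usable from Unix as well

forward = "c:/windows/system32/user32.dll"
backward = "c:\\windows\\system32\\user32.dll"

# normpath() converts forward slashes to backslashes under Windows rules,
# so both spellings normalize to the same path.
normalized = ntpath.normpath(forward)
same = normalized == ntpath.normpath(backward)
print(normalized, same)
```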
>>> Windows path names are one of the two primary applications of raw
>>> strings (the other being regexes).
--
Marc-Andre Lemburg
eGenix.com