[Python-Dev] \u and \U escapes in raw unicode string literals
M.-A. Lemburg
mal at egenix.com
Sat May 12 01:30:52 CEST 2007
On 2007-05-12 00:48, Martin v. Löwis wrote:
>> Using double backslashes won't cause that reaction:
>>
>> os.stat("c:\\windows\\system32\\user32.dll")
>
> Please refer to the subject. We are talking about raw strings.
If you had left the context in place, the reason for my suggestion
would have been evident.
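For readers without the earlier context, here is a minimal sketch of the two spellings being compared. It assumes Python 3 semantics, where a raw literal does not interpret \u; the variable names are only illustrative.

```python
# Doubled backslashes in an ordinary literal ...
escaped = "c:\\windows\\system32\\user32.dll"

# ... and the same path written as a raw literal (Python 3 semantics).
raw = r"c:\windows\system32\user32.dll"

# Both spellings denote the same string.
same = escaped == raw
backslashes = raw.count("\\")
```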
>>> Windows path names are one of the two primary applications of raw
>>> strings (the other being regexes).
>> IMHO the primary use case are regexps
>
> It's not a matter of opinion. It's a statistical fact that these
> are the two cases where people use raw strings most.
Ah, statistics :-) It always depends on who you ask: a Windows
user will obviously have more use for the raw string use-case you
gave than a Unix user. At the end of the day, I still believe
that the regexp use-case is far more common than the Windows
path name one.
FWIW: Zope has 2 uses of raw strings for Windows path names (if I
counted correctly) and around 100 for regexps. Python itself
has maybe 10-20 Windows path name (and registry name) uses of
raw strings (in the msi lib and distutils) vs. around 300 uses
for regexps.
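A count like this could be approximated with the tokenize module. The sketch below is not the method used for the numbers above; the function name and the "drive letter plus backslash" heuristic for spotting path-like literals are assumptions made for illustration.

```python
import io
import re
import tokenize

# Assumed heuristics: a literal is "raw" if its prefix contains r/R, and
# "path-like" if its body starts with a drive letter, a colon and a backslash.
_PREFIX = re.compile(r"^([A-Za-z]*)['\"]")
_DRIVE_PATH = re.compile(r"^[A-Za-z]*['\"]+[A-Za-z]:\\")


def count_raw_literals(source):
    """Return (raw_total, path_like) for the string literals in source."""
    raw_total = path_like = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        prefix = _PREFIX.match(tok.string)
        if prefix and "r" in prefix.group(1).lower():
            raw_total += 1
            if _DRIVE_PATH.match(tok.string):
                path_like += 1
    return raw_total, path_like


# A tiny made-up sample: one regexp, one Windows path, one non-raw string.
sample = (
    'pattern = re.compile(r"\\d+")\n'
    'path = r"c:\\windows\\system32"\n'
    'plain = "not raw"\n'
)
counts = count_raw_literals(sample)
```

Running it over a source tree would mean reading each .py file and summing the pairs; telling regexps apart from other raw strings would need a further heuristic.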
>> and for those you'd
>> definitely want to be able to put Unicode characters into your
>> expressions.
>
> For regular expressions, you don't need them as part of the
> string literal syntax: The re parser itself could support \u,
> just like it supports \x today.
True and perhaps that's the right path to follow.
You'd still have the problem of writing Windows path names with
embedded Unicode characters, but I guess that's something we
can fix another day ;-)
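As a sketch of what parser-level support looks like: the re module in later Python 3 releases (3.3 and up) does accept \uXXXX and \UXXXXXXXX itself, alongside \x. This describes those later versions, not the state of things at the time of this thread.

```python
import re

# In Python 3 the raw literal keeps the backslash sequence untouched:
# six characters, backslash, 'u', and four hex digits.
literal = r"\u00e9"
literal_length = len(literal)

# The re parser (Python 3.3+) interprets \u00e9 itself, just as it does \xe9.
by_u = re.compile(r"caf\u00e9").search("un caf\u00e9 noir")
by_x = re.compile(r"caf\xe9").search("un caf\u00e9 noir")
```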
>> BTW, if you use ur"..." for your expressions today (which you should
>> if you parse text), then nothing will change when removing the
>> 'u' prefix in Py3k.
>
> How do you know? Py3k hasn't been released, yet.
Sorry, I wasn't clear: if the raw-unicode-escape codec continues
to work the way it does now, you won't run into trouble in Py3k.
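For reference, a small sketch of how the raw-unicode-escape codec behaves when called directly in Python 3: only \uXXXX and \UXXXXXXXX are interpreted, and every other backslash sequence is passed through unchanged. The sample bytes are made up.

```python
# Eleven bytes: c a f \ u 0 0 e 9 \ n
raw_bytes = rb"caf\u00e9\n"

decoded = raw_bytes.decode("raw_unicode_escape")

# \u00e9 became one character; the backslash and 'n' were left alone.
expected = "caf\u00e9" + "\\" + "n"
```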
[and later:]
>> BTW, there's an easy work-around for this special case:
>>
>> os.stat(os.path.join(r"c:\windows\system32", "user32.dll"))
>
> No matter what the decision is, there are always work-arounds.
> The question is what language suits the users most. Being
> able to specify characters by ordinal IMO has much less value
> than a consistent, concise definition of raw strings has.
I wonder how we managed to survive all these years with
the existing consistent and concise definition of the
raw-unicode-escape codec ;-)
There are two options:
* no one really uses Unicode raw strings nowadays
* none of the existing users has ever stumbled across the
  "problem case" that triggered all this
Either way, we're discussing a non-issue.
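To make the "problem case" concrete, here is a sketch that imitates it in Python 3 by running the literal's body through the raw-unicode-escape codec by hand. The helper name is hypothetical, and using the codec as a stand-in for how a ur"..." literal was processed is an assumption of this sketch. ntpath is used so that the join follows Windows rules on any platform.

```python
import ntpath


def decode_like_ur_literal(body):
    """Assumed stand-in for ur"..." processing: raw-unicode-escape decoding."""
    return body.decode("raw_unicode_escape")


# The backslash before "user32.dll" starts a \u escape with no hex digits.
try:
    decode_like_ur_literal(rb"c:\windows\system32\user32.dll")
    problem_case_failed = False
except UnicodeDecodeError:
    problem_case_failed = True

# The quoted work-around: keep the file name out of the raw literal.
directory = decode_like_ur_literal(rb"c:\windows\system32")
full_path = ntpath.join(directory, "user32.dll")
```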
>> > I think it's nice that you can use forward slashes on Windows -
>> > makes writing code that works in both worlds (Unix and Windows)
>> > a lot easier.
>
> But, as Thomas says: you can't. You may be able to do so
> when using the API directly, however, it fails if you
> pass the file name in a command line of some tool that
> takes /foo to mean a command line option "foo".
Strange. I've been doing exactly that for years. Maybe it's just
because I stick to common os module APIs. So far, I've never
run into any problem with it.
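A short sketch of what sticking to the os module APIs looks like in practice. It uses ntpath (the Windows flavour of os.path) and pathlib.PureWindowsPath so it runs on any platform; it only manipulates path strings and says nothing about how command line tools treat a leading slash.

```python
import ntpath
from pathlib import PureWindowsPath

forward = "c:/windows/system32/user32.dll"

# The Windows path functions accept either separator when splitting ...
head, tail = ntpath.split(forward)

# ... and normpath converts forward slashes to backslashes.
normalized = ntpath.normpath(forward)

# pathlib treats both spellings as the same Windows path.
same_path = PureWindowsPath(forward) == PureWindowsPath(normalized)
```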
--
Marc-Andre Lemburg
eGenix.com