[Python-Dev] \u and \U escapes in raw unicode string literals

M.-A. Lemburg mal at egenix.com
Sun May 13 14:25:01 CEST 2007

On 2007-05-12 02:42, Andrew McNabb wrote:
> On Sat, May 12, 2007 at 01:30:52AM +0200, M.-A. Lemburg wrote:
>> I wonder how we managed to survive all these years with
>> the existing consistent and concise definition of the
>> raw-unicode-escape codec ;-)
>> There are two options:
>>  * no one really uses Unicode raw strings nowadays
>>  * none of the existing users has ever stumbled across the
>>    "problem case" that triggered all this
>> Either way, we're discussing a non-issue.
> Sure, it's a non-issue for Python 2.x.  However, when Python 3 comes
> along, and all strings are Unicode, there will likely be a lot more
> users stumbling into the problem case.

In the first case, changing the codec won't affect much code when
ported to Py3k.

In the second case, a change to the codec is not necessary.

Please also consider the following:

* without the Unicode escapes, the only way to put non-ASCII
  code points into a raw Unicode string is via a source code encoding
  of, say, UTF-8 or UTF-16, pretty much defeating the original
  requirement of writing ASCII code only

* non-ASCII code points in text are not uncommon: they occur
  in most European scripts, all Asian scripts,
  many scientific texts and also in texts meant for the web
  (just have a look at the HTML entities, or think of Word
  exports using quotes)

* adding Unicode escapes to the re module will break code
  already using "...\u..." in the regular expressions for
  other purposes; writing conversion tools that detect this
  usage is going to be hard

* OTOH, writing conversion tools that simply work on string
  literals in general is easy
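
To make the codec behaviour under discussion concrete, here is a minimal
sketch. It assumes a Python 3 interpreter, where the codec is exposed as
"raw_unicode_escape"; the sample byte strings and variable names are made
up for illustration and do not come from any real code base.

```python
# Sketch assuming Python 3; all sample data below is hypothetical.

# ASCII-only bytes containing \u and \U escapes plus an ordinary backslash.
ascii_source = b"price: \\u20ac, emoji: \\U0001F600, path: C:\\new"

# \uXXXX and \UXXXXXXXX are turned into code points; every other
# backslash sequence (here the "\n" in "C:\new") is kept literally.
decoded = ascii_source.decode("raw_unicode_escape")

# A "u" preceded by an even number of backslashes is not treated as an escape.
untouched = b"\\\\u20ac".decode("raw_unicode_escape")

# Encoding writes code points above U+00FF back as \uXXXX escapes.
encoded = "\u20ac".encode("raw_unicode_escape")

# For comparison: a Python 3 raw string literal does not interpret \u at all.
raw_literal = r"\u20ac"

print(repr(decoded))
print(repr(untouched))
print(repr(encoded))
print(len(raw_literal))
```

Under these assumptions, the codec still gives ASCII-only source a way to
carry non-ASCII code points, while the raw literal itself keeps the six
characters backslash, "u", "2", "0", "a", "c".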

Marc-Andre Lemburg

