[Python-ideas] Support Unicode code point notation

Chris Angelico rosuav at gmail.com
Sat Jul 27 18:03:22 CEST 2013


On Sat, Jul 27, 2013 at 4:47 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> Unicode's notation is nice and simple. If we had it first, would we prefer
> \uxxxx and \U00xxxxxx over it? I don't think so.

Almost certainly not. Like I said, I think your idea is great *in a
vacuum*. Obviously the removal of the current notations is out of the
question, which means that this is yet another way to specify a
codepoint; and it's one that most programmers won't be looking for. (I
stand corrected, though: I had thought that there were *no* other
languages using this notation. Of course, this is a silly thought.
There is almost nothing that hasn't already been done, somewhere.)
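
For reference, here is a rough sketch of the spellings Python 3 already
accepts for a single code point, next to a helper that reads the
Unicode-style U+XXXX form out of ordinary text. The name from_uplus is
hypothetical; it is not part of Python or of the proposal, and it only
illustrates what one more notation would have to do.

```python
import re

# Existing ways to spell U+20AC (EURO SIGN) in Python 3 source.
existing = [
    "\u20ac",         # 4-hex-digit escape
    "\U000020ac",     # 8-hex-digit escape
    "\N{EURO SIGN}",  # lookup by Unicode character name
    chr(0x20AC),      # built from the integer code point
]

def from_uplus(text):
    """Hypothetical helper: replace U+XXXX tokens with the characters they name.

    Accepts 4 to 6 hex digits. The match is greedy, so hex-looking text
    directly after a token is swallowed; chr() raises ValueError for
    values above 0x10FFFF.
    """
    return re.sub(
        r"U\+([0-9A-Fa-f]{4,6})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )
```

All four entries in existing compare equal, which is the point: there
are already several spellings for the same character.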

If Python had supported this notation from the beginning of Unicode
strings, or at least since 3.0, then adding \uxxxx would have been
purely a sop to C/Java/etc. programmers, and it would likely have
gone nowhere. But how much value is gained by creating a new syntax
that Python programmers now have to understand in addition to the
existing ones? Consistency across languages is fairly important; have
you ever used \123 notation in a BIND file? (It's decimal there, not
octal.)

http://rosuav.blogspot.com/2012/12/i-want-my-octal.html

Maybe Python will start a new trend, and \U+1234 will become the new
convention. Maybe that's a good thing. But how beneficial will it be,
and how complicating?

I'm weakening my stance to -0.

ChrisA

