[Python-ideas] Support Unicode code point notation

Chris “Kwpolska” Warrick kwpolska at gmail.com
Sun Jul 28 10:06:32 CEST 2013


A couple of clarifications:

On Sat, Jul 27, 2013 at 5:47 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> Aside: you keep writing H..HHHHHH for Unicode code points. Unicode code
> points go up to hex 10FFFF, so an absolute maximum of six digits, not seven
> or more as you keep writing (four times, not that I'm counting :-)

My fancy syntax meant “up to six hex digits”.  And 10FFFF is six digits long.
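
For reference, a quick sanity check of that limit, assuming Python 3.3 or later (where sys.maxunicode is always 0x10FFFF). This is only a sketch using existing built-ins, and the names MAX_CODE_POINT and max_hex are mine; it is not the proposed syntax:

```python
import sys

# The largest Unicode code point is U+10FFFF; Python 3.3+ exposes it
# as sys.maxunicode.
MAX_CODE_POINT = 0x10FFFF
assert sys.maxunicode == MAX_CODE_POINT

# Written in hex without a prefix, the maximum needs exactly six digits.
max_hex = format(MAX_CODE_POINT, "X")
print(max_hex, len(max_hex))

# chr() refuses anything past the maximum.
try:
    chr(MAX_CODE_POINT + 1)
except ValueError as exc:
    print("rejected:", exc)
```

So any notation with a variable number of digits never needs more than six of them.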

~~~

On Sun, Jul 28, 2013 at 1:14 AM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> The Ruby \U{...} syntax has the following advantages:

It’s \u{} (lowercase).  "\U{}" evaluates to "U{}", i.e. it does not work.

> * No fixed limit on number of digits

Are we still talking about the Ruby implementation?  It does reject
values that are too large:

irb(main):002:0> "\u{1234567}"
SyntaxError: (irb):2: invalid Unicode codepoint (too large)
"\u{1234567}"
    ^
        from /usr/bin/irb:12:in `<main>'
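
For comparison, Python's existing fixed-width \UXXXXXXXX escape enforces the same ceiling at compile time. A minimal sketch, assuming Python 3; escape_is_valid is a hypothetical helper written just for this check, not an existing API:

```python
# "\U0010FFFF" is the largest code point the fixed-width \U escape accepts.
top = "\U0010FFFF"
assert ord(top) == 0x10FFFF


def escape_is_valid(literal_source):
    """Return True if the given string-literal source text compiles."""
    try:
        compile(literal_source, "<escape-check>", "eval")
    except SyntaxError:
        return False
    return True


# Raw strings keep the backslashes, so compile() sees the escape itself.
in_range = escape_is_valid(r'"\U0010FFFF"')
too_large = escape_is_valid(r'"\U00110000"')
print(in_range, too_large)
```

The out-of-range literal is rejected with a SyntaxError, much like the Ruby one above.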

-- 
Chris “Kwpolska” Warrick <http://kwpolska.tk>
PGP: 5EAAEA16
stop html mail | always bottom-post | only UTF-8 makes sense

