[New-bugs-announce] [issue18614] Enhanced \N{} escapes for Unicode strings

Steven D'Aprano report at bugs.python.org
Thu Aug 1 15:54:05 CEST 2013

New submission from Steven D'Aprano:

As per the discussion here:


\N{} escapes should support the Unicode code point notation U+xxxx (where there are four, five or six hex digits after the U+).

E.g. '\N{U+03BB}' => 'λ'

unicodedata.lookup should also support such numeric names, e.g.:

unicodedata.lookup('U+03BB') => 'λ'

As '+' is otherwise prohibited in Unicode character names, there should never be ambiguity between 'U+xxxx' as a code point and an actual name, and a single lookup function can handle both.

(See http://www.unicode.org/versions/Unicode6.2.0/ch04.pdf#G39 for details on characters allowed in names.)
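To illustrate how a single lookup function could handle both forms, here is a minimal sketch. It is not the proposed implementation: the wrapper name `lookup`, the regular expression, and the decision to accept only an uppercase "U" followed by hex digits in either case are assumptions made for this example. Only `unicodedata.lookup` and `chr` are existing library calls.

```python
import re
import unicodedata

# Assumption: exactly four to six hex digits after a literal "U+".
_CODE_POINT = re.compile(r'U\+([0-9A-Fa-f]{4,6})')

def lookup(name):
    """Hypothetical wrapper: resolve 'U+xxxx' notation or a character name."""
    match = _CODE_POINT.fullmatch(name)
    if match:
        value = int(match.group(1), 16)
        # Six hex digits can exceed the Unicode range, so check explicitly.
        if value > 0x10FFFF:
            raise KeyError('code point out of range: ' + name)
        return chr(value)
    # '+' cannot occur in a real character name, so anything else is
    # passed to the existing name lookup, which raises KeyError if unknown.
    return unicodedata.lookup(name)
```

With this sketch, lookup('U+03BB') and lookup('GREEK SMALL LETTER LAMDA') both return 'λ'.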

Also add a function for the reverse operation, mapping a character to its code point notation, e.g.:

unicodedata.codepoint('λ') => 'U+03BB'

def codepoint(c):
    return 'U+{:04X}'.format(ord(c))
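As a quick check of the formatting, the sketch below repeats the function and adds a round trip. The helper `from_codepoint` is hypothetical and exists only for this example; the `{:04X}` format pads to at least four digits, so characters outside the Basic Multilingual Plane come out with five or six.

```python
def codepoint(c):
    # Same formatting as above: at least four uppercase hex digits.
    return 'U+{:04X}'.format(ord(c))

def from_codepoint(notation):
    # Hypothetical inverse used only to demonstrate the round trip.
    if not notation.startswith('U+'):
        raise ValueError('expected U+xxxx notation: ' + notation)
    return chr(int(notation[2:], 16))

# Round trip for a BMP letter, an ASCII letter, and a supplementary character.
for ch in ('\u03bb', 'A', '\U0001F600'):
    assert from_codepoint(codepoint(ch)) == ch
```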

components: Unicode
messages: 194075
nosy: ezio.melotti, stevenjd
priority: normal
severity: normal
status: open
title: Enhanced \N{} escapes for Unicode strings
type: enhancement
versions: Python 3.4

Python tracker <report at bugs.python.org>
