On Thu, Aug 1, 2013 at 7:20 PM, Nick Coghlan wrote:
> I'd never even heard of code point labels before this thread, while the "U+" notation is incredibly common.
Nick, did you see this part: "A constructed code point label is distinguished from the designation of the code point itself (for example, “U+0009” or “U+FFFF”), which is also a unique identifier"?

The purpose of unicodedata.lookup() is to look up a Unicode code point by name, and "U+NNNN" is not a name - it is "the designation of the code point itself." There is no need to look up anything if you want to process an occasional s = "U+FFFF" string: chr(int(s[2:], 16)) will do the job.

The original proposal was to allow a \U+NNNN escape as a shortcut for \U0000NNNN. This is a clear readability improvement, while \N{U+001B}, for example, is not an improvement over \N{ESCAPE}. However, for more obscure control characters, \N{control-NNNN} may be clearer than any currently available spelling. For example, \N{control-001E} is easier to understand than \036, \x1e, \u001E, \N{RS} or even the most verbose \N{INFORMATION SEPARATOR TWO}.
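To make the name-versus-designation distinction concrete, here is a minimal sketch, assuming Python 3. The helper names char_from_designation and is_unicode_name are hypothetical and exist only for illustration; the only library calls used are chr(), int() and unicodedata.lookup().

```python
import unicodedata


def char_from_designation(s):
    """Turn a 'U+NNNN' designation into the character it designates.

    Hypothetical helper: plain hex arithmetic, no name lookup involved.
    """
    if not s.upper().startswith("U+"):
        raise ValueError("expected a designation of the form 'U+NNNN'")
    return chr(int(s[2:], 16))


def is_unicode_name(s):
    """Return True if unicodedata.lookup() accepts s as a character name.

    Hypothetical helper: lookup() raises KeyError for unknown names.
    """
    try:
        unicodedata.lookup(s)
    except KeyError:
        return False
    return True


# A designation is converted arithmetically ...
rs = char_from_designation("U+001E")

# ... and matches the existing numeric escape spellings of the same character.
same_character = rs == "\036" == "\x1e" == "\u001E"

# "U+FFFF" is a designation, not a name, so the name lookup rejects it.
designation_is_name = is_unicode_name("U+FFFF")
```

Here same_character evaluates to True and designation_is_name to False, which is the point above: the "U+NNNN" form needs no lookup table, only a hex conversion.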