[Python-Dev] PEP-393/PEP-3118: unicode format specifiers

Nick Coghlan ncoghlan at gmail.com
Wed Mar 7 12:40:19 CET 2012


On Wed, Mar 7, 2012 at 8:50 PM, Stefan Krah <stefan at bytereef.org> wrote:
> *If* the arrays that Victor mentioned give one character per array location,
> then memoryview(str) could be used for zero-copy slicing etc.

A slight tangent, but it's worth sticking to the term "code point"
when talking about what Unicode strings contain. Even in UCS-4, a
full character may be expressed as multiple code points. To be
honest, I still don't understand exactly how code points are composed
into graphemes and characters and mapped to glyphs for display; I just
know the mapping is a lot more complicated than the one-to-one
correspondence implied by referring to code points as characters.
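
As a concrete illustration of the distinction, here is a minimal sketch
using only the standard unicodedata module. The variable names are just
for illustration. It shows one user-perceived character ("e" with an
acute accent) stored as two code points, and the same character stored
as one code point after NFC normalization:

```python
import unicodedata

# One user-perceived character written as two code points:
# U+0065 LATIN SMALL LETTER E followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "e\u0301"

# NFC normalization composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# len() counts code points, not characters as a reader perceives them.
n_decomposed = len(decomposed)
n_composed = len(composed)

# The two spellings render the same but do not compare equal as strings.
same_string = decomposed == composed

# Names of the individual code points in the decomposed form.
names = [unicodedata.name(cp) for cp in decomposed]

print(n_decomposed, n_composed, same_string, names)
```

So indexing or slicing by array location gives you code points, and a
slice boundary can fall between a base letter and its combining mark.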

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

