[I18n-sig] Re: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide"
Mon, 02 Jul 2001 07:25:55 -0700
"M.-A. Lemburg" wrote:
> > Character
> > Used by itself, means the addressable units of a Python
> > Unicode string.
> Please add: also known as "code unit".
I'm not entirely comfortable with that. As you yourself pointed out, the
same Python Unicode object can be interpreted as either a series of
single-width code points *or* as a UTF-16 string where the characters
are code units. You could also interpret it as a BASE64'd region or an
XML document... It all depends on how you look at it.
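To make the "it depends on how you look at it" point concrete, here is a minimal sketch. It assumes a present-day Python 3 interpreter (which this thread predates), and the sample values are made up for illustration. The same three 16-bit values are counted once as independent units and once under the UTF-16 convention:

```python
import struct

# Three 16-bit values; the last two happen to form a surrogate pair.
# (Illustrative sample data, not taken from the PEP.)
units = [0x0041, 0xD83D, 0xDE00]

# View 1: three independent 16-bit items, no convention applied.
count_as_units = len(units)

# View 2: the same bytes read under the UTF-16 convention.
raw = struct.pack("<3H", *units)
as_utf16 = raw.decode("utf-16-le")
count_as_utf16 = len(as_utf16)

print(count_as_units, count_as_utf16)
```

Nothing in the data itself changes between the two views; only the convention applied to it does.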
> > Surrogate pair
> > Two physical characters that represent a single logical
> Eeek... two code units (or have you ever seen a physical character
> walking around ;-)
No, that's sort of my point. The user can decide to adopt the convention
of looking at the two characters as code units or they can ignore that
interpretation and look at them as two code points. It's all relative,
man. Dig it? That's why I use the word "convention" below:
> > character. Part of a convention for representing 32-bit
> > code points in terms of two 16-bit code points.
"Surrogates are all in your head. Python doesn't know or care about
I'll change this to:
Two Python Unicode characters that represent a single logical
Unicode code point. Part of a convention for representing
32-bit code points in terms of two 16-bit code points. Python
has limited support for reading, writing and constructing strings
that use this convention (described below). Otherwise Python
ignores the convention.
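For readers who want the convention spelled out, here is a rough sketch of the arithmetic as the Unicode standard defines it for UTF-16. The function names `to_surrogates` and `from_surrogates` are hypothetical helpers for illustration, not anything in the PEP or in Python itself:

```python
def to_surrogates(cp):
    """Split a code point above 0xFFFF into a (high, low) pair of 16-bit values."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is outside the range surrogates can express")
    offset = cp - 0x10000              # 20 significant bits remain
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Combine a (high, low) pair back into the single code point it stands for."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


pair = to_surrogates(0x1F600)
print([hex(v) for v in pair], hex(from_surrogates(*pair)))
```

Code that never calls helpers like these simply sees two ordinary 16-bit values, which is the sense in which the convention is optional.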
> No need to pass this information to the codec: simply write
> a new one and give it a clear name, e.g. "ucs-2" will generate
> errors while "utf-16-le" converts them to surrogates.
That's a good point, but what if I want a UTF-8 codec that doesn't
generate surrogates? Or even a UCS4 one?
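As a sketch of the two behaviours being contrasted, assuming a present-day Python 3 interpreter: the built-in "utf-16-le" codec emits a surrogate pair for a code point above 0xFFFF, while a strict UCS-2 style encoder would refuse it. Python ships no "ucs-2" codec, so `encode_ucs2_le` below is a hypothetical stand-in written only for illustration:

```python
def encode_ucs2_le(text):
    """Hypothetical strict encoder: one 16-bit unit per character, no surrogates."""
    for i, ch in enumerate(text):
        if ord(ch) > 0xFFFF:
            # Refuse rather than silently falling back to a surrogate pair.
            raise UnicodeEncodeError(
                "ucs-2", text, i, i + 1, "code point does not fit in 16 bits"
            )
    # Every character fits in 16 bits, so this matches UTF-16-LE byte for byte.
    return text.encode("utf-16-le")


wide = "\U00010000"

# The stock codec converts the wide character into two 16-bit code units.
utf16_bytes = wide.encode("utf-16-le")

# The strict variant raises instead.
try:
    encode_ucs2_le(wide)
    strict_result = "encoded"
except UnicodeEncodeError:
    strict_result = "error"

print(utf16_bytes, strict_result)
```

The open question above is whether that choice should live in the codec's name, as suggested, or be something the caller can ask for with any encoding.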
> Plus perhaps the Mark Davis paper at:
> > Copyright
> > This document has been placed in the public domain.
> Good work, Paul !
Thanks for your help. You did help me to clarify many things even though
I argued with you as I was doing it.