[Python-3000] How will unicode get used?

Josiah Carlson jcarlson at uci.edu
Sun Sep 24 23:54:21 CEST 2006


Fredrik Lundh <fredrik at pythonware.com> wrote:
> Martin v. Löwis wrote:
> > I think supporting multiple representations at run-time would really
> > be terrible. Any API of the "give me the data" kind would either have
> > to expose the choice of representations, or perform a copy.
> 
> Unless you can guarantee that *all* external API:s that a Python 
> extension might want to use will use exactly the same internal 
> representation as Python, that's something that we have to deal with anyway.

I think Martin was referring to, for example, choosing an internal
Latin-1, UCS-2, or UCS-4 representation based on the code points of the
string.
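
To make that concrete, here is a rough sketch of the kind of per-string
selection I understand to be under discussion. The function name
pick_width is made up for illustration; it is not an existing or proposed
API, and a real implementation would live in C.

```python
def pick_width(s):
    """Return the bytes per element (1, 2, or 4) needed to store s
    in a fixed-width form. Hypothetical helper, illustration only."""
    top = max(map(ord, s), default=0)  # highest code point; 0 for ""
    if top < 0x100:
        return 1   # every code point fits in Latin-1
    if top < 0x10000:
        return 2   # every code point fits in UCS-2
    return 4       # at least one non-BMP code point: UCS-4
```

So pick_width("abc") gives 1, while a single character outside the BMP
anywhere in the string pushes the whole string to 4 bytes per element.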

I stated earlier that with a buffer interface that returned the *size*
of elements, users could program against the internal representation, but
I agree that it would be error-prone.
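
A minimal sketch of what I mean, written in Python for brevity. Both
raw_buffer and code_points are hypothetical names, and the little-endian
layout is an assumption; the point is only that every consumer has to
carry the element size around and use it correctly.

```python
def raw_buffer(s):
    """Hypothetical 'give me the data' call: returns (bytes, itemsize)."""
    top = max(map(ord, s), default=0)
    if top < 0x100:
        return s.encode("latin-1"), 1
    if top < 0x10000:
        # For BMP-only text, UTF-16-LE bytes are the same as UCS-2 LE.
        return s.encode("utf-16-le", "surrogatepass"), 2
    return s.encode("utf-32-le", "surrogatepass"), 4


def code_points(data, itemsize):
    """Consumer side: decode elements using the reported itemsize."""
    return [int.from_bytes(data[i:i + itemsize], "little")
            for i in range(0, len(data), itemsize)]
```

A consumer that hard-codes an itemsize of 2 works until the first string
that happens to be stored with 1 or 4 bytes per element, which is the
error-prone part.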


What if we just chose UTF-16 as the internal representation, with no
default system encoding version attached (as there is right now)?
Extension writers could write for the single representation, and convert
if it isn't what they want (and where is the default system encoding ever
what is desired?)
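
As a rough illustration of what extension writers would see, and of the
"convert if it isn't what they want" step, assuming a little-endian
layout. The helper names utf16_units and utf16_to_utf8 are made up for
this example.

```python
def utf16_units(s):
    """Return the 16-bit code units of s as stored in UTF-16."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def utf16_to_utf8(data):
    """Convert a UTF-16-LE byte buffer to UTF-8 (makes a copy)."""
    return data.decode("utf-16-le").encode("utf-8")
```

BMP text is one unit per character, while U+1D11E comes back as the
surrogate pair 0xD834, 0xDD1E, so code written against this
representation still has to know about surrogates.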


 - Josiah


