[Python-Dev] unicode/string asymmetries

Martin v. Loewis martin@v.loewis.de
Tue, 8 Jan 2002 23:17:53 +0100


> > > 1. No support for unicode in the struct and array modules.
> > > Is this an oversight?
> > 
> > I'd call it intentional. What exactly would you like to happen?
> 
> I would like to create struct's containing unicode characters
> (be gentle with me, maybe I mean wide characters, or mbcs, but I'm really
> not sure)

Well, that is precisely the problem: When putting a Unicode object
into a C structure, there are too many alternatives to pick a sensible
default. It is not even clear what a "wide character" is: it might be a
value of wchar_t, or it might be a value of Py_UNICODE (those differ
on Unix, in the default installation). 

For "MBCS", the most reasonable default might be "utf-8", since it is
capable of encoding all characters. On Windows, "mbcs" is also a good
choice, since it is the encoding that all the character APIs use.
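
To illustrate why there is no obvious default, here is a minimal sketch of
what a caller can do instead: encode the Unicode object explicitly and hand
the resulting bytes to struct. The sample string, the 2-byte length prefix,
and the helper names pack_text/unpack_text are assumptions made for this
example only; they are not part of the struct module.

```python
import struct

def pack_text(text, encoding):
    """Encode text explicitly, then pack it with a 2-byte length prefix.

    The length-prefixed layout is an assumption for illustration; a real
    C structure might use a fixed-size or NUL-terminated field instead.
    """
    raw = text.encode(encoding)
    # "<" = little-endian, no padding; "H" = unsigned 16-bit byte count
    return struct.pack("<H%ds" % len(raw), len(raw), raw)

def unpack_text(buf, encoding):
    """Reverse pack_text: read the byte count, then decode the payload."""
    (n,) = struct.unpack_from("<H", buf, 0)
    (raw,) = struct.unpack_from("<%ds" % n, buf, 2)
    return raw.decode(encoding)

text = "Gr\u00fc\u00dfe"  # five characters, two of them non-ASCII

# The same text yields different byte layouts depending on the encoding,
# which is why the caller has to make the choice.
packed_utf8 = pack_text(text, "utf-8")
packed_utf16 = pack_text(text, "utf-16-le")

print(len(packed_utf8), len(packed_utf16))
print(unpack_text(packed_utf8, "utf-8") == text)
```

On Windows one could pass "mbcs" as the encoding in the same way; that codec
is not available on other platforms.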

Why are you asking? Do you have a specific implementation in mind, or
are you just worried that Unicode objects cannot be put into
structures? Don't worry, file objects cannot be put into structures,
either :-)

Regards,
Martin