[Python-Dev] The C API and wide unicode support
Michael Hudson
mwh@python.net
11 Jul 2002 13:01:46 +0100
"M.-A. Lemburg" <mal@lemburg.com> writes:
> > Prediction: this is going to cause pain. For instance, if this user
> > decides that he wants to upgrade to 2.2.1, he might download Sean's
> > RPMs from python.org which are narrow unicode builds -- and then his
> > extensions will break. The problem here is that the kind of users
> > this is going to trouble are exactly the users who will not know
> > what's going on.
>
> It's a pain, yes, but still better than having seg faults
> due to memory corruption afterwards.
Probably true. At least the tracebacks make the problem obvious.
> > We can't prevent this sort of thing totally, but I think it should be
> > possible to carry out simple unicode manipulations (like this example
> > of returning a buffer) without incurring this kind of binary
> > compatibility worry. Maybe a "safe" api, plastered with warning signs
> > in the docs about poking into the internal structure of the objects.
>
> Perhaps we need an additional abstract API PyObject_UnicodeEx()
> which provides a way to additionally define the encoding to assume
> for decoding string objects ? (PyObject_Unicode() always assumes
> the default encoding)
That would be nice, yes. Beats digging "unicode" out of
__builtin__...
> > [*] actually, I think pygame might break with a wide unicode build.
>
> Why's that ?
Oh, the obvious thing: assuming sizeof(Py_UNICODE) == 2; or rather
assuming that Python's idea of what a unicode buffer is is the same as
SDL's idea (which I can't find written down anywhere, but I assume it's
the same kind of UCS-2 thing narrow builds use).
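
A rough sketch of that mismatch, in plain modern Python rather than
real pygame or SDL code. The helper name as_ucs2_units is made up for
illustration, and the utf-16-le and utf-32-le encodings merely stand in
for what a narrow and a wide Py_UNICODE buffer would look like on a
little-endian machine; that is an assumption, not a description of the
actual buffers involved.

```python
import struct

def as_ucs2_units(buf):
    """Read a byte buffer the way a consumer expecting 16-bit units would.

    Hypothetical helper: it stands in for code that assumes
    sizeof(Py_UNICODE) == 2.
    """
    count = len(buf) // 2
    return list(struct.unpack("<%dH" % count, buf))

text = "hi"

# Assumption: 2 bytes per character, roughly what a narrow build holds.
narrow_buffer = text.encode("utf-16-le")

# Assumption: 4 bytes per character, roughly what a wide build holds.
wide_buffer = text.encode("utf-32-le")

narrow_units = as_ucs2_units(narrow_buffer)
wide_units = as_ucs2_units(wide_buffer)

print(narrow_units)  # two units, one per character
print(wide_units)    # four units: each character is followed by a zero
```

Read as 16-bit units, the wide buffer yields a spurious zero after each
character, so a consumer that treats zero as a terminator would stop
after the first one.
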
So, I retract my complaint, and propose to write some docs on the
subject.
Cheers,
M.
--
Two things I learned for sure during a particularly intense acid
trip in my own lost youth: (1) everything is a trivial special case
of something else; and, (2) death is a bunch of blue spheres.
-- Tim Peters, 1 May 1998