[Python-Dev] Unicode support in getargs.c
Jack Jansen
Jack.Jansen@cwi.nl
Wed, 2 Jan 2002 22:46:46 +0100 (CET)
On Wed, 2 Jan 2002, Martin v. Loewis wrote:
> > Jack will probably also need a way to say "decode this encoded
> > object into Unicode using the encoding xyz". Something like the
> > Unicode version of "es#". How about "eu#" which then passes through
> > Unicode as-is while decoding all other objects according to the
> > given encoding ?!
>
> I'd like to see the requirements, in terms of real-world problems,
> before considering any extensions.

I have a number of MacOSX APIs that expect Unicode buffers, passed as
"long count, UniChar *buffer". I have the machinery in bgen to generate
code for this, iff "u#" (or something else) would work the same as "s#",
i.e. it returns you a pointer and a size, and it would work equally well
for unicode objects as for classic strings (after conversion).
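
For illustration, the "count plus buffer" calling convention described above could look roughly like the following sketch. It is self-contained C and does not use the Python C API. Everything named here is an assumption made for the example: the 16-bit UniChar typedef, the stand-in routine count_non_ascii(), and the helper widen_latin1(), which models the "after conversion" step for a classic 8-bit string by treating it as Latin-1 (a real converter would use whatever encoding applies).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: UniChar is a 16-bit code unit. */
typedef unsigned short UniChar;

/* Hypothetical stand-in for an API that takes "long count, UniChar *buffer".
   It counts the code units above 0x7F so that the call has a visible result. */
long count_non_ascii(long count, const UniChar *buffer)
{
    long n = 0;
    for (long i = 0; i < count; i++) {
        if (buffer[i] > 0x7F)
            n++;
    }
    return n;
}

/* Hypothetical conversion step for a classic 8-bit string: widen each byte
   to one UniChar (i.e. assume Latin-1). Returns a malloc'ed buffer that the
   caller must free, or NULL if allocation fails. *count receives the number
   of code units, not counting the terminating 0. */
UniChar *widen_latin1(const char *src, long *count)
{
    assert(src != NULL && count != NULL);
    size_t len = strlen(src);
    UniChar *buf = malloc((len + 1) * sizeof *buf);
    if (buf == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        buf[i] = (UniChar)(unsigned char)src[i];
    buf[len] = 0;
    *count = (long)len;
    return buf;
}
```

A caller would convert once, pass the resulting pair as count_non_ascii(n, buf), and free the buffer afterwards. A format code that behaves like "s#" would hand generated code exactly such a pointer-and-size pair.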

The trick of using O with PyUnicode_FromObject() may work for me:
my code is generated, so a little extra glue code doesn't really matter.
But as a general solution it doesn't look right: "How do I call a C
routine with a string parameter?" "Use the 's' format and you get the
string pointer to pass." "How do I call a C routine with a unicode string
parameter?" "Use O and PyUnicode_FromObject() and PyUnicode_AsUnicode() and
make sure you get all your decrefs right and...".

The "es#" format is a very strange beast, and a similar "eu#" would help me a
little, but it has some serious drawbacks. Aside from it being completely
different from the other converters (being a prefix operator instead of a
postfix one, and having a value-return argument), I would also have to
pre-allocate the buffer, and that sort of defeats the purpose.
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@cwi.nl | ++++ if you agree copy these lines to your sig ++++
http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm