[Python-Dev] unicode/string asymmetries

Jack Jansen jack@oratrix.nl
Thu, 10 Jan 2002 01:17:57 +0100


Recently, "M.-A. Lemburg" <mal@lemburg.com> said:
> How about this: we add a wchar_t codec to Python and the "eu#" parser
> marker. Then you could write:
> 
> 	wchar_t *value = NULL;
> 	int len = 0;
> 	if (!PyArg_ParseTuple(tuple, "eu#", "wchar_t", &value, &len))
> 		return NULL;

I like it! Even though I have to do the memory management myself (and
have to think of the error case), it at least looks reasonable. I'm
assuming here that if I pass a StringObject it will be decoded to
Unicode using the default encoding, and that the Unicode value will
then be converted to wchar_t and put in value, right? In other words,
passing "a.out" will do the same as passing u"a.out"...
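
To make the memory-management point concrete, here is a minimal sketch
in plain C of the "callee allocates, caller frees" pattern I am
assuming. The helper name copy_as_wchar is made up for illustration and
is not part of the Python C API. It uses the C library's mbstowcs()
instead of a Python codec, so it only approximates what the proposed
"eu#" marker would do:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not a Python API: convert a multibyte C string
 * into a freshly allocated wchar_t buffer.
 * Returns 0 on success and -1 on failure. On success the caller owns
 * *out and must release it with free(). */
int
copy_as_wchar(const char *s, wchar_t **out, size_t *len)
{
    size_t cap, n;
    wchar_t *buf;

    assert(s != NULL && out != NULL && len != NULL);

    /* Every wide character consumes at least one input byte, so
     * strlen(s) + 1 elements are always enough, terminator included. */
    cap = strlen(s) + 1;
    buf = malloc(cap * sizeof(wchar_t));
    if (buf == NULL)
        return -1;

    /* This always copies, even when the input is plain ASCII. */
    n = mbstowcs(buf, s, cap);
    if (n == (size_t)-1) {
        /* Error case: invalid multibyte sequence; nothing is leaked. */
        free(buf);
        return -1;
    }

    *out = buf;
    *len = n;
    return 0;
}
```

A caller would check the return value, use the buffer, and then call
free() on it. On failure nothing is handed back, so there is nothing to
clean up.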

One minor misgiving is that this call will *always* copy the string,
even if the internal representation of unicode objects is already
wchar_t. That's a bit of a nuisance, but we can try to fix that later.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -