Re: [Python-Dev] unicode/string asymmetries

9 Jan 2002

      ...
jack wrote:
...
...
struct.pack("32s", wu(u"VS_VERSION_INFO"))
Why would you have to specify the encoding if what you want is the normal,
standard encoding?
because there is no such thing as a "normal, standard
encoding" for a unicode character, just like there's no
"normal, standard encoding" for an integer (big endian,
little endian?), a floating point number (ieee, vax, etc),
a screen coordinate, etc.
What I call the "normal, standard encoding" here is what the C library
supports. Your analogy of integers and floats is exactly the right one: even
though there are many ways to represent an integer, what you get back from
PyArg_Parse("l") is a standard C "long".
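
To make the disagreement concrete, here is a minimal sketch of the struct.pack case from the quoted line. It assumes a modern Python 3, and it assumes that the explicit .encode() call stands in for whatever the wu() helper does, which the thread does not spell out. The choice of UTF-16 little-endian as the packed form is likewise only an illustration.

```python
import struct

text = u"VS_VERSION_INFO"

# The same 15 characters produce different byte strings under each encoding,
# which is why struct.pack cannot guess one on its own.
encoded = {name: text.encode(name) for name in ("utf-8", "utf-16-le", "utf-16-be")}

# "32s" pads the value with NUL bytes up to 32 bytes (longer input is truncated).
# Using UTF-16-LE here is an assumption, not something the thread specifies.
packed = struct.pack("32s", encoded["utf-16-le"])
```

The UTF-8 form is 15 bytes and both UTF-16 forms are 30 bytes, but the two UTF-16 forms differ in byte order, so the caller still has to pick one somewhere.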

Maybe the confusion is that wherever I have said "unicode" in the past I
should have said "wchar_t". I know there are, in theory, many encodings of
Unicode, but in practice there is only one that I'm interested in most of the
time, and that's wchar_t, because that's what all my APIs want.

So, I would like PyArg_Parse/Py_BuildValue formats that are symmetric to "s", 
"s#" and "z" but that return wchar_t strings and that work with both 
UnicodeObjects and StringObjects.
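
As a rough Python-level analogy for the behaviour requested above (not the C API formats themselves), the sketch below uses ctypes to produce a NUL-terminated wchar_t buffer from either a unicode string or a byte string. The helper name as_wchar_buffer and its default "ascii" decoding are hypothetical and chosen only for illustration.

```python
import ctypes

# Width of the platform's wchar_t in bytes; this varies by platform,
# commonly 2 on Windows and 4 on Linux and macOS.
WCHAR_SIZE = ctypes.sizeof(ctypes.c_wchar)


def as_wchar_buffer(value, encoding="ascii"):
    """Hypothetical helper: NUL-terminated wchar_t buffer from str or bytes."""
    if isinstance(value, bytes):
        # Byte strings carry no character information, so a decoding
        # has to be assumed; "ascii" is only a placeholder default.
        value = value.decode(encoding)
    return ctypes.create_unicode_buffer(value)


buf = as_wchar_buffer(u"VS_VERSION_INFO")

# Raw bytes in the C library's own wchar_t layout, including the terminator.
raw = ctypes.string_at(buf, ctypes.sizeof(buf))
```

The caller never names an encoding for the unicode case: the layout is whatever the platform's wchar_t is, in the same way that "l" yields whatever the platform's long is.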
--
- Jack Jansen                http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -