[Python-Dev] PyArg_ParseTuple and unicode

M.-A. Lemburg mal@lemburg.com
Tue, 27 Nov 2001 16:34:36 +0100


Jack Jansen wrote:
> 
> I had expected the PyArg_ParseTuple() "u" specifier to automatically convert
> string objects to unicode strings with the default encoding (just as the
> reverse is true for "s"), but to my surprise it doesn't.
> 
> Is there a deep reason for this, or am I the first person to want this
> functionality? Or do I miss something and should I use a different format
> specifier?

Because of the problems around auto-conversion of objects to Unicode,
the currently recommended pattern is:

1. fetch the object using the "O" parser marker, and
2. convert it to Unicode using PyObject_Unicode().

I could imagine extending the "u" parser marker to do the same
using a temporary Unicode object whose contents are then copied
into a user-provided buffer (much like what "es#" does).
Alternatively, we could extend the "e" parser marker with a
"u" target. This has the added benefit of letting the caller
specify the encoding to use for non-Unicode string data.

If you think this is needed, please file a feature request on
SF and assign it to me. I'll look into it after the feature
freeze.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/