[Python-Dev] Unicode and Windows

M.-A. Lemburg mal@lemburg.com
Tue, 21 Mar 2000 18:44:11 +0100

"Fred L. Drake, Jr." wrote:
> M.-A. Lemburg writes:
>  > And/or perhaps specific APIs for each OS... e.g.
>  >
>  >      PyOS_MBCSFromObject() (only on WinXX)
>  >      PyOS_AppleFromObject() (only on Mac ;)
>   Another approach may be to add some format modifiers:
>         te -- text in an encoding specified by a C string (somewhat
>               similar to O&)
>         tE -- text, encoding specified by a Python object (probably a
>               string passed as a parameter or stored from some other
>               call)
>   (I'd prefer the [eE] before the t, but the O modifiers follow, so
> consistency requires this ugly construct.)
>   This brings up the issue of using a hidden conversion function which
> may create a new object that needs the same lifetime guarantees as the
> real parameters; we discussed this issue a month or two ago.
>   Somewhere, there's a call context that includes the actual parameter
> tuple.  PyArg_ParseTuple() could have access to a "scratch" area where
> it could place objects constructed during parameter parsing.  This
> area could just be a hidden tuple.  When the C call returns, the
> scratch area can be discarded.
>   The difficulty is in giving PyArg_ParseTuple() access to the scratch
> area, but I don't know how hard that would be off the top of my head.

Some time ago, I considered adding a "U+" format with built-in
auto-conversion to the tuple parser. After some discussion of the
error handling issues this raises, I dropped the idea and used the
standard "O" format plus a call to a helper function which applies
the conversion.

(Note the "+" after the "U": it was intended to indicate that the
returned object has had its refcount incremented and that the
caller must take care of decrementing it again.)

The "O" + helper approach is a little clumsy, but works
just fine. Plus it doesn't add any more overhead to the
already convoluted PyArg_ParseTuple().
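
To make the ownership rule concrete, here is a toy model in plain C.
It is only a sketch: Obj, obj_new(), obj_decref(), to_unicode() and
text_length() are hypothetical stand-ins invented for illustration,
not the real Python C API. The argument is borrowed (as with "O"),
the helper always hands back a new reference, and the caller
releases it after use.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a reference-counted object (not PyObject). */
typedef struct {
    int refcount;
    int is_unicode;   /* 1 if already in the target representation */
    char text[32];
} Obj;

/* Create an object holding a (possibly truncated) copy of text. */
static Obj *obj_new(const char *text, int is_unicode)
{
    Obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcount = 1;
    o->is_unicode = is_unicode;
    strncpy(o->text, text, sizeof o->text - 1);
    o->text[sizeof o->text - 1] = '\0';
    return o;
}

/* Drop one reference; free the object when none are left. */
static void obj_decref(Obj *o)
{
    if (--o->refcount == 0)
        free(o);
}

/* Helper in the role of the conversion function: it ALWAYS returns a
   new reference (or NULL on error), whether or not it had to convert. */
static Obj *to_unicode(Obj *arg)
{
    if (arg->is_unicode) {
        arg->refcount++;              /* same object, one more owner */
        return arg;
    }
    return obj_new(arg->text, 1);     /* converted copy, refcount 1 */
}

/* Function body: arg is borrowed, as it would be after parsing "O". */
static int text_length(Obj *arg)
{
    Obj *u = to_unicode(arg);
    int n;
    if (u == NULL)
        return -1;                    /* error handling stays with the caller */
    n = (int)strlen(u->text);
    obj_decref(u);                    /* release the helper's reference */
    return n;
}
```

Because the helper returns an owned reference in both cases, the
caller's cleanup is the same whether or not a conversion took place.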

BTW, what other external char formats are we talking about?
E.g. how do you handle MBCS or DBCS under WinXX? Are there
routines that convert wchar_t buffers into either of the two?
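
On the wchar_t question, one portable candidate is the standard C
wcstombs(), which converts a wide string into the multibyte encoding
of the current locale; the Win32 counterpart would be
WideCharToMultiByte(). A minimal sketch using only the standard
routine follows. The wrapper name wide_to_multibyte() is made up for
this example, and what the output encoding is depends entirely on the
locale in effect.

```c
#include <stddef.h>
#include <stdlib.h>

/* Convert a NUL-terminated wide string into the current locale's
   multibyte encoding. Returns the number of bytes written (excluding
   the terminating NUL), or (size_t)-1 if a character cannot be
   converted or the result does not fit into dst with its terminator. */
static size_t wide_to_multibyte(const wchar_t *src, char *dst, size_t dstsize)
{
    size_t n = wcstombs(dst, src, dstsize);

    if (n == (size_t)-1)
        return (size_t)-1;      /* unconvertible wide character */
    if (n == dstsize)
        return (size_t)-1;      /* buffer filled up, no room for the NUL */
    return n;                   /* dst is NUL-terminated here */
}
```

In the default "C" locale this is only guaranteed to work for the
basic character set; for MBCS or DBCS output the program would first
have to select a suitable locale with setlocale().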

Marc-Andre Lemburg
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/