[Python-Dev] New Py_UNICODE doc

Nicholas Bastin nbastin at opnet.com
Sat May 7 02:06:55 CEST 2005


On May 6, 2005, at 7:45 PM, Martin v. Löwis wrote:

> Nicholas Bastin wrote:
>> Because the encoding of that buffer appears to be different depending
>> on the configure options.
>
> What makes it appear so? sizeof(Py_UNICODE) changes when you change
> the option - does that, in your mind, mean that the encoding changes?

Yes.  Not only in my mind, but in the Python source code.  If 
Py_UNICODE is 4 bytes wide, then the encoding is UTF-32 (UCS-4), 
otherwise the encoding is UTF-16 (*not* UCS-2).
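
To make the UTF-16 versus UCS-2 distinction concrete, here is a minimal
sketch (not taken from the Python source) that counts the code units a
single code point occupies under each encoding. The helper names
code_units and utf16_units are hypothetical, chosen only for this
illustration.

```python
import struct


def code_units(text, encoding, unit_size):
    """Return how many fixed-size code units `text` occupies in `encoding`."""
    return len(text.encode(encoding)) // unit_size


def utf16_units(text):
    """Return the 16-bit code units of `text` encoded as UTF-16 (no BOM)."""
    raw = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


bmp_char = "\u00f6"          # U+00F6, inside the Basic Multilingual Plane
astral_char = "\U0001d11e"   # U+1D11E, outside the BMP

# A BMP character needs one unit in either encoding.
bmp_utf16 = code_units(bmp_char, "utf-16-le", 2)
bmp_utf32 = code_units(bmp_char, "utf-32-le", 4)

# A non-BMP character needs a surrogate pair (two units) in UTF-16,
# which UCS-2 cannot express, but still one unit in UTF-32.
astral_utf16 = code_units(astral_char, "utf-16-le", 2)
astral_utf32 = code_units(astral_char, "utf-32-le", 4)
astral_pair = utf16_units(astral_char)
```

A buffer of 2-byte units that can contain such surrogate pairs is UTF-16
rather than UCS-2, which is the distinction drawn above.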

>> If that isn't true, then someone needs to change
>> the doc, and the configure options.  Right now, it seems *very* clear
>> that Py_UNICODE may either be UCS-2 or UCS-4 encoded if you read the
>> configure help, and you can't use the buffer directly if the encoding
>> is variable.  However, you seem to be saying that this isn't true.
>
> It's a compile-time option (as all configure options). So at run-time,
> it isn't variable.

What I mean by 'variable' is that you can't make any assumption as to 
what the size will be in any given Python when you're writing (and 
building) an extension module.  This breaks binary compatibility of 
extension modules on the same platform and same version of Python 
across interpreters which may have been built with different configure 
options.
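
As a rough sketch of the kind of guard this implies, the snippet below
compares the unit width an extension was built against with the width of
the running interpreter. It assumes that sys.maxunicode is 0xFFFF on a
2-byte build and 0x10FFFF on a 4-byte build, which held for interpreters
of this era (since Python 3.3 it is always 0x10FFFF). The function names
and the built_width parameter are hypothetical, not part of any real API.

```python
import sys


def unicode_unit_width(maxunicode):
    """Width in bytes of one Py_UNICODE unit implied by a sys.maxunicode value."""
    return 2 if maxunicode == 0xFFFF else 4


def check_extension_compatible(built_width, maxunicode=sys.maxunicode):
    """Raise ImportError if the extension's unit width differs from the interpreter's.

    `built_width` stands in for the sizeof(Py_UNICODE) the extension saw at
    compile time (a hypothetical value recorded by the extension's build).
    """
    running_width = unicode_unit_width(maxunicode)
    if built_width != running_width:
        raise ImportError(
            "extension built for %d-byte Py_UNICODE, interpreter uses %d-byte"
            % (built_width, running_width)
        )
    return running_width
```

A check like this only reports the mismatch; it does not make a single
binary usable with both kinds of interpreter.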

--
Nick


