[Python-Dev] New Py_UNICODE doc
Nicholas Bastin
nbastin at opnet.com
Sun May 8 20:44:00 CEST 2005
On May 8, 2005, at 5:28 AM, Martin v. Löwis wrote:
> Nicholas Bastin wrote:
>> All of my proposals for what to change the documentation to have been
>> shot down by Martin. If someone has better verbiage that they'd like
>> to see, I'd be perfectly happy to patch the doc.
>
> I don't look into the specific wording - you speak English much better
> than I do. What I care about is that this part of the documentation
> should be complete and precise. I.e. statements like "should not make
> assumptions" might be fine, as long as they are still followed by
> a precise description of what the code currently does. So it should
> mention that the representation can be either 2 or 4 bytes, that
> the strings "ucs2" and "ucs4" can be used to select one of them,
> that it is always 2 bytes on Windows, that 2 bytes means that non-BMP
> characters can be represented as surrogate pairs, and so on.
It's not always 2 bytes on Windows. Users can alter the config options
(and not unreasonably so, btw, on 64-bit Windows platforms).
This goes to an issue that I think people don't understand: we
have to assume that some users will build their own Python. This will
result in 2-byte Pythons on RHL9 and 4-byte Pythons on Windows. Both
have been claimed in this discussion not to happen, which is untrue.
You can't build a binary extension module on Windows and assume that
Py_UNICODE is 2 bytes, because that's not enforced in any way. The
same is true for 4-byte Py_UNICODE on RHL9.
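To illustrate the point, here is a minimal sketch of checking the width at runtime instead of assuming it from the platform. It relies on sys.maxunicode, which reports 0xFFFF on a 2-byte build and 0x10FFFF on a 4-byte build. The helper names py_unicode_width and code_units_needed are made up for this example and are not part of any Python API.

```python
import sys


def py_unicode_width(maxunicode=None):
    """Return the code-unit width in bytes implied by sys.maxunicode.

    Hypothetical helper for illustration only. Note that on Python 3.3
    and later sys.maxunicode is always 0x10FFFF, so this check is only
    informative on older interpreters with a build-time width.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # 2-byte (UCS-2) builds report 0xFFFF; 4-byte (UCS-4) builds report 0x10FFFF.
    return 2 if maxunicode == 0xFFFF else 4


def code_units_needed(code_point, width):
    """Return how many code units one code point occupies at the given width.

    On a 2-byte build a non-BMP character (above U+FFFF) is stored as a
    surrogate pair, i.e. two code units; otherwise one unit is enough.
    """
    if width == 2 and code_point > 0xFFFF:
        return 2
    return 1


if __name__ == "__main__":
    width = py_unicode_width()
    print("code-unit width in bytes:", width)
    # U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP.
    print("units for U+1D11E:", code_units_needed(0x1D11E, width))
```

This only tells you about the interpreter you are running under. A binary extension module would still need to be compiled against the same configuration as the interpreter that loads it.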
--
Nick