[Python-Dev] len(chr(i)) = 2?

Nick Coghlan ncoghlan at gmail.com
Mon Nov 22 16:37:21 CET 2010


On Mon, Nov 22, 2010 at 10:47 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> Please also note that we have used the terms UCS-2 and UCS-4 in Python2
> for 9+ years now and users are just starting to learn the difference
> and get acquainted with the fact that Python uses these two forms.
>
> Confronting them with "narrow" and "wide" builds is only
> going to cause more confusion, not less, and adding those
> strings to Python package files isn't going to help much either,
> since the terms don't convey any relationship to Unicode:

I was personally surprised to learn in this discussion that there had
even been an *attempt* to change the names of the two build variants
to anything other than UCS2/UCS4. The concrete API implementations
certainly still use those two terms to prevent inadvertent linkage
with the wrong version of the C API.

For practical purposes, UCS2/UCS4 convey far more inherent information
than narrow/wide:
- many developers will recognise them as Unicode related, even if they
don't know exactly what they mean
- even those that don't recognise them can soon learn that they're
Unicode related just by plugging them into Google*
- a bit more digging should reveal that they're Unicode storage
formats closely related to the UTF-16 and UTF-32 transfer encodings
respectively*

*(The first Google hit for "ucs2" is the UTF-16/UCS-2 article on
Wikipedia, the first hit for "ucs4" is the UTF-32/UCS-4 article)
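
As a rough illustration of the storage-format point above, the sketch
below counts code units for a character outside the Basic Multilingual
Plane. The helper names describe_build and code_units are made up for
this example. It assumes the sys.maxunicode convention of 0xFFFF on a
UCS2 build; on Python 3.3 and later the build distinction no longer
exists and sys.maxunicode is always 0x10FFFF.

```python
import sys


def describe_build():
    """Guess the build variant from sys.maxunicode (hypothetical helper)."""
    # 0xFFFF means 16-bit storage (UCS2); anything larger is reported as UCS4.
    return "UCS2" if sys.maxunicode == 0xFFFF else "UCS4"


def code_units(text, encoding):
    """Count the code units needed to encode text (hypothetical helper)."""
    # Bytes per code unit for the two transfer encodings compared here.
    unit_size = {"utf-16-le": 2, "utf-32-le": 4}[encoding]
    return len(text.encode(encoding)) // unit_size


astral = "\U00010000"  # first code point beyond the 16-bit range

print(describe_build())
print(code_units(astral, "utf-16-le"))  # 2: encoded as a surrogate pair
print(code_units(astral, "utf-32-le"))  # 1: one 32-bit unit per code point
```

On a UCS2 build the same character is stored as that surrogate pair,
which is how len(chr(i)) can come out as 2.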

All that just armed with Google, without even looking at the Python
docs specifically.

So don't just think about "what will developers know?"; also think
about "what will developers know, and what will a quick trip to a
search engine tell them?". Once you take that stance, the overly
generic narrow/wide terms fail badly.

+1 for MAL's suggested tweaks to the Py3k configure options.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

