[Python-Dev] len(chr(i)) = 2?
R. David Murray
rdmurray at bitdance.com
Mon Nov 22 18:30:29 CET 2010
On Mon, 22 Nov 2010 12:00:14 -0500, Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:
> I recently updated chr() and ord() documentation and used
> "narrow/wide" terms. I thought UCS2/4 proponents objected to that on
> the basis that these terms are imprecise.
For reference, a grep in py3k/Doc reveals that there are currently exactly
23 lines mentioning UCS2 or UCS4 in the docs. Most are in the unicode part
of the c-api, and 6 are in what's new for 2.2:
c-api/arg.rst: Convert a null-terminated buffer of Unicode (UCS-2 or UCS-4) data to a Python
c-api/arg.rst: Convert a Unicode (UCS-2 or UCS-4) data buffer and its length to a Python
c-api/unicode.rst: for :c:type:`Py_UNICODE` and store Unicode values internally as UCS2. It is also
c-api/unicode.rst: possible to build a UCS4 version of Python (most recent Linux distributions come
c-api/unicode.rst: with UCS4 builds of Python). These builds then use a 32-bit type for
c-api/unicode.rst: :c:type:`Py_UNICODE` and store Unicode data internally as UCS4. On platforms
c-api/unicode.rst: short` (UCS2) or :c:type:`unsigned long` (UCS4).
c-api/unicode.rst:Note that UCS2 and UCS4 Python builds are not binary compatible. Please keep
c-api/unicode.rst: values is interpreted as an UCS-2 character.
whatsnew/2.2.rst:usually stored as UCS-2, as 16-bit unsigned integers. Python 2.2 can also be
whatsnew/2.2.rst:compiled to use UCS-4, 32-bit unsigned integers, as its internal encoding by
whatsnew/2.2.rst:supplying :option:`--enable-unicode=ucs4` to the configure script. (It's also
whatsnew/2.2.rst:When built to use UCS-4 (a "wide Python"), the interpreter can natively handle
whatsnew/2.2.rst:compiled to use UCS-2 (a "narrow Python"), values greater than 65535 will still
whatsnew/2.2.rst:Marc-André Lemburg. The changes to support using UCS-4 internally were
howto/unicode.rst:.. comment Additional topic: building Python w/ UCS2 or UCS4 support
howto/unicode.rst: - [ ] Building Python (UCS2, UCS4)
library/sys.rst: characters are stored as UCS-2 or UCS-4.
library/json.rst: specified. Encodings that are not ASCII based (such as UCS-2) are not
faq/extending.rst:When importing module X, why do I get "undefined symbol: PyUnicodeUCS2*"?
faq/extending.rst:If instead the name of the undefined symbol starts with ``PyUnicodeUCS4``, the
faq/extending.rst: ... print('UCS4 build')
faq/extending.rst: ... print('UCS2 build')
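
For context on the two faq/extending.rst hits: the grep shows only the print lines, not the test that chooses between them. A minimal sketch of such a check, assuming it compares sys.maxunicode against 0xFFFF (the helper names unicode_build and code_units are made up for illustration and appear nowhere in the docs):

```python
import sys


def unicode_build():
    """Return 'UCS4' on a wide build and 'UCS2' on a narrow build.

    Assumption: build width is inferred from sys.maxunicode, which is
    0xFFFF on a narrow build and 0x10FFFF on a wide one.
    """
    return 'UCS4' if sys.maxunicode > 0xFFFF else 'UCS2'


def code_units(i):
    """Return len(chr(i)), the number of code units used for code point i."""
    return len(chr(i))


if __name__ == '__main__':
    print(unicode_build(), 'build')
    # A code point outside the BMP takes two code units (a surrogate
    # pair) on a narrow build and one on a wide build.
    print('len(chr(0x10000)) =', code_units(0x10000))
```

On a narrow build this is where the len(chr(i)) == 2 in the subject line comes from; on a wide build the same call gives 1.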
--
R. David Murray www.bitdance.com