[Python-Dev] [Python-checkins] cpython: Document requierements of Unicode kinds

Terry Reedy tjreedy at udel.edu
Wed Oct 5 21:25:22 CEST 2011



On 10/5/2011 1:43 PM, victor.stinner wrote:
> http://hg.python.org/cpython/rev/055174308822
> changeset:   72699:055174308822
> user:        Victor Stinner<victor.stinner at haypocalc.com>
> date:        Wed Oct 05 01:31:05 2011 +0200
> summary:
>    Document requierements of Unicode kinds
>
> files:
>    Include/unicodeobject.h |  24 ++++++++++++++++++++----
>    1 files changed, 20 insertions(+), 4 deletions(-)
>
>
> diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h
> --- a/Include/unicodeobject.h
> +++ b/Include/unicodeobject.h
> @@ -288,10 +288,26 @@
>           unsigned int interned:2;
>           /* Character size:
>
> -           PyUnicode_WCHAR_KIND (0): wchar_t*
> -           PyUnicode_1BYTE_KIND (1): Py_UCS1*
> -           PyUnicode_2BYTE_KIND (2): Py_UCS2*
> -           PyUnicode_4BYTE_KIND (3): Py_UCS4*
> +           - PyUnicode_WCHAR_KIND (0):
> +
> +             * character type = wchar_t (16 or 32 bits, depending on the
> +               platform)
> +
> +           - PyUnicode_1BYTE_KIND (1):
> +
> +             * character type = Py_UCS1 (8 bits, unsigned)
> +             * if ascii is 1, at least one character must be in range
> +               U+80-U+FF, otherwise all characters must be in range U+00-U+7F

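Given that 1 == True, this looks backwards: I would expect ascii == 1 to
mean that all characters are in range U+00-U+7F, and ascii == 0 to mean
that at least one character is in range U+80-U+FF.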

> +
> +           - PyUnicode_2BYTE_KIND (2):
> +
> +             * character type = Py_UCS2 (16 bits, unsigned)
> +             * at least one character must be in range U+0100-U+1FFFF

s/U+1FFFF/U+FFFF/ ? A 16-bit Py_UCS2 cannot hold anything above U+FFFF.
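
To make the two points above concrete, here is a rough sketch of the
invariants as I read them, not the actual CPython code. All names
(sketch_kind, sketch_form, sketch_classify) are made up for illustration,
and the 4-byte value of 3 is simply taken from the old comment in the diff.
It assumes the string should always use the narrowest kind that fits its
largest character, with ascii set only when everything is at most U+7F.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; this is not the CPython API. */
enum sketch_kind { SKETCH_1BYTE = 1, SKETCH_2BYTE = 2, SKETCH_4BYTE = 3 };

struct sketch_form {
    enum sketch_kind kind;
    int ascii;              /* 1 only if every character is <= U+007F */
};

/* Pick the narrowest kind that can hold the largest character. */
static struct sketch_form
sketch_classify(const uint32_t *chars, size_t len)
{
    struct sketch_form form;
    uint32_t maxchar = 0;
    size_t i;

    assert(chars != NULL || len == 0);

    for (i = 0; i < len; i++) {
        if (chars[i] > maxchar)
            maxchar = chars[i];
    }

    form.ascii = (maxchar <= 0x7F);
    if (maxchar <= 0xFF)
        form.kind = SKETCH_1BYTE;   /* U+0000-U+00FF */
    else if (maxchar <= 0xFFFF)
        form.kind = SKETCH_2BYTE;   /* at least one char in U+0100-U+FFFF */
    else
        form.kind = SKETCH_4BYTE;   /* at least one char above U+FFFF */
    return form;
}
```

Under that reading, a string containing U+00E9 is 1-byte kind with ascii
== 0, and the 2-byte kind never sees anything above U+FFFF.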

Terry

