[Python-Dev] UCS2/UCS4 default

Guido van Rossum guido at python.org
Wed Jul 2 19:42:13 CEST 2008


On Wed, Jul 2, 2008 at 10:19 AM, Jeroen Ruigrok van der Werven
<asmodai at in-nomine.org> wrote:
> -On [20080702 19:08], Guido van Rossum (guido at python.org) wrote:
>>I think we should continue to leave this up to the distribution. AFAIK
>>many Linux distros already use UCS4 for everything anyway.
>
> FreeBSD's ports makes it a configure option.
>
>>For that reason I think it's also better that the configure script
>>continues to default to UTF-16 -- this will give the UTF-16 support
>>code the necessary exercise. (It is mostly a superset of the UCS-4
>>support code, so I'm less worried about the latter getting enough
>>exercise.)
>
> I was under the impression that it was still UCS2 and thus limiting things
> to the BMP only. So you are saying it's UTF-16 nowadays? For both 2.6 and
> 3.0?

Yes. At least in the sense that a \Uxxxxxxxx escape for a code point
above U+FFFF gets translated to a surrogate pair, and that the UTF-8
codec supports surrogate pairs in both directions (encoding and
decoding). It's been like this for a long time. What else would you
expect from UTF-16 support?
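
To make the surrogate-pair translation concrete, here is a rough sketch of
the arithmetic involved. The helper names to_surrogate_pair and
from_surrogate_pair are made up for illustration and are not CPython
functions; the sketch only relies on ord(), str.encode() and the struct
module, and cross-checks the hand-computed pair against the utf-16-be codec.

```python
import struct


def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into a UTF-16 (high, low) pair.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


ch = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP

# Hand-computed pair for the escape above.
high, low = to_surrogate_pair(0x1D11E)

# Cross-check: the UTF-16 codec emits the same two 16-bit code units.
units = struct.unpack(">2H", ch.encode("utf-16-be"))

# The UTF-8 codec uses a single 4-byte sequence for the same character.
utf8 = ch.encode("utf-8")

print("%04X %04X" % (high, low), units == (high, low), utf8)
```

The arithmetic does not depend on how the interpreter stores strings
internally, so it should behave the same on a UCS2/UTF-16 build and on a
UCS4 build; only things like len() of the string differ between the two.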

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

