[Python-3000] UTF-16

Sat Sep 16 03:08:06 CEST 2006

On 2006-09-01, Paul Prescod wrote:

> I cannot understand why a user should be forced to choose between 16 and 32
> bit strings AT BUILD TIME.

I strongly agree. The build-time choice has been troublesome for many: 
not just people trying to install binary libs, but also anyone writing 
Python code that actually needs to know the difference between unicode 
and wide-unicode characters.

Ideally, implementation work notwithstanding, I would *love* to be able 
to have both types at a literal level (as unicode subclasses), along 
with retained byte string literals.

     ucs2string = u'\U00010000'  # 2 chars, \ud800\udc00
     ucs4string = w'\U00010000'  # 1 char
     bytestring = b'abc'
     string = 'abc'              # byte in 2.x, ucs2 in 3.0
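
To spell out the "2 chars" comment above, here is a rough sketch of the 
surrogate arithmetic. It assumes a modern Python 3 interpreter (3.3 or 
later, where a str is always indexed by code point), not the proposed 
u''/w'' literals, and the helper names utf16_units and to_surrogates 
are hypothetical, for illustration only.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def to_surrogates(code_point):
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary code points need surrogates")
    offset = code_point - 0x10000          # 20 bits of payload
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


char = "\U00010000"
units = utf16_units(char)
pair = to_surrogates(ord(char))

# One code point, but two 16-bit code units once encoded as UTF-16.
print(len(char), len(units), [hex(u) for u in units])
# prints: 1 2 ['0xd800', '0xdc00']
```

The same character is length 1 when counted in code points and length 2 
when counted in UTF-16 code units, which is exactly the difference code 
has to care about.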

If these were all subclasses of basestring, and other string type 
subclasses could be defined taking advantage of basic string methods, 
that could also allow the CSI stuff Matz mentioned in what you posted. 
Although I'm personally not at all a fan of non-Unicode string types and 
would rather die than put i-mode emoji in a character set :-)

-- 
And Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/