[Python-3000] UTF-16
Andrew Clover
and at doxdesk.com
Sat Sep 16 03:08:06 CEST 2006
On 2006-09-01, Paul Prescod wrote:
> I cannot understand why a user should be forced to choose between 16 and 32
> bit strings AT BUILD TIME.
I strongly agree. This has been troublesome for many: not just for people
trying to install binary libs, but also for Python code that actually
needs to know the difference between unicode and wide-unicode characters.
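For illustration, the check such code ends up making looks roughly like
the sketch below. The helper name is_wide_build is hypothetical;
sys.maxunicode is the real interpreter attribute. On a narrow (16-bit)
build it is 0xFFFF, on a wide build it is 0x10FFFF.

```python
import sys

def is_wide_build():
    """Return True if this interpreter indexes strings by full code point.

    Hypothetical helper for illustration only. Narrow (UCS-2) builds
    report sys.maxunicode == 0xFFFF; wide (UCS-4) builds report 0x10FFFF.
    """
    return sys.maxunicode > 0xFFFF
```

Code that indexes or slices strings containing characters beyond U+FFFF
has to branch on a test like this, which is exactly the burden a
build-time choice imposes.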
Ideally, implementation work notwithstanding, I would *love* to be able
to have both types at a literal level (as unicode subclasses), along
with retained byte string literals.
    ucs2string = u'\U00010000'  # 2 chars, \ud800\udc00
    ucs4string = w'\U00010000'  # 1 char
    bytestring = b'abc'
    string = 'abc'              # byte in 2.x, ucs2 in 3.0
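The "2 chars" versus "1 char" distinction above can be checked without
any new literal syntax. This is only a sketch, assuming Python 3; the
function name utf16_code_units is made up for the example, while
str.encode and struct.unpack are the standard calls.

```python
import struct

def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of ints.

    Hypothetical helper for illustration only.
    """
    # An explicit byte order ('utf-16-le') means no BOM is prepended.
    data = text.encode('utf-16-le')
    # Each code unit is one unsigned 16-bit little-endian integer.
    return list(struct.unpack('<%dH' % (len(data) // 2), data))

s = '\U00010000'
units = utf16_code_units(s)
# A 16-bit string sees len(units) items (a surrogate pair);
# a 32-bit string sees a single code point.
```

Here units holds the surrogate pair 0xD800, 0xDC00, matching the
comment on the ucs2string line.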
If these were all subclasses of basestring, and other string type
subclasses could be defined taking advantage of basic string methods,
that could also accommodate the CSI approach from Matz that you posted about.
Although I'm personally not at all a fan of non-Unicode string types and
would rather die than put i-mode emoji in a character set :-)
--
And Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/