[I18n-sig] How does Python Unicode treat surrogates?
Tom Emerson
tree@basistech.com
Mon, 25 Jun 2001 15:17:57 -0400
Fredrik Lundh writes:
> I'm sceptical -- I see very little reason to maintain that distinction.
> let's use either UCS-2 or UCS-4 for the internal storage, stick to the
> "character strings are character sequences" concept, and keep the
> UTF-16 surrogate issue where it belongs: in the codecs.
How, then, is u"\U00200000" represented internally if UCS-2 is the
internal storage representation?
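
To make the question concrete, here is a minimal sketch of the arithmetic
UTF-16 uses to split a code point above U+FFFF into two 16-bit code units,
which is the only way a 16-bit storage unit can carry such a character.
The helper name to_surrogate_pair is hypothetical, not part of Python.
The sketch uses U+20000 as its sample because the escape as written,
\U00200000, lies above U+10FFFF and so has no UTF-16 form at all.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary-plane code point into UTF-16 surrogates.

    Hypothetical helper for illustration; returns (high, low) code units.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    offset = code_point - 0x10000           # 20-bit value
    high = 0xD800 + (offset >> 10)          # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)         # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x20000)
print(hex(high), hex(low))
```

A single 16-bit unit cannot hold either value's original code point, so
storage built on 16-bit units ends up holding the pair, and the surrogates
become visible outside the codecs.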
--
Tom Emerson Basis Technology Corp.
Sr. Sinostringologist http://www.basistech.com
"Beware the lollipop of mediocrity: lick it once and you suck forever"