[I18n-sig] Re: Unicode surrogates: just say no!
François Pinard
pinard@iro.umontreal.ca
02 Jul 2001 15:05:35 -0400
[Guido van Rossum]
> When using UCS-4 mode, I was in favor of allowing unichr() and \U to
> specify any value in range(0x100000000L)
I did not check recently, but I would think Unicode and ISO 10646 are defined
on 31 bits, not 32. If you represent a UCS-4 code within a signed 32-bit int,
it will never be negative. It might be useful to rely on this.
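
As a small illustration of the sign argument, here is a sketch in Python. It assumes the 31-bit limit mentioned above; the helper name as_signed_32 and the constant are made up for this example and are not part of any Python API.

```python
import struct

# Assumption for this sketch: codes are limited to 31 bits.
UCS4_MAX_31BIT = 0x7FFFFFFF

def as_signed_32(code):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit int."""
    return struct.unpack("<i", struct.pack("<I", code))[0]

# The largest 31-bit code keeps the sign bit clear, so it stays non-negative.
top_31 = as_signed_32(UCS4_MAX_31BIT)

# The first value that needs bit 32 sets the sign bit and turns negative.
first_32 = as_signed_32(0x80000000)
```

With a 31-bit limit, a negative value in a signed 32-bit int could then serve as an "invalid" or "no character" marker, which is one way to rely on the property.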
P.S. - Would not 32 bits also require one more byte in UTF-8?
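
The byte counts can be checked with a rough sketch, assuming the original UTF-8 scheme of up to 6 bytes (as in RFC 2279), where an n-byte sequence (n >= 2) carries (7 - n) + 6 * (n - 1) payload bits. The function utf8_length is hypothetical, and the 7-byte form is only an extrapolation of the pattern, not part of any standard.

```python
def utf8_length(code):
    """Bytes needed for `code`, extrapolating the original UTF-8 bit layout."""
    bits = code.bit_length()
    if bits <= 7:
        return 1  # plain ASCII form: 0xxxxxxx
    # n == 7 is a hypothetical extension; the original scheme stops at 6.
    for n in range(2, 8):
        payload_bits = (7 - n) + 6 * (n - 1)
        if bits <= payload_bits:
            return n
    raise ValueError("value too large for this sketch")

six = utf8_length(0x7FFFFFFF)    # largest 31-bit value
seven = utf8_length(0x80000000)  # first value needing 32 bits
```

Under these assumptions, 31 bits fit exactly in the 6-byte form, while any value using the 32nd bit would spill into a seventh byte.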
--
François Pinard http://www.iro.umontreal.ca/~pinard