[Python-Dev] RE: [Patches] [Patch #100745] Fix PR #384, fixes UTF-8 en/decode

Bill Tutt rassilon@list.org
Thu, 6 Jul 2000 16:28:02 -0700 (PDT)


On Thu, 6 Jul 2000, M.-A. Lemburg wrote:

> Bill Tutt wrote:
> > 
> > On Thu, 6 Jul 2000, Guido van Rossum wrote:
> > 
> > > > In any event, having the typedef is still useful since it clarifies the
> > > > meaning behind the code.
> > >
> > 
> > How about this:
> > /*
> >  * Use this typedef when you need to represent a UTF-16 surrogate pair
> >  * as a single unsigned integer.
> >  */
> > #if SIZEOF_INT >= 4
> > typedef unsigned int Py_UCS4;
> > #else
> > #if SIZEOF_LONG >= 4
> > typedef unsigned long Py_UCS4;
> > #else
> > #error "can't find integral type that can contain 32 bits"
> > #endif /* SIZEOF_LONG */
> > #endif /* SIZEOF_INT */
> 
> I like the name... Py_UCS4 is indeed what we're talking about
> here.
> 
> What I don't understand is why you raise a compile error; AFAIK,
> unsigned long is at least 32 bits on all platforms and that's
> what the Unicode implementation would need to support UCS4 -- more
> bits don't do any harm since the storage type is fixed at
> 16-bit UTF-16 values.
> 

Loud and obnoxious failures are much better than silent, hard-to-debug
failures. :) If unsigned long is big enough on all of the platforms we
care about, then nobody will ever see the compile error. :)
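
For illustration, here is a minimal sketch of how such a typedef might be
used to hold a combined surrogate pair. So that it compiles standalone, it
assumes <limits.h> checks in place of the configure-generated SIZEOF_INT and
SIZEOF_LONG macros. join_surrogates is a hypothetical helper name, not
something in the patch. The asserts follow the same "fail loudly" idea at
run time.

```c
#include <assert.h>
#include <limits.h>

/*
 * Stand-in for the proposed typedef. Assumption: <limits.h> is used here
 * instead of SIZEOF_INT / SIZEOF_LONG so the sketch builds on its own.
 */
#if UINT_MAX >= 0xFFFFFFFF
typedef unsigned int Py_UCS4;
#elif ULONG_MAX >= 0xFFFFFFFF
typedef unsigned long Py_UCS4;
#else
#error "can't find integral type that can contain 32 bits"
#endif

/*
 * Hypothetical helper: combine a UTF-16 surrogate pair into the single
 * code point it encodes (U+10000 through U+10FFFF).
 */
static Py_UCS4 join_surrogates(Py_UCS4 high, Py_UCS4 low)
{
    /* Fail loudly on input that is not a valid surrogate pair. */
    assert(high >= 0xD800 && high <= 0xDBFF);
    assert(low >= 0xDC00 && low <= 0xDFFF);

    /* Each surrogate carries 10 payload bits; the result is offset by 0x10000. */
    return 0x10000 + (((high - 0xD800) << 10) | (low - 0xDC00));
}
```

With this, join_surrogates(0xD83D, 0xDE00) yields 0x1F600, which needs more
than 16 bits and so would not fit in a single UTF-16 storage unit.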

Bill