[Python-Dev] PEP 393 close to pronouncement

Wed Sep 28 19:15:24 CEST 2011

2011/9/28 M.-A. Lemburg <mal at egenix.com>:
> Guido van Rossum wrote:
>> Given the feedback so far, I am happy to pronounce PEP 393 as
>> accepted. Martin, congratulations! Go ahead and mark it as Accepted.
>> (But please do fix up the small nits that Victor reported in his
>> earlier message.)
>
> I've been working on feedback for the last few days, but I guess it's
> too late. Here goes anyway...
>
> I've only read the PEP and not followed the discussion due to lack of
> time, so if any of this is no longer valid, that's probably because
> the PEP wasn't updated :-)
>
> Resizing
> --------
>
> Codecs use resizing a lot. Given that PyCompactUnicodeObject
> does not support resizing, most decoders will have to use
> PyUnicodeObject and thus not benefit from the memory footprint
> advantages of e.g. PyASCIIObject.
>
>
> Data structure
> --------------
>
> The data structure description in the PEP appears to be wrong:
>
> PyASCIIObject has a wchar_t *wstr pointer - I guess this should
> be a char *str pointer, otherwise, where's the memory footprint
> advantage (esp. on Linux where sizeof(wchar_t) == 4) ?
>
> I also don't see a reason to limit the UCS1 storage version
> to ASCII. Accordingly, the object should be called PyLatin1Object
> or PyUCS1Object.

I think the purpose is that if the string is only ASCII, no work is
needed to encode it to UTF-8: the stored bytes already are the UTF-8
form.
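
To illustrate the point, here is a minimal Python sketch (not code from
the PEP; the helper name utf8_is_identity is made up for this example).
It compares the one-byte-per-character form of a string with its UTF-8
encoding. For pure ASCII the two are byte-identical, while a Latin-1
character above U+007F needs two bytes in UTF-8:

```python
def utf8_is_identity(text: str) -> bool:
    """Return True if the UTF-8 encoding of `text` equals its
    one-byte-per-character (Latin-1) encoding.

    Hypothetical helper for illustration only.
    """
    try:
        one_byte = text.encode("latin-1")
    except UnicodeEncodeError:
        # Contains characters above U+00FF: no one-byte form exists.
        return False
    return text.encode("utf-8") == one_byte


ascii_text = "hello"
latin1_text = "caf\u00e9"  # "e" with acute accent, U+00E9

print(utf8_is_identity(ascii_text))   # True
print(utf8_is_identity(latin1_text))  # False
print(len(latin1_text.encode("latin-1")), len(latin1_text.encode("utf-8")))  # 4 5
```

So a Latin-1 (UCS1) representation would save memory for strings like
"caf\u00e9", but only the ASCII subset can hand out its buffer as UTF-8
without a conversion step.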

-- 
Regards,
Benjamin