Re: [Python-Dev] PEP 393: Special-casing ASCII-only strings

15 Sep 2011

On Fri, Sep 16, 2011 at 7:39 AM, "Martin v. Löwis" <martin@v.loewis.de> wrote:
...
> Thinking about this, the following may work:
> - ASCIIObject: state, length, hash, wstr*, data follow
> - SingleBlockUnicode: ASCIIObject, wstr_len,
>                      utf8*, utf8_len, data follow
> - UnicodeObject: SingleBlockUnicode, data pointer, no data follow
> This is essentially your proposal, except that the wstr_len is dropped for
> ASCII strings, and that it uses nested structs.
> The single-block variants would always be "ready", the full unicode object
> is ready only if the data pointer is set.

In your "UnicodeObject" here, is the 'data pointer' the
any/latin1/ucs2/ucs4 union from the original structure definition?

Also, what are the constraints on the "SingleBlockUnicode"? Does it
only hold strings that can be represented in latin1? Or can the size
of the individual elements be more than 1 byte?
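
For concreteness, the nested layout under discussion might be sketched in C
roughly as follows. This is only an illustration under stated assumptions:
every Sketch*/sketch_* name is hypothetical, the usual object header is left
out, the field types are guesses, and the data pointer is shown as a plain
void * because whether it is really the any/latin1/ucs2/ucs4 union is
exactly the open question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical sketch; these are NOT the actual CPython definitions. */
typedef struct {
    unsigned int state;   /* flags/kind bits; the encoding is an assumption */
    ptrdiff_t length;     /* number of characters */
    long hash;            /* cached hash, -1 if not yet computed */
    wchar_t *wstr;        /* optional wchar_t representation */
    /* character data follows in the same allocation */
} SketchASCIIObject;

typedef struct {
    SketchASCIIObject base;   /* nested struct, so the prefix is shared */
    ptrdiff_t wstr_len;       /* dropped from the ASCII-only variant */
    char *utf8;
    ptrdiff_t utf8_len;
    /* character data follows in the same allocation */
} SketchSingleBlockUnicode;

typedef struct {
    SketchSingleBlockUnicode base;
    void *data;               /* separate block; NULL until the object is ready */
} SketchUnicodeObject;

/* Single-block ASCII variant: the data starts right after the struct. */
char *sketch_ascii_data(SketchASCIIObject *a)
{
    return (char *)(a + 1);
}

/* The full object counts as "ready" only once its data pointer is set. */
int sketch_is_ready(const SketchUnicodeObject *u)
{
    return u->data != NULL;
}

/* Allocate struct and ASCII data as one block; the caller frees with free(). */
SketchASCIIObject *sketch_new_ascii(const char *s)
{
    size_t n = strlen(s);
    SketchASCIIObject *a = malloc(sizeof *a + n + 1);
    if (a == NULL)
        return NULL;
    a->state = 0;
    a->length = (ptrdiff_t)n;
    a->hash = -1;
    a->wstr = NULL;
    memcpy(sketch_ascii_data(a), s, n + 1);   /* copies the trailing NUL too */
    return a;
}
```

Because each struct begins with the previous one, a pointer to the full
object can also be read through the smaller types. The sketch deliberately
says nothing about the element width of the single-block variant.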

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia