[Python-3000] Lazy strings (was Re: Py3k release schedule worries)

Talin talin at acm.org
Sun Dec 31 04:47:27 CET 2006


Guido van Rossum wrote:
> On 12/30/06, Larry Hastings <larry at hastings.org> wrote:
>>
>> On Tue, Dec 19, 2006, Guido van Rossum wrote:
>>> On 12/19/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
>>>> (I haven't abandoned this, but it hasn't been a top priority; partially
>>>> because Larry Hastings work on smarter concatenation has showed that "lazy
>>>> evaluation" can work in today's Python, and partially due to the
>>>> schedule/discussion issues you write about.)
>>>
>>> Now's the time for that work to come out of the closet.
>>
>> At last, I can take a crack at this, now that Christmas is over.  Just
>> double-checking:
>>
>>
>> I know you want it for what are currently "unicode strings"
>> (Objects/unicodeobject.c).  Do you also want it for single-byte strings
>> (Objects/stringobject.c)?
> 
> No, the 8-bit strings will die in Py3k anyway.
> 
>> I should generate my diffs against
>> http://svn.python.org/projects/python/branches/p3yk ?
> 
> Yes, please.
> 
> Thanks for doing this!

Maybe this would be a good time to review, or at least restate, the 
specific plans for strings in Py3K? I know that there's been a great 
deal of discussion on this, but a lot of that discussion took place 
*before* Larry's work (specifically, before a number of people in this 
group drank the lazy-strings Kool-Aid).

I'm specifically concerned about avoiding confusion over the "lazy"
aspect of strings, because two kinds of "laziness" have been
discussed here: lazy string manipulation (slice and join), and lazy
format conversion (8-bit, 16-bit, 32-bit). Both are, I think, desirable.
They are also interrelated, in that the design of one likely affects
the other, so I don't think it makes sense to discuss these issues in
isolation.
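
To make the first kind of laziness concrete, here is a rough Python
sketch of deferred concatenation and slicing. The names LazyConcat and
LazySlice are made up for illustration; this is not Larry's patch and
not any CPython API, just one way the idea could look.

```python
class LazyConcat:
    """Hypothetical sketch: remember the pieces, join only on demand."""

    def __init__(self, *parts):
        self._parts = list(parts)   # pending pieces, not yet joined
        self._value = None          # cached flat string once rendered

    def __add__(self, other):
        # Record the operand instead of copying characters now.
        if self._value is not None:
            return LazyConcat(self._value, other)
        return LazyConcat(*self._parts, other)

    def __len__(self):
        # The length is known without building the flat string.
        return sum(len(p) for p in self._parts)

    def __str__(self):
        # First real use: flatten once and cache the result.
        if self._value is None:
            self._value = "".join(str(p) for p in self._parts)
            self._parts = [self._value]
        return self._value


class LazySlice:
    """Hypothetical sketch: a slice that holds a reference, not a copy."""

    def __init__(self, base, start, stop):
        self._base, self._start, self._stop = base, start, stop

    def __str__(self):
        # Characters are copied only when the slice is actually used.
        return str(self._base)[self._start:self._stop]


s = LazyConcat("ab") + "cd" + "ef"   # no characters copied yet
view = LazySlice(s, 1, 4)            # still nothing copied
print(len(s), str(view))
```

Note that the slice keeps its base string alive, which is one of the
places where the performance expectations would need spelling out.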

Is there a PEP that defines what is going to happen? I am referring
specifically to:

    -- Internal representation of varying-width string encodings
    -- On-the-fly encoding changes
    -- C API changes
    -- String 'views'
    -- Lazy slicing and concatenation
    -- Performance expectations for all of the above
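
As an illustration of why the first two items interact with lazy
concatenation, here is a small Python sketch. It assumes (my assumption,
not a decided design) that the "8-bit" form holds code points below 256
and the "16-bit" form holds code points below 65536; the function names
are hypothetical.

```python
def narrowest_width(text):
    """Bytes per character (1, 2, or 4) needed to store text unencoded."""
    highest = max(map(ord, text), default=0)
    if highest < 0x100:      # assumed range of the 8-bit form
        return 1
    if highest < 0x10000:    # assumed range of the 16-bit form
        return 2
    return 4


def joined_width(a, b):
    """Width a concatenation of a and b would need: the wider of the two."""
    return max(narrowest_width(a), narrowest_width(b))


print(narrowest_width("abc"), joined_width("abc", "\u20ac"))
```

If concatenation is deferred, the widening of the narrower operand could
be deferred along with it, which is why I think the two designs need to
be considered together.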

-- Talin

