[Python-ideas] Why don't CPython strings implement slicing using a view?
Andrew Barnert
abarnert at yahoo.com
Sat May 9 09:31:54 CEST 2015
On May 8, 2015, at 21:04, Nikolaus Rath <Nikolaus at rath.org> wrote:
>
>> On May 07 2015, Steven D'Aprano <steve-iDnA/YwAAsAk+I/owrrOrA at public.gmane.org> wrote:
>> But a view would be harmful in this situation:
>>
>> s = "some string"*1000000
>> t = s[1:2] # a view masquerading as a new string
>> del s
>>
>> Now we keep the entire string alive long after it is needed.
>>
>> How would you solve the first problem without introducing the second?
>
> Keep track of the reference count of the underlying string, and if it
> goes down to one, turn the view into a copy and remove the sliced
> original?
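The lifetime problem quoted above can already be seen with memoryview over bytes, which does implement slicing as a view. This is only a rough sketch: str has no such view type, and the helper name make_view is made up for illustration.

```python
def make_view():
    # Hypothetical helper: build a large buffer and return a 1-byte view of it.
    data = b"some string" * 1000000
    view = memoryview(data)[1:2]   # a view, not a copy
    return view                    # the local name `data` disappears here

view = make_view()

# The view exposes only one byte...
print(bytes(view))       # b'o'
# ...but it still holds a reference to the whole underlying buffer.
print(len(view.obj))     # 11000000
```

Nothing else references the large bytes object after make_view() returns, yet it stays alive for as long as the one-byte view does.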
It sounds like we're talking about an optimization that could have a big benefit in some not-too-rare cases, but could also have a non-negligible cost in extremely common code that people run every day.
For example, today, "line = line.rstrip()" makes a copy of most of the original string, then discards the original string. With this change, the same line of code builds a view referencing most of line, then gets to some not-quite-a-weakref-destructor, which makes the copy and discards both the original string and the view we just built. If line were huge, the small extra alloc, dealloc, and refcount check might be unnoticeable noise, but if line is about 70 chars, as it usually will be, I'd expect a much more noticeable difference. And this is exactly the kind of thing you do in a loop 5 million times in a row in Python.
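As a rough baseline for today's copy-based behavior, something like the following could be timed before and after any such change. It is only a sketch: the line contents, the call count, and the helper name time_rstrip are assumptions for illustration, and it says nothing about what a view-based implementation would cost.

```python
import timeit

# Assumed typical input: a 70-character line ending in a newline.
line = "x" * 69 + "\n"

def time_rstrip(number=1000000):
    """Return the total seconds taken by `number` calls of line.rstrip()."""
    return timeit.timeit("line.rstrip()", globals={"line": line}, number=number)

if __name__ == "__main__":
    calls = 1000000
    elapsed = time_rstrip(calls)
    print("%.0f ns per rstrip() call" % (elapsed / calls * 1e9))
```

Any extra per-call overhead from building and then tearing down a view would show up against this number, which is why short lines are the interesting case.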
Of course I could be wrong; we won't really know until someone actually builds an implementation and tests it.