[pypy-dev] RFC: Copy-on-write list slices

Armin Rigo arigo at tunes.org
Fri Jan 23 15:29:33 CET 2015


Hi Mike,

On 20 January 2015 at 05:26, Mike Kaplinskiy <mike.kaplinskiy at gmail.com> wrote:
> https://bitbucket.org/mikekap/pypy/commits/b774ae0be11b2012852a175f4bae44841343f067
> has an implementation of list slicing that copies the data on write. (The
> third idea from http://doc.pypy.org/en/latest/project-ideas.html .)

One hard bit about implementing this kind of change is making sure you
don't accidentally slow some other kinds of code down.  This requires
running all our benchmarks, at least, as fijal pointed out.

Here are some additional issues to consider.  We have to consider the
overhead in terms of memory usage: it's probably fine if the overhead
is only one pointer per list object (which often points to some fixed
tuple like (0, None, None)).  However, if you do "x = large_list[5:7]"
you keep the whole "large_list" alive as long as "x" is alive.  This
might be an issue for some cases.  Resolving it is possible but more
involved.  It would probably require GC support --- i.e. we can't
really solve this nicely in regular RPython as it is now.  The details
need discussion, but I can think for example about a way to tell the
GC "this is a pointer to the full list, but I'm only going to access
this range of items, so ignore the rest".  Another way would be to
have some callback that copies just the items needed out of the large
list, but that's full of open questions...


A bientôt,

Armin.


More information about the pypy-dev mailing list