Efficient way to break up a list into two pieces
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Sat Feb 20 20:30:05 EST 2010
On Sat, 20 Feb 2010 16:55:10 -0800, marwie wrote:
> Hello,
>
> I recently read about augmented assignments and that (with l1, l2 being
> lists)
>
> l1.extend(l2)
>
> is more efficient than
>
> l1 = l1 + l2
>
> because unnecessary copy operations can be avoided. Now my question is
> if there's a similar thing for breaking a list into two parts. Let's say
> I want to remove from l1 everything from and including position 10 and
> store it in l2. Then I can write
>
> l2 = l1[10:]
> del l1[10:]
>
> But since I'm assigning a slice the elements will be copied.
Yes, list slicing can't make any sort of assumptions about what you're
going to do next, so when you take a slice, it has to copy the data.
> Basically,
> I'm looking for something like l1.pop(10,len(l1)) which returns and
> removes a whole chunk of data. Is there such a thing (and if not, why
> not?)
Such a list pop would have to copy the data too. How else could it work?
What exactly is the problem you are trying to solve?
If you are unhappy that copy-and-remove a slice from a list takes two
lines instead of one ("won't somebody think of the shortage of
newlines!!!" *wink*), then just write a helper function and call that:
def copy_and_remove(alist, slice):
tmp = alist[slice]
del alist[slice]
return tmp
l2 = copy_and_remove(l1, slice(10, None))
If you are unhappy that copying slices from lists copies data, well
that's just the way things are. Don't make the mistake of trying to
optimize your application before you actually know what bits need
optimizing.
Python lists are arrays of pointers to objects, so copying a slice is
fast: it doesn't have to copy the objects, just pointers. Deleting from
the end of the list is also quick, because you don't have to move memory,
just clear some pointers and change the length field.
Splitting such an array without copying data is, essentially, impossible.
Python lists aren't linked lists.
But as I said, most of the time copying slices doesn't matter: the time
taken is unlikely to be a serious factor in practice. If it is (and
you've profiled your code, right? you're not just guessing what you think
is fast and slow?) then there may be ways to optimize your code to avoid
needing to copy slices, but we'd need to know what you're doing with the
data.
--
Steven
More information about the Python-list
mailing list