List Performance

Maric Michaud maric at aristote.info
Mon Jun 30 16:09:55 CEST 2008


Le Monday 30 June 2008 15:52:56 Gerhard Häring, vous avez écrit :
> Larry Bates wrote:
> > [...]
> > So its actually faster to append to a long list than an empty one?  That
> > certainly would not have been intuitively obvious now would it?
>
> Maybe not intuitively, but if you know how dynamically growing data
> structures are implemented, it's plausible. They overallocate, and the
> amount of overallocation is dependent on the current size. Relevant
> source snippet from Python 2.6:
>
>      /* This over-allocates proportional to the list size, making room
>       * for additional growth.  The over-allocation is mild, but is
>       * enough to give linear-time amortized behavior over a long
>       * sequence of appends() in the presence of a poorly-performing
>       * system realloc().
>       * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
>       */
>      new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
>
> If, on the other hand, we knew beforehand how big the list will get
> approximately, we could avoid all these reallocations. No problem with
> Python's C API:
>
> PyAPI_FUNC(PyObject *) PyList_New(Py_ssize_t size);
>
> But you can't do it directly from Python, unless you (ab)use ctypes.
>
> -- Gerhard
>
> --
> http://mail.python.org/mailman/listinfo/python-list

Well, as I posted few days ago, one could envisage, as a pure python 
optimization for dealing with long list, to replace an algorithm with a lot 
of append by something like this :

mark = object()

datas = [ mark ] * expected_size

# working with the datas while maintaining the effective currrently used size

Of course one could even subclass list and redefine __len__, append, and some 
other methods to deal with this "allocated by block" list.


-- 
_____________

Maric Michaud



More information about the Python-list mailing list