O(n^2) is bad - can it be fixed?

Isaac To Kar Keung kkto at csis.hku.hk
Sat May 26 05:51:58 CEST 2001

>>>>> "Tim" == Tim Peters <tim.one at home.com> writes:

    >> The problem is that whatever the mapping, there is a possible case in
    >> which copying and reallocation is done basically everytime.

    Tim> We could, e.g., always round the list size up to the next higher
    Tim> power of 2.  That way the capacity is implicit.  We're unlikely to,
    Tim> though.

No, this does not help the situation when the size of the list goes
repeatedly below and above the threshold.

    Tim> Except that it rarely happens at all -- don't overblow this.

Yes, it rarely happens.  But is the solution really so difficult that we
have any excuse not doing it right?

    Tim> The third possiblity is the one you're not ready for yet: Python is
    Tim> neither a C library nor an OS, neither is it obligated to provide
    Tim> theoretically optimal performance in every conceivable case.

I see this as a fundamental design problem: it ties the conceptual list to
physical memory allocation.  The user should have their own right to say
that the list is to step between 397 and 398 elements (or whatever number
they are in mind), and the performance of the library shouldn't be in
completely different orders (e.g., from constant time, linear space to
linear time, quadratic space) in such cases.  This would make Python
completely useless.

    Tim> Python lists are wonderfully zippy and compact as-is, and doing
    Tim> anything that hurts either of those would be a net loss for the
    Tim> vast majority of Python users.

The answer is far from the difficulty you've implied.  Lists normally
contain some hundreds of bytes, and adding 4 bytes there is not that much
for anybody.  Extra space allocation to the list itself only happens on on
PyMem_REALLOC, not if you have initially created the list of the right size.
If we ever start doing realloc, it is likely that we continue reallocating
anyway.  And we can have a compilation flag that activate the structure that
don't do geometric scaling by themselves.  I don't see how this can possibly
"hurt the vast majority of Python users".


More information about the Python-list mailing list