[Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

Tue Mar 7 19:44:36 EST 2017

On Tue, Mar 07, 2017 at 09:10:00PM +0000, Elliot Gorokhovsky wrote:

> I think the bigger problem, though, is that most list use does *not*
> involve sorting, 

There are other uses for a type hint apart from sorting. There may be 
optimized versions of other functions (sum, math.fsum, ...) and list 
methods (e.g. count, remove), anything that has to walk the list 
comparing items).

All very hypothetical at the moment, I admit it...

> so it would be a shame to impose the non-trivial overhead
> of type-checking on *all* list use.

You might be right, but my guess is that the overhead isn't quite as big 
as you may think, at least for some operations. The type-check itself is 
just a pointer compare, not an issubclass() operation. The types are 
either identical, or the list is hetrogeneous.

True, `mylist[i] = x` should be a quick assignment into an array of 
pointers, but there's a bounds check to ensure i is within the correct 
range. Other methods are even more expensive: `mylist[i:j] = [a...z]` 
has to move list items around, possibly allocate more memory, and update 
the list length, compared to that the scan of a...z is probably minimal. 
`mylist.append(x)` is *usually* a simple assignment into the array (plus 
updating the list length) but every now and again it triggers a resize, 
which is costly.

I don't think we can predict whether this is a nett gain or loss just 
from first principles.

> Anyway, my patch could always be a precursor to a more general optimization
> along these lines.

Indeed! Even if nothing else comes of this than your patch, thank you!

-- 
Steve