[Python-Dev] speeding up list append calls

Wed Sep 14 19:34:09 CEST 2005

At 06:55 PM 9/14/2005 +0200, Martin v. Löwis wrote:
>Neal Norwitz wrote:
> > This code doesn't really work in general.  It assumes that any append
> > function call is a list method, which is obviously invalid.  But if a
> > variable is known to be a list (i.e., local and assigned as a list
> > (BUILD_LIST) or a list comprehension), could we do something like this
> > as a peephole optimization?
>
>Alternatively, couldn't LIST_APPEND check that this really is a list,
>and, if it isn't, fall back to PyObject_CallMethod?

That's an interesting idea. The opcodes for some math operators check
whether the operands are integers and then take a fast path, so this would
be sort of the reverse: a specialized opcode with a generic fallback.
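
To make the suggestion concrete, here is a rough Python model of such a
guarded opcode. It is only a sketch: list_append and Doubling are
hypothetical names, and the real check would live in the C eval loop.

```python
def list_append(target, item):
    """Hypothetical model of a LIST_APPEND-style opcode with a type check."""
    if type(target) is list:
        # Fast path: an exact list, so call the list method directly.
        list.append(target, item)
    else:
        # Fallback: ordinary attribute lookup and call on the object.
        target.append(item)


class Doubling(list):
    """A list subclass that overrides append, so it must take the fallback."""

    def append(self, item):
        super().append(item * 2)


plain, custom = [], Doubling()
list_append(plain, 3)
list_append(custom, 3)
print(plain, custom)  # prints: [3] [6]
```

The exact-type test (rather than isinstance) is what keeps subclasses that
override append working.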

>Not sure which .append call is performed most frequently, but the
>traditional trick of caching x.append in a local variable might give
>you most of the speedup.
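
For reference, the trick looks like this. The function names are made up
for illustration, and the timings depend heavily on the interpreter
version, so no numbers are claimed here.

```python
import timeit


def append_plain(n):
    result = []
    for i in range(n):
        result.append(i)        # attribute lookup on every iteration
    return result


def append_cached(n):
    result = []
    append = result.append      # look the bound method up once
    for i in range(n):
        append(i)               # plain local-variable call
    return result


if __name__ == "__main__":
    for fn in (append_plain, append_cached):
        seconds = timeit.timeit(lambda: fn(100_000), number=20)
        print(f"{fn.__name__}: {seconds:.3f}s")
```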

Maybe the VM could actually do that caching for you, if we had a
CALL_METHOD opcode that cached the attribute in a local variable if it was
a C method or Python instance method, and reused the cached attribute as
long as the target object was of the same type/class as in the last
invocation at that point.  I think this is called a "polymorphic inline
cache", although it doesn't seem very polymorphic if you're only caching
one type.  :)
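
A minimal Python sketch of a one-entry cache of this kind, assuming one
cache object per call site. CallSite and CACHEABLE are hypothetical names,
not an existing API, and the sketch does not detect later changes to the
class.

```python
import types

# Class attributes this sketch will cache: plain Python functions and
# C-level method descriptors such as list.append.
CACHEABLE = (types.FunctionType, types.MethodDescriptorType)


class CallSite:
    """Hypothetical one-entry cache for a single obj.<name>(...) call site."""

    def __init__(self, name):
        self.name = name
        self.cached_type = None
        self.cached_func = None
        self.hits = 0
        self.misses = 0

    def _find_on_class(self, tp):
        # Search the class dictionaries only, in MRO order.
        for klass in tp.__mro__:
            if self.name in vars(klass):
                return vars(klass)[self.name]
        return None

    def call(self, obj, *args):
        # An instance attribute shadows a method, so never use the cache then.
        if self.name in getattr(obj, "__dict__", ()):
            return getattr(obj, self.name)(*args)
        tp = type(obj)
        if tp is self.cached_type:
            self.hits += 1
            return self.cached_func(obj, *args)   # lookup skipped
        self.misses += 1
        func = self._find_on_class(tp)
        if isinstance(func, CACHEABLE):
            self.cached_type, self.cached_func = tp, func
            return func(obj, *args)
        return getattr(obj, self.name)(*args)     # uncached fallback


site = CallSite("append")
items = []
for i in range(3):
    site.call(items, i)
print(items, site.hits, site.misses)  # prints: [0, 1, 2] 2 1
```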

The downside, alas, would be that modifying the class or using dynamically
generated methods would break this, unless there was some sort of "version 
counter" on the class that could also be checked, and the caching mechanism 
only cached in the first place if the attribute was found in a class 
dictionary.  Or perhaps the same mechanism that notifies subclasses of 
changes to the class could be used to notify frame objects of cache 
invalidation.
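
The version-counter variant could be modelled roughly as below. This is a
sketch under loud assumptions: Versioned and VersionedCallSite are
hypothetical, the counter is kept by a metaclass rather than by the
interpreter, only classes using that metaclass are supported, and instance
attributes that shadow a method are ignored for brevity.

```python
import types


class Versioned(type):
    """Hypothetical metaclass that counts mutations of a class."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        type.__setattr__(cls, "_version", 0)       # set without bumping

    def __setattr__(cls, key, value):
        super().__setattr__(key, value)
        cls._bump()

    def __delattr__(cls, key):
        super().__delattr__(key)
        cls._bump()

    def _bump(cls):
        type.__setattr__(cls, "_version", cls._version + 1)
        for sub in cls.__subclasses__():           # subclasses see the change too
            sub._bump()


class VersionedCallSite:
    """Hypothetical cache entry keyed on (type, version counter)."""

    def __init__(self, name):
        self.name = name
        self.key = None
        self.func = None
        self.misses = 0

    def call(self, obj, *args):
        tp = type(obj)
        key = (tp, tp._version)
        if key == self.key:
            return self.func(obj, *args)           # entry still valid
        self.misses += 1
        for klass in tp.__mro__:
            if self.name in vars(klass):
                found = vars(klass)[self.name]
                if isinstance(found, types.FunctionType):
                    # Cache only what was found in a class dictionary.
                    self.key, self.func = key, found
                    return found(obj, *args)
                break
        return getattr(obj, self.name)(*args)      # uncached fallback
```

Assigning a new method on the class (or on a base class) changes the
counter, so the next call through the site misses and refetches the method.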

The interesting question, of course, is whether all the extra complexity 
would be worth it.  I think other polymorphic inline caches actually cache 
per call site, not per frame invocation, so it might be that the code 
object would actually be the place to cache this, allowing the program as a 
whole to gain.
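
To illustrate the difference in cache lifetime, here is a small sketch.
OneEntryCache, SHARED and both fill functions are hypothetical; the local
cache stands in for one stored on the frame, and the module-level cache
stands in for one stored on the code object.

```python
class OneEntryCache:
    """Hypothetical cache remembering the method for the last type seen."""

    def __init__(self):
        self.tp = None
        self.func = None
        self.misses = 0

    def call(self, obj, name, *args):
        tp = type(obj)
        if tp is not self.tp:
            self.misses += 1
            self.tp, self.func = tp, getattr(tp, name)
        return self.func(obj, *args)


def fill_per_invocation(n, miss_log):
    cache = OneEntryCache()      # models a cache that dies with the frame
    out = []
    for i in range(n):
        cache.call(out, "append", i)
    miss_log.append(cache.misses)
    return out


SHARED = OneEntryCache()         # models a cache kept on the code object


def fill_per_call_site(n):
    out = []
    for i in range(n):
        SHARED.call(out, "append", i)
    return out


miss_log = []
for _ in range(3):
    fill_per_invocation(4, miss_log)
    fill_per_call_site(4)
print(sum(miss_log), SHARED.misses)  # prints: 3 1
```

The per-invocation cache pays one miss on every call of the function,
while the shared one pays a single miss for the whole run.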