[Python-Dev] speeding up list append calls
Phillip J. Eby
pje at telecommunity.com
Wed Sep 14 19:34:09 CEST 2005
At 06:55 PM 9/14/2005 +0200, Martin v. Löwis wrote:
>Neal Norwitz wrote:
> > This code doesn't really work in general. It assumes that any append
> > function call is a list method, which is obviously invalid. But if a
> > variable is known to be a list (ie, local and assigned as list
> > (BUILD_LIST) or a list comprehension), could we do something like this
> > as a peephole optimization?
>Alternatively, couldn't LIST_APPEND check that this really is a list,
>and, if it isn't, fall back to PyObject_CallMethod?
That's an interesting idea - the opcodes for some math operators check if
the operands are integers and then have a fast path, so this would sort of
be the reverse.
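
As a rough illustration of the check-then-fall-back idea, here is a pure-Python model of what such an opcode might do. The function name list_append_op is hypothetical, and the real thing would live in C inside the eval loop; this is only a sketch of the control flow, assuming an exact type check is used so that list subclasses overriding append still get their own method.

```python
def list_append_op(target, value):
    """Hypothetical model of a LIST_APPEND that verifies its operand."""
    if type(target) is list:
        # Fast path: exactly a list, so call the list implementation directly.
        list.append(target, value)
    else:
        # Slow path: ordinary attribute lookup and call, standing in for
        # what PyObject_CallMethod would do at the C level.
        target.append(value)


class Doubling(list):
    """A list subclass whose append must not be bypassed."""
    def append(self, value):
        list.append(self, value * 2)


plain = []
list_append_op(plain, 1)       # takes the fast path

doubled = Doubling()
list_append_op(doubled, 3)     # takes the slow path, so the override runs
```

The exact-type test (rather than isinstance) is the design choice that keeps the fast path safe: anything that is not precisely a list goes through the normal method lookup.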
>Not sure which .append call is performed most frequently, but the
>traditional trick of caching x.append in a local variable might give
>you most of the speedup.
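
For readers who have not seen the trick, a minimal sketch follows. The function names are made up for illustration, and whether the cached form is measurably faster depends on the interpreter version, so no timings are claimed here.

```python
def squares_plain(n):
    result = []
    for i in range(n):
        # The attribute lookup result.append happens on every iteration.
        result.append(i * i)
    return result


def squares_cached(n):
    result = []
    # Look up the bound method once and keep it in a local variable.
    append = result.append
    for i in range(n):
        append(i * i)
    return result
```

Both functions build the same list; the second only moves the attribute lookup out of the loop.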
Maybe the VM could actually do that caching for you, if we had a
CALL_METHOD opcode that cached the attribute in a local variable if it was
a C method or Python instance method, and reused the cached attribute as
long as the target object is of the same type/class as the last invocation
at that point. I think this is called a "polymorphic inline cache",
although it doesn't seem very polymorphic if you're only caching one type. :)
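
To make the idea concrete, here is a small pure-Python model of a single-entry cache attached to one call site. CallSiteCache and its hits/misses counters are hypothetical names invented for this sketch; a real CALL_METHOD opcode would do this in C. It also assumes the method is a plain function or C method found on the type, and ignores instance attributes, classmethods and staticmethods.

```python
class CallSiteCache:
    """Hypothetical single-entry method cache for one call site."""

    def __init__(self, name):
        self.name = name
        self.cached_type = None
        self.cached_func = None
        self.hits = 0
        self.misses = 0

    def call(self, obj, *args):
        tp = type(obj)
        if tp is self.cached_type:
            # Same type as the last invocation: reuse the cached attribute.
            self.hits += 1
            return self.cached_func(obj, *args)
        # Different type (or first call): look the method up on the type
        # and remember it for next time.
        self.misses += 1
        func = getattr(tp, self.name)
        self.cached_type = tp
        self.cached_func = func
        return func(obj, *args)


site = CallSiteCache("append")
items = []
for i in range(3):
    site.call(items, i)    # one miss, then two hits
```

Note that this version has no defence against the class being modified after the method is cached.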
The downside, alas, would be that modifying the class, or using dynamically
generated methods, would break this, unless there were some sort of "version
counter" on the class that could also be checked, and the caching mechanism
only cached in the first place if the attribute was found in a class
dictionary. Or perhaps the same mechanism that notifies subclasses of
changes to the class could be used to notify frame objects of cache
invalidation.
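
A version counter of this kind can be modelled in pure Python with a metaclass. Everything named here (VersionedMeta, _version, VersionedCallSite) is hypothetical and exists only for this sketch; it assumes the cache is filled only when the attribute is found in the class's own dictionary, and it does not track changes made to base classes.

```python
class VersionedMeta(type):
    """Hypothetical metaclass that bumps a counter on every class change."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Bypass our own __setattr__ so the initial value does not count.
        type.__setattr__(cls, "_version", 0)

    def __setattr__(cls, name, value):
        type.__setattr__(cls, name, value)
        type.__setattr__(cls, "_version", cls._version + 1)

    def __delattr__(cls, name):
        type.__delattr__(cls, name)
        type.__setattr__(cls, "_version", cls._version + 1)


class VersionedCallSite:
    """Hypothetical call-site cache keyed on (type, version)."""

    def __init__(self, name):
        self.name = name
        self.key = None
        self.func = None

    def call(self, obj, *args):
        tp = type(obj)
        key = (tp, tp._version)
        if key != self.key:
            # Cache only if the attribute lives in the class dictionary.
            func = vars(tp).get(self.name)
            if func is None:
                # Not cacheable: fall back to a normal lookup and call.
                return getattr(obj, self.name)(*args)
            self.key, self.func = key, func
        return self.func(obj, *args)


class Greeter(metaclass=VersionedMeta):
    def greet(self):
        return "hello"


site = VersionedCallSite("greet")
g = Greeter()
first = site.call(g)                  # fills the cache
Greeter.greet = lambda self: "hi"     # bumps Greeter._version
second = site.call(g)                 # stale key, so the method is re-fetched
```

The cost is one extra comparison per call; the notification approach mentioned above would instead push invalidations to the caches when the class changes.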
The interesting question, of course, is whether all the extra complexity
would be worth it. I think other polymorphic inline caches actually cache
per call site, not per frame invocation, so it might be that the code
object would actually be the place to cache this, allowing the program as a
whole to gain.