[Python-Dev] speeding up PyObject_GetItem

Daniel Stutzbach daniel at stutzbachenterprises.com
Mon Mar 23 17:23:43 CET 2009


In Python 2.5, list objects were special-cased to skip PyObject_GetItem and
go straight to PyList_GET_ITEM.  That special case gave made other sequences
20% slower than lists for getitem operations.  The special case was removed
in Python 3 (haven't checked 2.6).

Today I was tracing through how PyObject_GetItem works, trying to figure it
why it's so much slower, to see if we can get back some of that speed
without special-casing just one type.  Below is a rough sketch of what
PyObject_GetItem does for the common case of a sequence with a valid
positive integer parameter, and my observations on where there is room for
improvement.  I'm willing to write patches and test them, but I wanted to
get some general feedback first.  After all, it may be the way it is for A
Very Good Reason that I'm not aware of. ;-)

The code paths for PyObject_SetItem and PyObject_DelItem are essentially the
same, so any speed improvments to one could easily be applied to the other
two operations.

PyObject_GetItem ob i
    ob==NULL                      # can't be null anyway
    i==NULL                         # can't be null anyway
    is mapping? ob
    is sequence? ob
    PyIndex_Check i
        tp_as_number i
        tp_flags i
        tp_as_number->nb_index i
    PyNumber_AsSsize_t i
        PyNumber_Index i
            i==NULL                 # still can't be null
            PyLong_Check i
        i==NULL                   # still can't be null
        PyLong_Check i                 # redundant
        => v
    v == -1
    PySequence_GetItem ob v
        ob==NULL                  # still can't be null
        is sequence? ob                # redundant
        sq_item? ob
        sq_item ob v

I think there are two avenues for speed improvement, both of which involve
removing redundancy:

1) Assume the index is a PyLong until proven otherwise

The PyIndex_Check in PyObject_GetItem looks pretty useless.  If it returns
false, PyObject_GetItem throws a type error.  If we skipped the
PyIndex_Check when it would have returned false, PyNumber_AsSsize_t would
later call PyIndex_Check and throw a type error.  Unless I'm missing
something, this is a clear win and makes the code simpler.

In PyNumber_AsSsize_t, we could speed up the common case by trying
PyLong_Check first, and if it fails then calling PyNumber_Index.  This
change would make the common case faster, but make the code a few lines
longer.

2) Remove some of the exessive checking for NULL unless Py_DEBUG is defined

Many of the routines in abstract.c check their parameters for NULL, as a
sanity check, and throw a SystemError if NULL is detected.  I'm tempted to
suggest making these checks only when Py_DEBUG is defined, but I suspect if
you wanted it that way, you would have changed it already. ;)

Assuming you really want the NULL checks in production Python, there are 5
checks for NULL even though there are only two parameters.  Seems like
overkill?

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20090323/25ab8055/attachment-0001.htm>


More information about the Python-Dev mailing list