[Python-Dev] speeding up PyObject_GetItem
Daniel Stutzbach
daniel at stutzbachenterprises.com
Mon Mar 23 17:23:43 CET 2009
In Python 2.5, list objects were special-cased to skip PyObject_GetItem and
go straight to PyList_GET_ITEM. That special case gave made other sequences
20% slower than lists for getitem operations. The special case was removed
in Python 3 (haven't checked 2.6).
Today I was tracing through how PyObject_GetItem works, trying to figure it
why it's so much slower, to see if we can get back some of that speed
without special-casing just one type. Below is a rough sketch of what
PyObject_GetItem does for the common case of a sequence with a valid
positive integer parameter, and my observations on where there is room for
improvement. I'm willing to write patches and test them, but I wanted to
get some general feedback first. After all, it may be the way it is for A
Very Good Reason that I'm not aware of. ;-)
The code paths for PyObject_SetItem and PyObject_DelItem are essentially the
same, so any speed improvments to one could easily be applied to the other
two operations.
PyObject_GetItem ob i
ob==NULL # can't be null anyway
i==NULL # can't be null anyway
is mapping? ob
is sequence? ob
PyIndex_Check i
tp_as_number i
tp_flags i
tp_as_number->nb_index i
PyNumber_AsSsize_t i
PyNumber_Index i
i==NULL # still can't be null
PyLong_Check i
i==NULL # still can't be null
PyLong_Check i # redundant
=> v
v == -1
PySequence_GetItem ob v
ob==NULL # still can't be null
is sequence? ob # redundant
sq_item? ob
sq_item ob v
I think there are two avenues for speed improvement, both of which involve
removing redundancy:
1) Assume the index is a PyLong until proven otherwise
The PyIndex_Check in PyObject_GetItem looks pretty useless. If it returns
false, PyObject_GetItem throws a type error. If we skipped the
PyIndex_Check when it would have returned false, PyNumber_AsSsize_t would
later call PyIndex_Check and throw a type error. Unless I'm missing
something, this is a clear win and makes the code simpler.
In PyNumber_AsSsize_t, we could speed up the common case by trying
PyLong_Check first, and if it fails then calling PyNumber_Index. This
change would make the common case faster, but make the code a few lines
longer.
2) Remove some of the exessive checking for NULL unless Py_DEBUG is defined
Many of the routines in abstract.c check their parameters for NULL, as a
sanity check, and throw a SystemError if NULL is detected. I'm tempted to
suggest making these checks only when Py_DEBUG is defined, but I suspect if
you wanted it that way, you would have changed it already. ;)
Assuming you really want the NULL checks in production Python, there are 5
checks for NULL even though there are only two parameters. Seems like
overkill?
--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20090323/25ab8055/attachment-0001.htm>
More information about the Python-Dev
mailing list