[New-bugs-announce] [issue23553] Reduce the number of comparison for range checking.

Raymond Hettinger report at bugs.python.org
Sun Mar 1 01:50:38 CET 2015

New submission from Raymond Hettinger:

Python's core is full of bound checks like this one in Objects/listobject.c:

static PyObject *
list_item(PyListObject *a, Py_ssize_t i)
    if (i < 0 || i >= Py_SIZE(a)) {

Abner Fog's high-level language optimization guide,  http://www.agner.org/optimize/optimizing_cpp.pdf in section 14.2 Bounds Checking, shows a way to fold this into a single check:

-    if (i < 0 || i >= Py_SIZE(a)) {
+    if ((unsigned)i >= (unsigned)(Py_SIZE(a))) {
         if (indexerr == NULL) {
             indexerr = PyUnicode_FromString(
                 "list index out of range");

The old generated assembly code looks like this:

    subq    $8, %rsp
    testq   %rsi, %rsi
    js  L227
    cmpq    16(%rdi), %rsi
    jl  L228
    ... <error reporting and exit > ...
    movq    24(%rdi), %rax
    movq    (%rax,%rsi,8), %rax
    addq    $1, (%rax)
    addq    $8, %rsp

The new disassembly looks like this:

    cmpl    %esi, 16(%rdi)
    ja  L227
    ... <error reporting and exit > ...
    movq    24(%rdi), %rax
    movq    (%rax,%rsi,8), %rax
    addq    $1, (%rax)

Note, the new code not only saves a comparison/conditional-jump pair, it also avoids the need to adjust %rsp on the way in and the way out for a net savings of four instructions along the critical path.

When we have good branch prediction, the current approach is very low cost; however, Abner Fog's recommendation is never more expensive, is sometimes cheaper, saves a possible misprediction, and reduces the total code generated.  All in all, it is a net win.

I recommend we put in a macro of some sort so that this optimization gets expressed exactly once in the code and so that it has a good clear name with an explanation of what it does.

components: Interpreter Core
messages: 236928
nosy: rhettinger
priority: normal
severity: normal
status: open
title: Reduce the number of comparison for range checking.
type: performance
versions: Python 3.5

Python tracker <report at bugs.python.org>

More information about the New-bugs-announce mailing list