[Python-ideas] Why CPython is still behind in performance for some widely used patterns?

Pau Freixes pfreixes at gmail.com
Fri Jan 26 16:35:46 EST 2018


This mail is the consequence of a true story, a story where CPython
got defeated by JavaScript, Java, C# and Go.

One of the teams at the company where I'm working had a kind of
benchmark to compare the different languages on top of their
respective "official" web servers such as Node.js, Aiohttp, Dropwizard
and so on.  The test itself was pretty simple and tried to exercise the
happy path of the logic: a piece of code that fetches N rules from
another system and then applies them to X whatevers, also fetched from
another system, something like this:

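def filter(rule, whatever):
    # Note: this name shadows the builtin filter(); kept as in the benchmark.
    return rule.x in whatever.x

def count_matches():
    rules = get_rules()          # fetched from another system
    whatevers = get_whatevers()  # fetched from another system
    cnt = 0
    for rule in rules:
        for whatever in whatevers:
            if filter(rule, whatever):
                cnt = cnt + 1
    return cnt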

Python was almost 10 times slower than the other languages. It's true
that they didn't optimize the code, but they didn't optimize it for any
of the languages, so all of them paid the same cost in terms of
iterations.

Once I saw the code I proposed a pair of changes: removing the call to
the filter function by inlining it, and caching the rule's attribute in
a local variable, something like this:

cnt = 0
for rule in rules:
    x = rule.x  # cache the attribute once per rule
    for whatever in whatevers:
        if x in whatever.x:  # filter() inlined
            cnt += 1

CPython's performance improved by 3x to 4x just by doing these "silly" things.
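
To make the comparison reproducible, here is a minimal sketch using
timeit. The Rule and Whatever classes, the synthetic data sizes and the
function names are assumptions for illustration only, not the real
benchmark code, and the ratio measured will vary with the machine and
the CPython version.

```python
import timeit


class Rule:
    """Hypothetical stand-in for a rule fetched from another system."""

    def __init__(self, x):
        self.x = x


class Whatever:
    """Hypothetical stand-in for a 'whatever'; x is a set of values."""

    def __init__(self, x):
        self.x = x


def filter_(rule, whatever):
    # Same logic as the benchmark's filter(), renamed to avoid the builtin.
    return rule.x in whatever.x


def count_with_call(rules, whatevers):
    # Original shape: one function call and two attribute reads per pair.
    cnt = 0
    for rule in rules:
        for whatever in whatevers:
            if filter_(rule, whatever):
                cnt = cnt + 1
    return cnt


def count_inlined(rules, whatevers):
    # Proposed shape: no call, rule.x read once per rule.
    cnt = 0
    for rule in rules:
        x = rule.x
        for whatever in whatevers:
            if x in whatever.x:
                cnt += 1
    return cnt


# Synthetic data (assumption): 100 rules against 1000 whatevers.
rules = [Rule(i) for i in range(100)]
whatevers = [Whatever({i % 50, (i + 1) % 50}) for i in range(1000)]

if __name__ == "__main__":
    for fn in (count_with_call, count_inlined):
        elapsed = timeit.timeit(lambda: fn(rules, whatevers), number=10)
        print(f"{fn.__name__}: {elapsed:.3f}s for 10 runs")
```

Both functions return the same count, so any difference in the timings
comes only from the call and the repeated attribute lookup.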

The case of the rule cache is IMHO very striking: we have plenty of
examples in many repositories where caching non-local variables is a
widely used pattern, so why hasn't a way to do it implicitly and by
default been considered?
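
To make the number of lookups visible, here is a small sketch that
counts how often the attribute is read in each version. CountingRule,
uncached, cached and the sample data are hypothetical names introduced
for illustration; the property exists only to make the reads
observable.

```python
class CountingRule:
    """Hypothetical rule whose attribute x counts how often it is read."""

    def __init__(self, x):
        self._x = x
        self.reads = 0

    @property
    def x(self):
        self.reads += 1
        return self._x


def uncached(rules, items):
    cnt = 0
    for rule in rules:
        for item in items:
            if rule.x in item:  # attribute read on every inner iteration
                cnt += 1
    return cnt


def cached(rules, items):
    cnt = 0
    for rule in rules:
        x = rule.x  # attribute read once per rule
        for item in items:
            if x in item:
                cnt += 1
    return cnt


items = [{1, 2}, {2, 3}, {3, 4}]

rules = [CountingRule(2), CountingRule(5)]
matches_uncached = uncached(rules, items)
reads_uncached = sum(r.reads for r in rules)  # one read per (rule, item) pair

rules = [CountingRule(2), CountingRule(5)]
matches_cached = cached(rules, items)
reads_cached = sum(r.reads for r in rules)  # one read per rule
```

With 2 rules and 3 items the uncached loop reads the attribute 6 times
and the cached loop reads it 2 times, while both find the same matches.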

The slowness of function calls in CPython is a recurring topic, and it
looks like it is still an unsolved problem.

I'm sure I'm missing many things, and I do not have all of the
information. The purpose of this mail is to gather the information that
might help me understand why we, CPython, are where we are regarding
these two slow patterns.

This could be considered an unimportant thing, but it's more relevant
than one might expect, at least IMHO. If the code that you write by
default in a language is slow, and an alternative exists that makes it
faster, the language is doing something wrong.

BTW: PyPy looks like it is immune [1]

[1] https://gist.github.com/pfreixes/d60d00761093c3bdaf29da025a004582
