
Hi,

Thanks to all of you for your responses, and for the points of view and the information you shared to back up your rationales. I've had time to visit a few of the links, but I will try to find the time to review all of them. It's hard to keep the discussion organized by replying to each response individually, so if you don't mind I'll do it in just this email. If you believe that I'm missing something important, shoot.

First of all, my fault for starting the discussion on the language-battle side; this didn't help focus the conversation on the point that I wanted to discuss. The intention was to raise two use cases, both with a performance cost that can be explicitly circumvented by the developer, and both, let's say, well known by the community.

Correct me if I'm wrong, but most of you argue that the proper Zen of Python - can we call it mutability [1], as Victor pointed out? - which allows the user the freedom to mutate objects at runtime, goes in the opposite direction of allowing the *compiler* to emit optimized code. Or, more specifically for ceval - the *interpreter*? - to apply hacks that would help reduce the footprint of some operations.

I'm wondering if a solution might be to have something like [2], but for generic attributes. Would that be possible? Has it been discussed before? Is there any red flag that you think would make a well-balanced solution too complicated?

Regarding the cost of calling a function, which I guess is not related to the previous stuff: what is the impediment right now to making it faster?

[1] https://faster-cpython.readthedocs.io/mutable.html
[2] https://bugs.python.org/issue28158

On Fri, Jan 26, 2018 at 10:35 PM, Pau Freixes <pfreixes@gmail.com> wrote:
Hi,
This mail is the consequence of a true story, a story where CPython got defeated by JavaScript, Java, C# and Go.
One of the teams at the company where I'm working had a kind of benchmark to compare the different languages on top of their respective "official" web servers, such as Node.js, Aiohttp, Dropwizard and so on. The test itself was pretty simple and tried to exercise the happy path of the logic: a piece of code that fetches N rules from one system and then applies them to X whatevers fetched from another system, something like this:
def filter(rule, whatever):
    if rule.x in whatever.x:
        return True

rules = get_rules()
whatevers = get_whatevers()
for rule in rules:
    for whatever in whatevers:
        if filter(rule, whatever):
            cnt = cnt + 1
return cnt
The performance of Python was almost 10x slower than the other languages. It's true that they didn't optimize the code, but they didn't do so for any of the languages either, so all of them had the same cost in terms of iterations.
Once I saw the code, I proposed a pair of changes: remove the call to the filter function, making it "inline", and cache the rule's attribute, something like this:
for rule in rules:
    x = rule.x
    for whatever in whatevers:
        if x in whatever.x:
            cnt += 1
The performance of CPython improved by 3x/4x just by doing these "silly" things.
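For anyone who wants to reproduce the effect, here is a minimal self-contained sketch using `timeit`; the `SimpleNamespace` objects and the data are stand-ins for the real rules and whatevers, which were fetched from other systems:

```python
import timeit
from types import SimpleNamespace

# Stand-in data; the real benchmark fetched these from other systems.
rules = [SimpleNamespace(x="a") for _ in range(100)]
whatevers = [SimpleNamespace(x="abc") for _ in range(100)]

def filter(rule, whatever):
    if rule.x in whatever.x:
        return True

def naive():
    cnt = 0
    for rule in rules:
        for whatever in whatevers:
            if filter(rule, whatever):
                cnt += 1
    return cnt

def optimized():
    cnt = 0
    for rule in rules:
        x = rule.x  # cache the attribute lookup outside the inner loop
        for whatever in whatevers:
            if x in whatever.x:  # inlined: no function call per pair
                cnt += 1
    return cnt

assert naive() == optimized()
print("naive:    ", timeit.timeit(naive, number=100))
print("optimized:", timeit.timeit(optimized, number=100))
```

The absolute numbers depend on the machine and interpreter, but the gap between the two variants should be clearly visible on CPython.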
The case of the rule cache is, IMHO, very striking: we have plenty of examples in many repositories where caching non-local variables is a widely used pattern. Why hasn't a way to do it implicitly and by default been considered?
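For concreteness, a minimal sketch of that hand-applied pattern (the function name and data here are hypothetical): binding a frequently used global or builtin to a local name before a hot loop, so the loop body pays the cheaper local-variable lookup instead of a global lookup on every iteration:

```python
def count_longer_than(items, limit):
    _len = len  # cache the builtin as a local: LOAD_FAST instead of LOAD_GLOBAL
    cnt = 0
    for item in items:
        if _len(item) > limit:
            cnt += 1
    return cnt

print(count_longer_than(["a", "abc", "abcd"], 2))
```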
The slowness of calling functions in CPython is a quite recurrent topic, and it looks like an unsolved problem.
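The call overhead is easy to measure with a micro-benchmark like the one below (a rough sketch; the function names are made up, and the absolute numbers depend on the interpreter and machine):

```python
import timeit

def add(a, b):
    return a + b

def with_call():
    total = 0
    for i in range(10000):
        total = add(total, i)  # one function call per iteration
    return total

def inlined():
    total = 0
    for i in range(10000):
        total = total + i  # same arithmetic, no call overhead
    return total

assert with_call() == inlined()
print("with call:", timeit.timeit(with_call, number=100))
print("inlined:  ", timeit.timeit(inlined, number=100))
```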
Sure, I'm missing many things and I do not have all of the information. With this mail I want to gather the information that might help me understand why we are here - CPython - regarding these two slow patterns.
This could be considered unimportant, but it's more relevant than one might expect, at least IMHO. If the default code that you write in a language is slow by default, and an alternative exists to make it faster, the language is doing something wrong.
BTW: PyPy looks like it is immune [1]
[1] https://gist.github.com/pfreixes/d60d00761093c3bdaf29da025a004582 -- --pau
-- --pau