[Python-ideas] Why CPython is still behind in performance for some widely used patterns ?

Chris Barker chris.barker at noaa.gov
Fri Jan 26 17:18:53 EST 2018


If there are robust and simple optimizations that can be added to CPython,
great, but:

> This mail is the consequence of a true story, a story where CPython
> got defeated by Javascript, Java, C# and Go.
>

at least those last three are statically typed languages whose compilers
(ahead-of-time or JIT) can exploit those types -- they are going to be
faster than Python for this sort of thing -- particularly for code written
in a non-pythonic style...

> def filter(rule, whatever):
>     if rule.x in whatever.x:
>         return True
>
> rules = get_rules()
> whatevers = get_whatevers()
> for rule in rules:
>     for whatever in whatevers:
>         if filter(rule, whatever):
>             cnt = cnt + 1
>
> return cnt
>
>  It's true that they didn't optimize the code, but
> they did not for any language having for all of them the same cost in
> terms of iterations.
>

sure, but I would argue that you do need to write code in a clean style
appropriate for the language at hand.

For instance, the above creates a function that is a simple one-liner --
there is no reason to do that, and the fact that function calls have
significant overhead in Python is going to bite you.

> for rule in rules:
>     x = rule.x
>     for whatever in whatevers:
>         if x in whatever.x:
>             cnt += 1
>
> The performance of the CPython boosted x3/x4 just doing these "silly"
> things.
>

"inlining" the filter call makes the code more pythonic and readable --
a no-brainer. I wouldn't call that an optimization.

making rule.x local is an optimization -- that is, the only reason you'd do
it is to make the code go faster. how much difference did that really make?
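One way to answer that question is to time the three variants separately.
The following is only a sketch: the data (SimpleNamespace objects holding
small lists) is a made-up stand-in for the original poster's rules and
whatevers, and the helper is renamed matches() so it does not shadow the
builtin filter(). The numbers will vary by machine and Python version.

```python
import timeit
from types import SimpleNamespace

# Hypothetical stand-ins for the original poster's data.
rules = [SimpleNamespace(x=i) for i in range(50)]
whatevers = [SimpleNamespace(x=[i, i + 1, i + 2]) for i in range(50)]


def matches(rule, whatever):
    # One-line helper, as in the quoted code.
    return rule.x in whatever.x


def count_with_call():
    # Variant 1: a Python-level function call per inner iteration.
    cnt = 0
    for rule in rules:
        for whatever in whatevers:
            if matches(rule, whatever):
                cnt += 1
    return cnt


def count_inlined():
    # Variant 2: the test is written inline; rule.x is looked up every time.
    cnt = 0
    for rule in rules:
        for whatever in whatevers:
            if rule.x in whatever.x:
                cnt += 1
    return cnt


def count_hoisted():
    # Variant 3: rule.x is read once per outer iteration.
    cnt = 0
    for rule in rules:
        x = rule.x
        for whatever in whatevers:
            if x in whatever.x:
                cnt += 1
    return cnt


if __name__ == "__main__":
    for fn in (count_with_call, count_inlined, count_hoisted):
        print(f"{fn.__name__:16s} {timeit.timeit(fn, number=100):.4f} s")
```

All three return the same count; only the time differs, which lets you see
how much of the speedup comes from dropping the call and how much from the
local variable.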

I also don't know what type your "whatevers" are, but "x in something" can
be O(n) if they are sequences, and using a dict or set would give much
better performance.
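To illustrate the difference, here is a small sketch that times the same
membership test against a list and a set. The container size and the
"needle" value are arbitrary choices of mine, not anything from the
original code.

```python
import timeit

# Hypothetical data: the thread never says what type whatever.x is.
items_list = list(range(1_000))
items_set = set(items_list)
needle = 999  # worst case for the list: the last element

# A list is scanned element by element; a set does a hash lookup.
list_time = timeit.timeit(lambda: needle in items_list, number=1_000)
set_time = timeit.timeit(lambda: needle in items_set, number=1_000)

print(f"list: {list_time:.5f} s")
print(f"set:  {set_time:.5f} s")
```

The set only helps if the elements are hashable and if the container is
built once and queried many times; converting inside the loop would throw
the gain away.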

and perhaps collections.Counter would help here, too.
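A rough sketch of what I mean, assuming (and this is only an assumption)
that each rule.x is a hashable value and each whatever.x is a collection
of such values. The sample data is made up. The idea is to count the
rule values once and then look each element up, instead of comparing
every rule against every whatever.

```python
from collections import Counter
from types import SimpleNamespace

# Made-up sample data; the real types are unknown.
rules = [SimpleNamespace(x=v) for v in ("a", "b", "a", "z")]
whatevers = [
    SimpleNamespace(x=["a", "b"]),
    SimpleNamespace(x=["b", "c", "b"]),
]


def count_nested(rules, whatevers):
    # The nested-loop version from the thread.
    cnt = 0
    for rule in rules:
        x = rule.x
        for whatever in whatevers:
            if x in whatever.x:
                cnt += 1
    return cnt


def count_with_counter(rules, whatevers):
    # How many rules carry each value of x.
    rule_counts = Counter(rule.x for rule in rules)
    # set() so that duplicates inside one whatever.x count once, like "in".
    # A missing key in a Counter reads as 0.
    return sum(
        rule_counts[value]
        for whatever in whatevers
        for value in set(whatever.x)
    )


print(count_nested(rules, whatevers))        # 4
print(count_with_counter(rules, whatevers))  # 4
```

This does work proportional to the number of rules plus the total size of
the whatever.x collections, rather than their product.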

In short, it is a non-goal to get Python to run as fast as static languages
for simple nested loop code like this :-)

> The case of the rule cache IMHO is very striking, we have plenty
> examples in many repositories where the caching of none local
> variables is a widely used pattern, why hasn't been considered a way
> to do it implicitly and by default?
>

you can bet it's been considered -- the Python core devs are a pretty smart
bunch :-)

The fundamental reason is that rule.x could change inside that loop -- so
you can't cache it unless you know for sure it won't. -- Again, dynamic
language.
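A contrived sketch of why the interpreter can't assume that: here x is a
property, so every read of rule.x runs code and may return something
different. The Rule class and the data are hypothetical, purely for
illustration.

```python
import itertools
from types import SimpleNamespace


class Rule:
    """Hypothetical rule whose x is a property, not a plain attribute."""

    def __init__(self, values):
        self._values = itertools.cycle(values)

    @property
    def x(self):
        # Every attribute access runs this and yields the next value.
        return next(self._values)


whatevers = [SimpleNamespace(x=[1]), SimpleNamespace(x=[2]), SimpleNamespace(x=[3])]


def count_uncached(rule):
    cnt = 0
    for whatever in whatevers:
        if rule.x in whatever.x:  # fresh lookup on each iteration
            cnt += 1
    return cnt


def count_cached(rule):
    cnt = 0
    x = rule.x  # read once, before the loop
    for whatever in whatevers:
        if x in whatever.x:
            cnt += 1
    return cnt


print(count_uncached(Rule([1, 2, 3])))  # 3
print(count_cached(Rule([1, 2, 3])))    # 1
```

The two functions give different answers, so caching implicitly would
change what the program means. Only the programmer knows that rule.x is
stable in the loop.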

> The case of the slowness to call functions in CPython is quite
> recurrent and looks like its an unsolved problem at all.
>

dynamic language again ...

> If the default code that you
> can write in a language is by default slow and exists an alternative
> to make it faster, this language is doing something wrong.
>

yes, that's true -- but your example shouldn't be the default code you
write in Python.

> BTW: pypy looks like is immunized [1]
>
> [1] https://gist.github.com/pfreixes/d60d00761093c3bdaf29da025a004582


PyPy uses a JIT -- which is the way to make a dynamic language run faster
-- That's kind of why it exists....

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov