[Python-ideas] Why CPython is still behind in performance for some widely used patterns ?

David Mertz mertz at gnosis.cx
Sat Jan 27 17:53:53 EST 2018


>
> def filter(rule, whatever):
>     if rule.x in whatever.x:
>         return True
>
> rules = get_rules()
> whatevers = get_whatevers()
> for rule in rules:
>     for whatever in whatevers:
>         if filter(rule, whatever):
>             cnt = cnt + 1
>
> return cnt
>

This code seems almost certainly broken as written.  Not just suboptimal,
but just plain wrong.  As a start, it has a return statement outside a
function or method body, so it's not even syntactical (also, `cnt` is never
initialized).  But assuming  we just print out `cnt` it still gives an
answer we probably don't want.

For example, I wrote get_rules() and get_whatevers() functions that produce
a list of "things" that have an x attribute.  Each thing holds a (distinct)
word from a Project Gutenberg book (Title: Animal Locomotion: Or walking,
swimming, and flying, with a dissertation on aëronautics; Author: J. Bell
Pettigrew). Whatevers omit a few of the words, since the code suggests
there should be more rules. In particular:

In [1]: len(get_rules()), len(get_whatevers())

Out [1]: (12306, 12301)


Running the probably wrong code is indeed slow:

In [2]:
%%time
def filter(rule, whatever):
if rule.x in whatever.x:
return True
rules = get_rules()
whatevers = get_whatevers()
cnt = 0
for rule in rules:
for whatever in whatevers:
if filter(rule, whatever):
cnt = cnt + 1
print(cnt)

Out [2]:

110134
CPU times: user 53.1 s, sys: 190 ms, total: 53.3 s
Wall time: 53.6 s


It's hard for me to imagine why this is the question one would want
answered.  It seems much more likely you'd want to know:

In [3]:
%%time
len({thing.x for thing in get_rules()} - {thing.x for thing in
get_whatevers()})

Out [3]:
CPU times: user 104 ms, sys: 4.89 ms, total: 109 ms Wall time: 112 ms
5

So that's 500 times faster, more Pythonic, and seems to actually answer the
question one would want answered.

However, let's suppose there really is a reason to answer the question in
the original code. Using more sensible basic datatypes, we only get about a
3x speedup:

In [4]:

%%time
rules = {thing.x for thing in get_rules()}
whatevers = {thing.x for thing in get_whatevers()}
cnt = 0
for rule in rules:
    for whatever in whatevers:
        if rule in whatever:
            cnt += 1

print(cnt)

Out [4]:

110134

CPU times: user 18.3 s, sys: 96.9 ms, total: 18.4 s

Wall time: 18.5 s


I'm sure there is room for speedup if the actual problem being solved was
better described. Maybe something involving itertools.product(), but it's
hard to say without knowing what the correct behavior actually is.

Overall this is similar to saying you could implement bogosort in Python,
and it would be much slower than calling Timsort with `sorted(my_stuff)`
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20180127/b4bc5860/attachment-0001.html>


More information about the Python-ideas mailing list