[Python-ideas] Why CPython is still behind in performance for some widely used patterns ?
Steven D'Aprano
steve at pearwood.info
Fri Jan 26 19:25:35 EST 2018
On Fri, Jan 26, 2018 at 02:18:53PM -0800, Chris Barker wrote:
[...]
> sure, but I would argue that you do need to write code in a clean style
> appropriate for the language at hand.
Indeed. If you write Java-esque code in Python with lots of deep chains
obj.attr.spam.eggs.cheese.foo.bar.baz expecting that the compiler will
resolve them at compile-time, your code will be slow.
No language is immune from this: it is possible to write bad code in any
language, and if you write Pythonesque highly dynamic code using lots of
runtime dispatching in Java, your Java benchmarks will be slow too.
But having agreed with your general principle, I'm afraid I have to
disagree with your specific:
> For instance, the above creates a function that is a simple one-liner --
> there is no reason to do that, and the fact that function calls have
> significant overhead in Python is going to bite you.
I disagree that there is no reason to write simple "one-liners". As soon
as you are calling that one-liner from more than two, or at most three,
places, the DRY principle strongly suggests you move it into a function.
Even if you're only calling the one-liner from the one place, there can
still be reasons to refactor it out into a separate function, such as
for testing and maintainability.
Function call overhead is a genuine pain-point for Python code which
needs to be fast. I'm fortunate that I rarely run into this in practice:
most of the time either my code doesn't need to be fast (if it takes 3
ms instead of 0.3 ms, I'm never going to notice the difference) or the
function call overhead is trivial compared to the rest of the
computation. But it has bitten me once or twice, in the intersection of:
- code that needs to be as fast as possible;
- code that needs to be factored into subroutines;
- code where the cost of the function calls is a significant
  fraction of the overall cost.
When all three happen at the same time, it is painful and there's no
good solution.
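To put a rough number on that overhead, here is a minimal sketch using timeit. The helper is_match and the sample data are hypothetical stand-ins, not code from this thread, and the timings will vary with the interpreter version and the machine.

```python
import timeit


def is_match(rule_x, whatever_x):
    # Hypothetical one-line helper, standing in for the filter under discussion.
    return rule_x in whatever_x


def count_with_call(values, container):
    # Pays for one extra Python-level function call per item.
    total = 0
    for v in values:
        if is_match(v, container):
            total += 1
    return total


def count_inline(values, container):
    # Same test written inline, so no per-item function call.
    total = 0
    for v in values:
        if v in container:
            total += 1
    return total


# Made-up sample data: 1000 candidates, half of which are in the set.
values = list(range(1000))
container = set(range(0, 1000, 2))

if __name__ == "__main__":
    t_call = timeit.timeit(lambda: count_with_call(values, container), number=200)
    t_inline = timeit.timeit(lambda: count_inline(values, container), number=200)
    print(f"with call: {t_call:.4f}s  inline: {t_inline:.4f}s")
```

Both functions return the same count; only the per-item cost differs. Whether that difference matters depends on how much other work each iteration does.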
> "inlining" the filter call is making the code more pythonic and readable --
> a no brainer. I wouldn't call that an optimization.
In this specific case of "if rule.x in whatever.x", I might agree with
you, but if the code is a bit more complex but still a one-liner:
    if rules[key].matcher.lower() in data[key].person.history:
I would much prefer to see it factored out into a function or method. So
we have to judge each case on its merits: it isn't a no-brainer that
inline code is always more Pythonic and readable.
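As a sketch of what that refactoring might look like: the function name rule_matches and the sample objects below are my own assumptions for illustration, since the real types of rules and data aren't shown in this thread.

```python
from types import SimpleNamespace


def rule_matches(rules, data, key):
    """Return True if the lower-cased matcher for `key` occurs in that person's history."""
    return rules[key].matcher.lower() in data[key].person.history


# Hypothetical sample objects, shaped only to satisfy the attribute chain above.
rules = {"k": SimpleNamespace(matcher="Spam")}
data = {"k": SimpleNamespace(person=SimpleNamespace(history=["spam", "eggs"]))}

if rule_matches(rules, data, "k"):
    print("rule matched")
```

The call site now reads as a single named test, and the helper can be unit-tested on its own, at the cost of one extra function call per check.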
> making rule.x local is an optimization -- that is, the only reason you'd do
> it is to make the code go faster. How much difference did that really make?
I assumed that rule.x could be a stand-in for a longer, Java-esque chain
of attribute accesses.
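For a longer chain, the local-variable trick looks something like the sketch below. The names obj.attr.spam.eggs and both functions are hypothetical, and the hoisted version is only equivalent if nothing rebinds those attributes while the loop is running.

```python
from types import SimpleNamespace


def count_matches_chained(obj, items):
    # Every iteration repeats three attribute lookups at runtime.
    total = 0
    for item in items:
        if item in obj.attr.spam.eggs:
            total += 1
    return total


def count_matches_hoisted(obj, items):
    # Resolve the chain once; assumes it is not rebound during the loop.
    eggs = obj.attr.spam.eggs
    total = 0
    for item in items:
        if item in eggs:
            total += 1
    return total


# Hypothetical nested object, built only to give the chain something to resolve.
obj = SimpleNamespace(attr=SimpleNamespace(spam=SimpleNamespace(eggs={"a", "b"})))
items = ["a", "c", "b", "a"]

print(count_matches_chained(obj, items) == count_matches_hoisted(obj, items))
```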
> I also don't know what type your "whatevers" are, but "x in something" can
> be order (n) if they're sequences, and using a dict or set would give much
> better performance.
Indeed. That's a good point.
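A quick way to see the difference, as a sketch with made-up data: membership in a list scans the elements one by one, while a set uses a hash lookup. The helper time_membership is hypothetical, and the numbers depend on the machine.

```python
import timeit


def time_membership(container, needle, number=1000):
    # Total seconds taken by `number` membership tests against `container`.
    return timeit.timeit(lambda: needle in container, number=number)


haystack_list = list(range(10_000))
haystack_set = set(haystack_list)
needle = 9_999  # worst case for the list: the last element

if __name__ == "__main__":
    print("list:", time_membership(haystack_list, needle))
    print("set: ", time_membership(haystack_set, needle))
```

Building the set costs one pass over the data, so it pays off when the same collection is tested many times.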
--
Steve