Re: [Speed] Analysis of a Python performance issue

On Sat, Nov 19, 2016 at 02:32:26AM +0100, Victor Stinner wrote:
Hi,
I'm happy because I just finished an article gathering the most important things that I learnt this year about the silliest issue with Python performance: code placement.
https://haypo.github.io/analysis-python-performance-issue.html
I explain how to debug such an issue and describe my attempt to fix it in CPython.
I hate code placement issues :-) I hate performance slowdowns caused by random unrelated changes...
Victor
Thanks *a lot* Victor for this great article. You not only describe very accurately the method you used to track the performance bug, but also give very convincing results.
I still wonder what the conclusion should be:
- (this) Micro benchmarks are not relevant at all, they are sensitive to minor factors that are not relevant to bigger applications
- Is there a generally good code layout that favors most applications? Maybe some core functions from the interpreter? Why does PGO fail to ``find'' them?
Serge

On 19 Nov 2016 at 21:29, "serge guelton" sguelton@quarkslab.com wrote:
Thanks *a lot* Victor for this great article. You not only describe very accurately the method you used to track the performance bug, but also give very convincing results.
You're welcome. I'm not 100% sure that adding the hot attribute makes the performance of call_method 100% reliable. My hope is that the 70% slowdown doesn't reoccur.
I still wonder what the conclusion should be:
- (this) Micro benchmarks are not relevant at all, they are sensitive to minor factors that are not relevant to bigger applications
Other benchmarks had peaks: logging_silent and json_loads. I'm unable to say whether microbenchmarks should be used to check for performance regressions or to test the performance of a patch. So I try instead to analyze and fix performance issues.
At least I can say that temporary peaks are higher and more frequent on microbenchmarks.
Homework: define what a microbenchmark is :-)
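To make "peaks" and "slowdown" concrete, here is a minimal sketch of how run-to-run noise on a tiny benchmark could be quantified with the standard timeit module. It is not the tooling used on speed.python.org; the helper names (relative_slowdown, spread, time_call) and the method-call statement are assumptions made up for illustration.

```python
import statistics
import timeit


def relative_slowdown(reference, candidate):
    """Return how much slower candidate is than reference (0.70 means 70% slower)."""
    return candidate / reference - 1.0


def spread(samples):
    """Return (max - min) / median, a rough indicator of run-to-run noise."""
    return (max(samples) - min(samples)) / statistics.median(samples)


def time_call(stmt, setup="pass", repeat=5, number=100_000):
    """Time stmt several times; return seconds per loop iteration for each repeat."""
    raw = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    return [t / number for t in raw]


if __name__ == "__main__":
    # Hypothetical micro-benchmark: a bare method call, in the spirit of call_method.
    setup = "class C:\n    def m(self): pass\nobj = C()"
    samples = time_call("obj.m()", setup=setup)
    median_ns = statistics.median(samples) * 1e9
    print(f"median: {median_ns:.1f} ns, spread: {spread(samples):.1%}")
```

A large spread across repeats in one process, or a large relative_slowdown between two builds whose source differs only in unrelated code, is the kind of symptom discussed here. Timings in a single process cannot separate code placement from other noise sources, so this only flags that something is unstable.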
- Is there a generally good code layout that favors most applications?
This is a hard question. I don't know the answer. The hot attribute puts tagged functions in a separate ELF section, but I understand that inside the section, order is not deterministic.
Maybe the size of a function code matters too. What happens if a function grows? Does it impact other functions?
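One way to look at where functions actually land is to read the symbol addresses printed by binutils' nm (sorted by address with -n) and measure how far apart a set of functions starts. The sketch below assumes the default three-column nm output; the listing and the symbol names (hot_a, hot_b, cold_c) are made up for illustration and do not come from a CPython build.

```python
def parse_nm(nm_output):
    """Map symbol name -> address for defined symbols in `nm -n` style output."""
    symbols = {}
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # undefined symbols have no address column
        address, _kind, name = parts
        symbols[name] = int(address, 16)
    return symbols


def address_span(symbols, names):
    """Return the distance in bytes between the lowest and highest start address."""
    addresses = [symbols[name] for name in names]
    return max(addresses) - min(addresses)


# Made-up listing for illustration only; real addresses depend on the build.
SAMPLE = """\
                 U malloc
0000000000401000 T hot_a
0000000000401040 T hot_b
0000000000409000 T cold_c
"""

symbols = parse_nm(SAMPLE)
print(hex(address_span(symbols, ["hot_a", "hot_b"])))   # 0x40
print(hex(address_span(symbols, ["hot_a", "cold_c"])))  # 0x8000
```

Comparing the span of the same functions across two builds would show whether a change in one function moved the others. It says nothing about alignment or cache behaviour by itself.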
Maybe some core functions from the interpreter?
I chose to only tag the most famous functions of the core right now. I'm testing tagging functions of extensions like json but I'm not sure that the result is significant.
Why does PGO fail to ``find'' them?
I don't use PGO on speed-python.
I'm not sure that PGO is reliable either (reproducible performance).
Victor

I think it's safe to not reinvent the wheel here. Some searching gives:
http://perso.ensta-paristech.fr/~bmonsuez/Cours/B6-4/Articles/papers15.pdf
http://www.cs.utexas.edu/users/mckinley/papers/dcm-vee-2006.pdf
https://github.com/facebook/hhvm/tree/master/hphp/tools/hfsort
Pyston takes a different approach where we pull the list of hot functions from the PGO build, i.e. we defer all the hard work to the C compiler.
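As a rough illustration of what "a list of hot functions" derived from profile data could look like, the sketch below keeps the most-sampled functions until they cover a chosen share of all samples. This is a simple greedy cut-off written for this note; it is not hfsort's algorithm and not Pyston's build tooling, and the profile dictionary and function names are hypothetical.

```python
def hot_functions(sample_counts, coverage=0.9):
    """Return the hottest functions that together account for `coverage` of samples."""
    total = sum(sample_counts.values())
    selected, covered = [], 0
    # Visit functions from most to least sampled; break ties by name for determinism.
    for name, count in sorted(sample_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if total == 0 or covered / total >= coverage:
            break
        selected.append(name)
        covered += count
    return selected


# Hypothetical profile: function name -> number of samples.
profile = {"eval_loop": 700, "call_function": 200, "dict_lookup": 80, "rare_path": 20}
print(hot_functions(profile, coverage=0.9))
```

The resulting names could then be tagged by hand or fed to whatever mechanism the build uses to group functions. How well such a list generalizes beyond the workload that produced the profile is exactly the open question in this thread.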
Speed mailing list Speed@python.org https://mail.python.org/mailman/listinfo/speed
participants (3): Kevin Modzelewski, serge guelton, Victor Stinner