I think it's safe not to reinvent the wheel here. Some searching gives:
http://perso.ensta-paristech.fr/~bmonsuez/Cours/B6-4/Articles/papers15.pdf
http://www.cs.utexas.edu/users/mckinley/papers/dcm-vee-2006.pdf
https://github.com/facebook/hhvm/tree/master/hphp/tools/hfsort
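
Just to illustrate the kind of algorithm those references describe, here is a
toy Python sketch (invented function names and weights, nothing taken from the
papers or from hfsort itself) of greedy call-graph clustering: process the
profiled call edges from hottest to coldest and merge the chains of caller and
callee, so that hot caller/callee pairs end up adjacent in the final layout.

# Toy sketch of greedy call-graph clustering for function layout.
# Not production code: real implementations (e.g. hfsort) also weigh
# things like chain orientation and function sizes.

def order_functions(call_edges):
    """call_edges: iterable of (caller, callee, weight) tuples.
    Returns a function ordering with hot caller/callee pairs adjacent."""
    chain_of = {}  # function name -> the chain (list) currently holding it
    for caller, callee, _ in call_edges:
        for func in (caller, callee):
            chain_of.setdefault(func, [func])

    # Hottest edges first: merge the chains of caller and callee.
    for caller, callee, weight in sorted(call_edges, key=lambda e: -e[2]):
        a, b = chain_of[caller], chain_of[callee]
        if a is b:
            continue  # already laid out together
        merged = a + b
        for func in merged:
            chain_of[func] = merged

    # Concatenate the remaining chains into one final ordering.
    layout, seen = [], set()
    for chain in chain_of.values():
        for func in chain:
            if func not in seen:
                seen.add(func)
                layout.append(func)
    return layout


if __name__ == "__main__":
    # Invented example: the weights would come from a real profile.
    edges = [
        ("PyEval_EvalFrameEx", "PyObject_GetAttr", 90),
        ("PyEval_EvalFrameEx", "PyObject_Call", 80),
        ("PyObject_Call", "PyObject_GetAttr", 10),
        ("gc_collect", "visit_decref", 5),
    ]
    print(order_functions(edges))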

Pyston takes a different approach: we pull the list of hot functions from the PGO build, i.e. we defer all the hard work to the C compiler.
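
If you want to feed such a hot-function list back into the toolchain yourself,
one plausible mechanism (my assumption, not a description of how Pyston or
CPython actually wires it up) is to emit it as a symbol-ordering file for a
linker that supports one, e.g. lld's --symbol-ordering-file, with the build
done under -ffunction-sections so the linker has per-function granularity:

# Hypothetical glue (an assumption, not the Pyston/CPython build machinery):
# dump a hot-function list as a symbol-ordering file for the linker.
hot_functions = [
    "PyEval_EvalFrameEx",   # invented example entries; a real list would
    "PyObject_GetAttr",     # come from profiling the PGO build
    "PyObject_Call",
]

with open("cpython.order", "w") as f:
    f.write("\n".join(hot_functions) + "\n")

You would then link with something like -fuse-ld=lld
-Wl,--symbol-ordering-file=cpython.order; I believe gold has a similar
--section-ordering-file option that takes section names instead of symbols.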

On Sat, Nov 19, 2016 at 12:29 PM, serge guelton <sguelton@quarkslab.com> wrote:
On Sat, Nov 19, 2016 at 02:32:26AM +0100, Victor Stinner wrote:
> Hi,
>
> I'm happy because I just finished an article that puts together the
> most important things I learnt this year about the silliest issue with
> Python performance: code placement.
>
> https://haypo.github.io/analysis-python-performance-issue.html
>
> I explain how to debug such an issue and my attempt to fix it in CPython.
>
> I hate code placement issues :-) I hate performance slowdowns caused
> by random unrelated changes...
>
> Victor

Thanks *a lot*, Victor, for this great article. You not only describe
very accurately the method you used to track down the performance bug,
but also give very convincing results.

I still wonder what the conclusion should be:

- (this) Micro-benchmarks are not relevant at all: they are sensitive
  to minor factors that do not matter for bigger applications.

- Is there a generally good code layout that favors most applications?
  Maybe some core functions from the interpreter? Why does PGO fail to
  "find" them?

Serge
