On Wed, Aug 8, 2012 at 3:05 PM, Armin Rigo <arigo@tunes.org> wrote:
Hi Tim,

An additional remark: if you're looking for a tool that is able to
extract a complete call graph from a random Python program, then it's
impossible.  Only approximations can be done, like e.g. done in
pylint, I believe.

Such tools are pointless in PyPy but could be useful in other
projects.  For example I can imagine a tool that would statically
compile some version of a random Python program, including all
necessary guards to check at run-time that the assumptions made are
correct.  When these guards fail, it would fall back to a regular
interpreter.

The difference with PyPy is that the latter uses a tracing JIT
compiler to do at run-time (partially) the same job as I describe
above.  It does it by observation of the run-time behavior, which is a
rather better estimator of the program's general behavior than a
complex static analysis.  And it dispenses us from writing any such
analysis.  This gives us a result that is not specific to a particular
version of Python (or to Python at all).


A bientôt,

Armin.
 
Hi Armin,

Thanks for the lengthy reply.

Fortunately, I don't need a complete call graph. To get started I really only need procedure level dependency graphs (with control and data-dependencies). This graph is essentially a CFG with extra edges for different types of data dependencies. Thanks to Alex's previous reply I am taking a look at the CPython bytecode compiler to see if I can extract the local CFG's it produces.

I could see a static approach potentially beating the dynamic approach prior to the jit getting *hot* but after that it seems like the dynamic approach would win in the long run. Luckily, I don't have to worry about such things as I have no intention of writing a compiler (well for python anyway you guys seem to have that under control), my main interest lies in studying the evolution of the structure of the programs themselves.

Thanks for the pointers!

-Tim