[pypy-dev] [PATCH] Improving readability of generated .c code

Tue Dec 14 15:14:13 CET 2010

Hi David,

On 12/14/2010 01:57 AM, David Malcolm wrote:
> The attached patch is an attempt at extending the C code generator so
> that it annotates the generated .c code with file/line information for
> the corresponding Python sources.
>
> Method:
> I followed the instructions in pypy/doc/getting-started-dev.txt:
>      cd pypy
>      python bin/translatorshell.py
>      >>>  t = Translation(snippet.is_perfect_number)
>      >>>  t.annotate([int])
>      >>>  f = t.compile_c()
> then examine the generated .c code.
>
> With this patch, the top of a generated .c function contains an inline
> copy of the python source, like this:
> bool_t pypy_g_ll_issubclass__object_vtablePtr_object_vtablePtr(struct pypy_object_vtable0 *l_subcls_0, struct pypy_object_vtable0 *l_cls_0) {
> 	bool_t l_v11; long l_v12; long l_v13; long l_v14;
> 	/* Python source '/home/david/coding/pypy-svn-anon/trunk-clean/pypy/rpython/lltypesystem/rclass.py'
> 	 *  665 : def ll_issubclass(subcls, cls):
> 	 *  666 :     return llop.int_between(Bool, cls.subclassrange_min,
> 	 *  667 :                                   subcls.subclassrange_min,
> 	 *  668 :                                   cls.subclassrange_max)
> 	 */
> 	goto block0;
>
>
> ...and for every operation that has an offset, it calculates the
> linenumber (using the "dis" module on the code object).  For every
> operation that changes the line number, it emits a comment containing
> that Python source, so that you get code like this:
>
>      block0:
> 	/*  666 :     return llop.int_between(Bool, cls.subclassrange_min, */
> 	l_v12 = RPyField(l_cls_0, ov_subclassrange_min);
> 	/*  667 :                                   subcls.subclassrange_min, */
> 	l_v13 = RPyField(l_subcls_0, ov_subclassrange_min);
> 	/*  668 :                                   cls.subclassrange_max) */
> 	l_v14 = RPyField(l_cls_0, ov_subclassrange_max);
> 	OP_INT_BETWEEN(l_v12, l_v13, l_v14, l_v11);
> 	goto block1;
>
> Hopefully this makes the generated .c much more readable.

I like the general approach, though I have the vague fear that it will 
make the generated C code much bigger. Could you check that this is not 
the case?

> (I hope to
> work on the names of locals also; ideally they ought to embed the python
> indentifier)

This should already work in theory, see e.g. "l_cls_0" above, which 
embeds the Python variable name "cls". I guess that the information is 
simply lost in some of the transformation steps. If you feel like 
hunting down where, you can probably get better naming in some cases.

> This is on a Fedora 13 x86_64 box, with cpython 2.6.5
>
> Caveats:
>    - this is my first patch to PyPy (please be gentle!): I hope I've
> correctly understood the various levels, but I suspect I may still be
> missing things at the conceptual level

I think you are doing great. You might have to grep around a bit for 
more places that create SpaceOperations, or did you look at all of them?

>    - haven't completed the full unit tests yet.

This is the biggest problem of the patch, of course. I think that you 
should at least try to write tests for each of the places where you 
propagate the offset down one level. In addition, you probably need an 
integration test where you check the generated C code for the presence 
of the correct comments.

>    - the patch attempts to propagate the "offset" of flowspace
> SpaceOperation instances in a couple of places where the offset was
> being discarded:
>        - in BaseInliner.copy_operation

This is wrong. If you inline one function into another, it doesn't make 
sense to put the offset of the inner function into the outer function. 
To solve this problem, you have two options:
  1) when inlining, set the offset to -1, and just don't get source code 
then
  2) make every *operation* contain a reference to the func it comes 
from. The memory implications of this should be checked, but are 
probably not too bad.

>        - when generating operations in rtyper: set each of the new
> operations' "offset" to that of the high-level operation

That makes sense.

>    - I added the offset to SpaceOperation.__repr__() to help track the
> above down.
>    - I also pass the reference to the func object from the flowspace
> FunctionGraph to the rtyper's copy (I wonder if this affect memory usage
> during translation?)

Probably not enough to care.

>    - most functions appear correct, but there are some where I'm not at
> all convinced by the output (see below)
>    - the patch is assuming that within any given generated C function,
> there is a single python function, and that the offsets are indexes into
> the bytecode for that function's code object.  I suspect that this is an
> oversimplification, and why I'm seeing the errors that I'm seeing.
>    - I haven't stripped some of my debugging python comments from
> funcgen.py in the patch (this is a work in progress).

Now that we switched to mercurial, you can probably simply make a branch 
and work there: http://bitbucket.org/pypy/pypy/overview

>    - I haven't tried doing this on a full build of the interpreter
> (having highly readable generated .c for this is my objective [1]).

I fear it will never be "highly readable", but I think your work 
improves things already.

> One other idea I had was to sort the blocks in the function by the
> bytecode offset; currently the c blocks seem to "jump around" a fair bit
> relative to the corresponding rpython code.  Has any tuning been done on
> the ordering of the blocks within the generated .c code?  (or is it
> assumed that the .c compiler will "flatten" these, for the case when a
> node has a single successor node?)

The C compiler should not care about the order of blocks. If it does, I 
am prepared to call it insane.

> (I wonder if similar work could be done on the JVM and CLI code
> generators?)
>
> As mentioned above, I think my patch is getting it wrong in a few
> places.  Here's an example:
> struct pypy_rpy_string0 *pypy_g_mallocstr__Signed(long l_length_3) {
>
>     ...snip...
>
> 	/* Python source '/home/david/coding/pypy-svn-anon/trunk-clean/pypy/rpython/lltypesystem/rstr.py'
> 	 *   35 :     def mallocstr(length):
> 	 *   36 :         ll_assert(length>= 0, "negative string length")
> 	 *   37 :         r = malloc(TP, length)
> 	 *   38 :         if not we_are_translated() or not malloc_zero_filled:
> 	 *   39 :             r.hash = 0
> 	 *   40 :         return r
> 	 */
>
>     ...snip...
>
>      block15:
> 	/*   37 :         r = malloc(TP, length) */
> 	OP_ADR_SUB(l_v202, 0, l_v234);
> 	l_v235 = (struct pypy_header0 *)l_v234;
> 	l_v236 = RPyField(l_v235, h_refcount);
> 	/*   38 :         if not we_are_translated() or not malloc_zero_filled: */
> 	OP_INT_ADD(l_v236, 1L, l_v237);
> 	RPyField(l_v235, h_refcount) = l_v237;
> 	goto block5;
>
> The C code doesn't look anything like the .py code that my patch is
> inserting in the above.  (My current guess is that I need to be smarter
> about inlined operations, though that's a guess at this stage)

Sounds like a correct guess, malloc is transformed into something else, 
and the offset of the inlined operations is basically just random, if 
you interpret it in the context of the outer function.

Cheers,

Carl Friedrich