[pypy-issue] [issue1220] Improving readability of generated .c code

Dave Malcolm tracker at bugs.pypy.org
Thu Jul 19 03:25:20 CEST 2012

New submission from Dave Malcolm <dmalcolm at redhat.com>:

I'm attaching a patch which I believe significantly improves the readability of 
the C code that the translator emits.

Specifically, the patch:
  * adds comments to the generated C showing the corresponding RPython source
    code, *including* that of inlined functions.
  * attempts to reduce the spaghetti-like gotos between the blocks of a
    function by replacing a "goto" to a block that has no other predecessors
    with the block itself.  Doing so constructs a pleasing hierarchical
    structure that more closely resembles human-written sources.
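
As a rough illustration of the goto replacement described above, here is a
toy sketch in plain Python. It is not the code from the patch: the block
representation (a dict with "stmts" and "exits") and the emit() helper are
hypothetical and exist only to show the idea. A jump whose target has exactly
one incoming jump is replaced by the target's body; any other jump stays a
goto to a labelled block.

```python
from collections import Counter

# Hypothetical toy control-flow graph, not PyPy's real data structures.
# Each exit is (condition, target); a condition of None means "always".
EXAMPLE_BLOCKS = {
    "start": {"stmts": ["x = f();"],
              "exits": [("x < 0", "neg"), (None, "done")]},
    "neg": {"stmts": ["x = -x;"], "exits": [(None, "done")]},
    "done": {"stmts": ["return x;"], "exits": []},
}


def emit(blocks, entry):
    """Return C-like lines, emitting sole-predecessor blocks in place."""
    # Count the incoming jumps of every block.
    preds = Counter(t for b in blocks.values() for _, t in b["exits"])
    lines, emitted, worklist = [], set(), [entry]

    def jump(target, indent):
        if preds[target] == 1 and target not in emitted:
            # Only this jump reaches the block: emit its body right here.
            body(target, indent)
        else:
            lines.append("    " * indent + "goto %s;" % target)
            worklist.append(target)

    def body(name, indent):
        emitted.add(name)
        pad = "    " * indent
        lines.extend(pad + stmt for stmt in blocks[name]["stmts"])
        for cond, target in blocks[name]["exits"]:
            if cond is None:
                jump(target, indent)
            else:
                lines.append(pad + "if (%s) {" % cond)
                jump(target, indent + 1)
                lines.append(pad + "}")

    # Blocks that are still reached by a goto get a label of their own.
    while worklist:
        name = worklist.pop(0)
        if name not in emitted:
            lines.append("%s:" % name)
            body(name, 1)
    return lines


print("\n".join(emit(EXAMPLE_BLOCKS, "start")))
```

In this toy example the "neg" block has a single predecessor, so its body
ends up nested inside the "if", while "done" has two and remains a labelled
goto target.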

This is a followup to an old mailing list post:
which covered showing the RPython sources in the generated C.
I've carried a similar patch to the one given there within my Fedora/EPEL PyPy
rpms (see [1]), in which I tried to show the source code of inlined functions
by capturing the "source code location" of an operation as a stack of actual
source locations (one per level of inlining).
[1] http://pkgs.fedoraproject.org/gitweb/?p=pypy.git;a=blob;f=more-readable-c-

However, it never worked well, and not all operations are actually associated
with source code, e.g.:
  * "same_as" no-ops (see 
  * gencapicall() within rtyper.py (e.g. generated "PyInt_AsLong()" call to get 
at a boxed int)
  * operations added by gctransform (e.g. reference counting, by 
  * exceptions added for handling exceptions 

The alternative approach I came up with is to add a new kind of operation,
OP_COMMENT/"comment": a no-op that is added when we build the graph and that
gets turned into a comment in the generated source.  This copes with inlining
nicely.  However, we need to be careful not to thwart optimizations: for
example, an earlier version of this patch defeated the switch-building
detection in merge_if_blocks due to the extra ops.
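
To make the idea concrete, here is a minimal sketch of a "comment" operation
being rendered as a C comment. It is a stand-alone toy, not the patch itself:
the Op tuple, the comment_op() and render() helpers, and the exact output
format are assumptions made for illustration. Because the comment is an
ordinary entry in the operation list, it travels with the real operations
when a list of ops is copied into another function during inlining.

```python
from collections import namedtuple

# Hypothetical stand-in for an operation in a flow graph.
Op = namedtuple("Op", "opname args")


def comment_op(filename, lineno, source_line):
    """Build a no-op carrying one line of source text."""
    text = "%s:%d: %s" % (filename, lineno, source_line.strip())
    return Op("comment", (text,))


def render(op):
    """Render one operation as a line of C-like text (assumed format)."""
    if op.opname == "comment":
        # Avoid closing the C comment early if the source contains "*/".
        return "/* %s */" % op.args[0].replace("*/", "* /")
    return "OP_%s(%s);" % (op.opname.upper(), ", ".join(op.args))


ops = [
    comment_op("demo.py", 3, "    return a + b"),
    Op("int_add", ("a", "b", "r")),
]
rendered = [render(op) for op in ops]
print("\n".join(rendered))
```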

My initial aim was to add one each time the source line changes in the simple
case, and also to add them for other transformations as appropriate.

I tried a few different places in which to inject the comment ops:
  * pypy.objspace.flow.flowcontext.BlockRecorder.bytecode_trace()
  * pypy.objspace.flow.flowcontext.FlowExecutionContext.bytecode_trace()
  * pypy.objspace.flow.objspace.FlowObjSpace.do_operation()
Doing it within one of the bytecode_trace() methods leads to very large
numbers of comments (one per bytecode, whereas most bytecodes don't seem to
directly generate SpaceOperations).  I tried reducing the number of comments
by only emitting a comment when the line number changes, but I couldn't find a
good place to store the current line: if I'm reading things correctly the flow
objspace creates large numbers of small blocks which then get

Adding them within do_operation() means that we only get one comment per 
"actual" SpaceOperation, so I went with this 

However, it means that the Recorder and Replayer classes no longer see equal
lists of operations; I fixed this by filtering out comment ops when comparing
the ops seen by Recorder and Replayer.
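
The filtering step can be sketched as follows. This is only an illustration
under assumptions: the Op tuple and the without_comments() and same_ops()
helpers are hypothetical names, not the Recorder/Replayer code from the patch.

```python
from collections import namedtuple

# Hypothetical stand-in for an operation in a flow graph.
Op = namedtuple("Op", "opname args")


def without_comments(ops):
    """Return the operations with all "comment" no-ops removed."""
    return [op for op in ops if op.opname != "comment"]


def same_ops(recorded, replayed):
    """Compare two operation lists while ignoring comment ops."""
    return without_comments(recorded) == without_comments(replayed)


recorded = [
    Op("comment", ("demo.py:3: return a + b",)),
    Op("int_add", ("a", "b")),
]
replayed = [Op("int_add", ("a", "b"))]
print(same_ops(recorded, replayed))
```

With this comparison, two lists that differ only in their comment ops count
as equal, while any difference in a real operation is still detected.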

I added a new pass to simplify.py, to prune the comments per-block after the
flowgraph is built, at the point where something resembling the final block
structure has been reached - see the comment in the new pass.
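
The exact pruning rules live in the patch and are not restated here, so the
following is only a guess at what a per-block pruning pass might look like.
It assumes two rules that are not confirmed by this message: a comment that
repeats the previous comment in the same block is dropped, and comments left
at the end of a block with no operation after them are dropped. The Op tuple
and prune_comments() are hypothetical names.

```python
from collections import namedtuple

# Hypothetical stand-in for an operation in a flow graph.
Op = namedtuple("Op", "opname args")


def prune_comments(ops):
    """Prune redundant comment ops from one block (assumed rules)."""
    pruned = []
    last_comment = None
    for op in ops:
        if op.opname == "comment":
            if op.args == last_comment:
                # Assumed rule 1: same text as the previous comment.
                continue
            last_comment = op.args
        pruned.append(op)
    # Assumed rule 2: trailing comments describe no operation in this block.
    while pruned and pruned[-1].opname == "comment":
        pruned.pop()
    return pruned


line3 = Op("comment", ("demo.py:3: x = a + b",))
line4 = Op("comment", ("demo.py:4: return x",))
block_ops = [line3, Op("int_add", ("a", "b")), line3,
             Op("same_as", ("x",)), line4]
print(prune_comments(block_ops))
```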

Caveat: I haven't yet run the full test suite; I'm doing that now.

files: more-readable-c-code-2012-07-18-001.patch
messages: 4598
nosy: dmalcolm, pypy-issue
priority: feature
status: unread
title: Improving readability of generated .c code

PyPy bug tracker <tracker at bugs.pypy.org>
