[pypy-dev] How to translate 300000 lines of C

Christian Tismer tismer at tismer.com
Tue Jan 21 14:34:35 CET 2003


Rocco Moretti wrote:
> On another thread Christian Tismer <tismer at tismer.com> wrote:
> 
> 
>>I tried to map [frameobject.c] on a 3.5 hour journey from
>>Kiel to Berlin, and I had one fourth done within an hour.
>>Nevertheless, I got into trouble, just by comparing its
>>implementation differences between 2.2.2 and 2.3a.
>>Then I dropped that and decided that this is the wrong way.
> 
> 
> I'm interested to know what problems you were encountering. 

Well, first of all, frameobject.c seemed to force
me to invent all the necessary supporting objects at once,
or to drop them and maybe lose them.
The block stack was one thing that I would have
liked to describe with some struct construct,
and there is pointer arithmetic...
Well, this could have easily been replaced by
a list of tuples.
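
As a rough illustration of that "list of tuples" idea, here is a
minimal sketch. The class and method names are hypothetical, and the
three tuple fields (type, handler, level) are only an assumption about
what a block entry needs to carry; this is not a translation of the
actual C code.

```python
# Hypothetical sketch: a frame's block stack as a plain Python list.
# Names and tuple fields are illustrative assumptions, not CPython's API.
class Frame:
    def __init__(self):
        # Each entry is a (block_type, handler, level) tuple, standing in
        # for a C array of structs plus an index into that array.
        self.blockstack = []

    def block_setup(self, block_type, handler, level):
        # Push a new block; no pointer arithmetic is needed.
        self.blockstack.append((block_type, handler, level))

    def block_pop(self):
        # Pop the innermost block; raises IndexError if the stack is empty.
        return self.blockstack.pop()


frame = Frame()
frame.block_setup("loop", 10, 0)
frame.block_setup("except", 20, 1)
innermost = frame.block_pop()
```

The list grows and shrinks on its own, so the fixed-size array and the
index bookkeeping disappear from the Python version.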

But well, what really got me stuck was the number of
changes and additions which came with 2.3a.
With a direct rewrite in Python, we get into a major
problem:
You cannot use diffs any longer. Diffing between two
C files is fine.
But what do you do if you have a hand-written version
of a C file? You have nothing to diff *that* against,
so you have to read the C diffs, repeat the mappings
which you did in your head, and try to figure out what to
change in the target .py file. Huh!

That really made me stop the otherwise not-so-hard
transliteration and think about how to avoid
the upcoming nightmare.
What I think is needed is a tool that does the
translation partially automatically, partially
on my command, with some scripted rules.
Given that, I'm able to produce a Python file
from a new C version, and diff the resulting Python
files against each other. Even if this diff contains
C snippets which weren't automatically mapped,
they are at least in the right place, hopefully.
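
To make the diffing step concrete, here is a small sketch using the
standard difflib module. The file names and the sample "translated"
sources are invented for illustration; the untranslated C statement is
assumed to be carried along as a comment, which is just one possible
convention.

```python
import difflib

def diff_translations(old_py, new_py,
                      old_name="frameobject_2_2_2.py",   # hypothetical name
                      new_name="frameobject_2_3a.py"):   # hypothetical name
    """Return a unified diff between two generated Python sources."""
    old_lines = old_py.splitlines(keepends=True)
    new_lines = new_py.splitlines(keepends=True)
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=old_name, tofile=new_name))

# Invented sample output of the translator for two C versions.
old_source = "def frame_new():\n    return Frame()\n"
new_source = ("def frame_new():\n"
              "    # UNTRANSLATED C: goto fail;\n"
              "    return Frame()\n")

for line in diff_translations(old_source, new_source):
    print(line, end="")
```

The unmapped C snippet shows up as an added line in the Python diff,
at the place where the change happened.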

> One issue I recall from frameobject.c is that it
> has a lot of optimizations regarding object
> caching and reuse. As a first approximation,
> we should probably ignore such optimizations.
> If we do add it in later, it should probably be
> as an optimization by psyco for *all* objects.

Yes, optimizations are spread all over the place,
and they don't help the translation, at least :-)

> That said, theoretically, I like the idea of
> a C->Py converter. (It did get tedious when I was doing it.)

> However, I'm concerned if the time invested to make one would be worth it. 
> 
> Christian Tismer <tismer at tismer.com> wrote:
> 
> 
>>There are a number of free-ware C compilers around, and also
>>some C interpreters.
> 
> 
> But coded in which language? How difficult would it
> be to get them to "play nice" with Python (which is,
> I am assuming, where you want to code the conversion
> logic)? - This question probably boils down to asking
> what compiler/interpreter you have in mind to co-opt.

I'm reading through lcc right now, just to get an idea,
and I've begun to code a small lexer and parser in
Python. The advantage of our problem is that we may
assume correct C code, so I don't have to do a validating
parser.

I have no idea yet how the mapping should work,
and on which abstraction level. There are lots of
constructs for which the tool can only emit an
"untranslatable" message, like labels and gotos.
After I have something useful, I will post the
tiny parser for playing with ideas.
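
For playing with the idea in the meantime, a non-validating lexer can
be sketched in a few lines with the re module. This is only an
assumption about what such a lexer could look like, not the parser
mentioned above: it skips string and character literals, treats
numbers as plain decimal digits, and the helper names are made up.

```python
import re

# One named group per token kind; order matters for the alternation.
TOKEN_RE = re.compile(r"""
    (?P<comment>/\*.*?\*/)
  | (?P<number>\d+)
  | (?P<name>[A-Za-z_]\w*)
  | (?P<op>->|\+\+|--|==|!=|<=|>=|&&|\|\||[-+*/%=<>!&|^~?:;,.(){}\[\]])
  | (?P<space>\s+)
""", re.VERBOSE | re.DOTALL)

def tokenize(source):
    """Yield (kind, text) pairs, dropping whitespace and comments."""
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            # We assume correct C, so this only signals a gap in the sketch.
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        pos = match.end()
        if match.lastgroup not in ("space", "comment"):
            yield match.lastgroup, match.group()

def is_untranslatable(statement):
    """True for a goto statement or a label (but not 'default:')."""
    tokens = list(tokenize(statement))
    if not tokens:
        return False
    if tokens[0] == ("name", "goto"):
        return True
    return (len(tokens) >= 2
            and tokens[0][0] == "name"
            and tokens[0][1] != "default"
            and tokens[1] == ("op", ":"))

for stmt in ("f->f_iblock--;", "goto fail;", "fail:"):
    status = "untranslatable" if is_untranslatable(stmt) else "ok"
    print(f"{stmt:20} {status}")
```

Because the input is assumed to be correct C, the lexer never has to
report real errors; anything it cannot map can simply be flagged and
passed through.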

> One caveat with me evaluating the benefits of this
> idea is that I don't have a feel how difficult tracking
> changes in CPython would be. We do have the changelog
> and Unit tests for CPython, so we wouldn't necessarily
> need to do a line-by-line comparison. We could approach
> the changes in more of a top-down level. Isn't Python
> supposed to be easier to maintain than C? 

Well, I hope comparing translated scripts does help
here. We have to try it anyway, though.

> How does Jython do it?

No idea.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


