Mailman 3 December 2005 - pypy-dev

This Week in PyPy 5
by Michael Hudson 03 Dec '05

03 Dec '05

Introduction ============ This is the fifth of what will hopefully be many summaries of what's been going on in the world of PyPy in the last week. I'd still like to remind people that when something worth summarizing happens to recommend if for "This Week in PyPy" as mentioned on: http://codespeak.net/pypy/dist/pypy/doc/weekly/ where you can also find old summaries. I note in passing that the idea of keeping track of IRC conversations in the weekly summary has pretty much fizzled. Oh well. There were about 230 commits to the pypy section of codespeak's repository in the last week (a busy one, it seems :-). SomePBC-refactoring =================== We merged the branch at last! Finishing the branch off and getting translate_pypy running again seemed to mostly involve fighting with memoized functions and methods, and the "strange details" hinted at in the last "This Week in PyPy" were not so bad -- indeed once we got to the point of rtyping finishing, the backend optimizations, source generation, compilation and resulting binary all worked first time (there must be something to be said for this Test Driven Development stuff :). If you recall from the second This Week in PyPy the thing that motivated us to start the branch was wanting to support multiple independent object spaces in the translated binary. After three weeks of refactoring we hoped we'd made this possible... and so it proved, though a couple of small tweaks were needed to the PyPy source. The resulting binary is quite a lot (40%) bigger but only a little (10%) slower. CCC papers ========== As mentioned last week, two PyPy talks have been accepted for the Chaos Communication Congress. The CCC asks that speakers provide papers to accompany their talks (they make a proceedings book) so that's what we've done, and the results are two quite nice pieces of propaganda for the project: http://codespeak.net/pypy/extradoc/talk/22c3/agility.pdf http://codespeak.net/pypy/extradoc/talk/22c3/techpaper.pdf It's still possible to attend the conference in Berlin, from December 27th to the 30th: http://events.ccc.de/congress/2005 A number of PyPy people will be around and innocently mixing with people from other communities and generally be available for discussing all things PyPy and the future. Background EU-related work ========================== Less visible but still requiring work, organisations funding and organizing the EU PyPy project are currently preparing a lot of paperwork and reports. Most of the reports are mostly done by now but the next Göteborg sprint will start with two (insider only) days of dotting the 'i's and crossing the 't's. Let's all hope that everything goes well at our first major EU review at the end of January. Meanwhile, Holger was invited to give a talk about PyPy's technical organisation at a workshop given by the german EU office on the 5th of December. Also, Bea, Alastair and Holger will talk about PyPy at an EU workshop on the 8th of December in Bruxelles. Hopefully, this will enable us to find more opportunities to get PyPy recognized as an interesting "live" project in the EU's corner of the world. Where did PyPy-sync go? ======================= What's a pypy-sync meeting? Apparently:: It's an XP-style meeting that serves to synchronize development work and let everybody know who is working on what. It also serves as a decision board of the PyPy active developers. If discussions last too long and decisions cannot be reached they are delegated to a sub-group or get postponed. pypy-sync meetings usually happen on thursdays at 1pm CET on the #pypy-sync IRC channel on freenode, with an agenda prepared beforehand and minutes posted to pypy-dev after the meeting. Except that the last couple haven't really happened this way -- no agenda and only a few people have turned up and mostly just the people who are in #pypy all week anyway. So after the Göteborg sprint next week we're going to try harder to prepare and get developers to attend pypy-sync meetings again. This is especially important as we head towards more varied and less intrinsically related challenges such as a JIT compiler, integration of logic programming, GC, higher level backends and much more. -- Check out the comments in this source file that start with: # Oh, lord help us. -- Mark Hammond gets to play with the Outlook object model

1 0

Pascal backend
by Sanghyeon Seo 02 Dec '05

02 Dec '05

I started Pascal backend for a while now. Have a look at: http://sparcs.kaist.ac.kr/~tinuviel/pypy/ Resulting Pascal source can be compiled with FPC(Free Pascal Compiler), available from http://www.freepascal.org/ What do you think? Also, if I understood correctly, translation part is undergoing much change in the branch. What do I need to change? Seo Sanghyeon

2 1

Full Python Annotation
by Adam Olsen 01 Dec '05

01 Dec '05

I'm a bit disorganized here, but I don't think delaying will help much as my brain is turning to mush. Nothing ventured... Notes for description of value annotator Iterative. Allows examination of individual steps Stable. Produces result in finite amount of time Order Insensitive. Always produces the same result, no matter what order in which it processes the sourcecode Specialization of builtins. All functions are given a "space depth". Multiple calls (at different lines of code) to the same function will, if it has a deeper depth, create separate copies of the function in their own "space". If those functions then call a function of a shallower depth they will use a shared function (rather than continuing to specialize it). The result is that separate calls to builtin functions (such as the plus operator) will get separate annotations, but without risk of an infinite loop in the annotation process. Values. A Value is a set (in the mathematical sense) of possible values, be they integer, float, string, or whatever. In practice it must store an approximation so that a set of 0..2**64-1 can be represented. It is acceptable for the set to be expanded to include values that are not really possible (as the program will still behave correctly), but not acceptable to be smaller. Note however that as an iterative process is used to determine values, it will initially be to small, and later replaced with new Values that cover the full range. Hypothetical value expansions. A mechanism is needed to expand values to their full range quickly, rather than iterate it one step at a time. To do this an integer_add will notice on the second pass (not the first!) that it had a previous possible output, and that the new output appears to create a sequence from it. A new value set is created as normal (increased by only a single step), except a "hypothetical" part is added to it. The hypothetical contains a Value set with the predicted sequence of values (not including any already known), as well as a reference to the node/space that created it. The value is then set as the new output for the integer_add node and is propagated through the rest of the code. Other nodes will modify the hypothetical as if it was a normal value set, including putting a cap on the the highest value in the case of an if branch, but they will not explore new code as a result of it. If the hypothetical reaches the node/space that created it then this indicates there is a loop which allows the value to expand over the whole range, and the node will convert it into a normal value (including any caps that were discovered). Branches and bool replacement Values. To effect a cap on the range of a value the bool value produced by lessthan and similar operations will contain a pair reduced Value associated with the input variables. If the bool is used for a branch then either the "True" or "False" version of the reduced Value associated with the bool's original input variable will be used to replace the variable thereafter. Note that a form of Single Static Assignment is used to allow a variable to have different values at different points in a function. Comment on SSA and aliases. As the branch that reduces a variable's range may happen outside the function itself, it is necessary to make the SSA alias lists globally available and allow modification at any point. This means they must be copied to every node, with the resulting cost being quadratic (O(n**2)). However, as most of the copies would have identical contents I believe it would be possible to optimize this and get performance that is closer to linear (O(n)). Finally, a note on what sorts of code this is aimed at. Anything where all the code can be determined at compiletime, ie no eval/exec/__import__/reload. Some analysis of code that does use those features would still be possible however. Spaces diagram: http://members.shaw.ca/rhamph/python-annvalue/spaces.pdf Values diagram: http://members.shaw.ca/rhamph/python-annvalue/values.pdf Partially annotated test "program": http://members.shaw.ca/rhamph/python-annvalue/annotation.pdf My sourcecode (requires autopath.py): http://members.shaw.ca/rhamph/python-annvalue/basicblock.py -- Adam Olsen, aka Rhamphoryncus

2 1