Mailman 3 October 2004 - pypy-dev

bug in http://codespeak.net/moin/pypy/moin.cgi/FrontPage?action=show
by Laura Creighton 10 Oct '05

10 Oct '05

If you click on the 'documentation' link you get a traceback! ooops. I will look at this later unless somebody beats me to it. Laura

5 4

More on optimization
by Armin Rigo 07 Nov '04

07 Nov '04

Hi! Sorry for focusing the next sprint so much on translation. This might have put some people off. Well, lesson learned. It doesn't mean we should stop talking about translation :-) Pushing the previously dicussed ideas to their conclusion, we get an interesting point of view... First, for the context, the current situation in the repository: RPython code can be turned into a flow graph. Then the annotation pass on the flow graph infers types, with the side effect that it infers which functions call which functions, so it helps to build the complete list of functions that should be translated. Finally, genc.py produces naive C code using PyObject* only. The inferred types are not really used; the annotation phase is currently useful only to build the complete call graph. (Let's ignore the Lisp and Pyrex generators for the moment.) Here is an example with the flow graph's operations on the left and the corresponding C code (after macro expansion) on the right: v3 = add(v1, v2) v3 = PyNumber_Add(v1, v2); Quite obvious, and quite slow. Imagine instead that there is a C-ish low-level language, whose input "syntax" is of course flow graphs, with only operations like the following ones: * llcall(func, arg1, arg2...) # calls another function * lladd_int(x, y) # adds two ints and returns an int * lleq_ptr(x, y) # compares two pointers for equality * llstruct('name') # creates an empty structure in the heap * llget_int(x, 'key') # reads the field 'key' of struct object x * llset_int(x, 'key', value) # writes the field 'key' of struct object x * llget_ptr(x, 'key') # \ * llget_ptr(x, 'key', value) # / the same with fields containing a pointer The only data types would be "pointer to structure" and the atomic types like "int" and "char". A structure would be essentially like a dictionary, with either strings or integers as keys (a dict with int keys is much like a list, without the ability to insert or remove elements easily). This very limited set of operations allows interesting global analysis and optimizations, like removing the allocation of structures in the heap altogether, or not writing some fields of some structures when they have a known constant value, or (most commonly) not using a hash table but just a 'C struct' with fields when all the keys are known. It is easy to modify our regular flow graphs until they only use the above operations: we replace high-level operations like 'add' with calls to (or inlined copies of) support functions like: def rpyhon_add(v, w): if type(v) is int and type(w) is int: a = llget_int(v, 'ob_ival') b = llget_int(w, 'ob_ival') x = llstruct() llset(x, 'ob_type', int) llset(x, 'ob_ival', lladd_int(a, b)) return x else: return llcall('PyNumber_Add', v, w) Only such support functions can use the ll*() functions, which stand directly for the corresponding ll* operation. The ll*() functions themselves are not executable in the above Python source (actually they don't have to be callable objects at all). They are only meant as placeholders. If you wish, it's just an easy way to write down a complicated flow graph using ll* operations: instead of inventing a syntax and writing a parser for it, we use the Python syntax and the flowobjspace to build our flow graphs. Note that the type() used above is a function defined as: def type(v): return llget_ptr(v, 'ob_type') (Actually, we might want to directly execute such code, for testing, with Python dictionaries in most variables and (say) llget_ptr() defined as dict.__getitem__(), but that's not the main goal here.) Let's come back to PyPy. We can now replace each high-level operation with a call to a support function like the above one. Initially these support functions could contain only the llcall() to the CPython API; then we can progressively add more cases to cover all the RPython behavior. The goal would be that eventually the support functions and the optimizer become good enough to remove all calls to the CPython API. The optimizer can do that using global analysis: e.g. in the above rpython_add(), if both 'v' and 'w' come from a previous llstruct() where 'ob_type' was set to 'int', then we know that the first branch is taken. This is a kind of generalized type inference using global constant propagation; it would superscede half of the annotation pass' job. Also, the incremental nature of this approach is nice (and very Psyco-like, btw). It is good for testing along the way, and also means that when we reach the goal of supporting enough for RPython, the result is still nicely useful for non-RPython code; in this case, when types cannot be inferred, there are remaining calls to the CPython API. I think that this would be a widely useful tool! I also like a lot the idea of a low-level language and optimizer which is entierely independent from Python and PyPy. We could even (still) devise an assembly-like syntax and a parser for it, and make it a separate release. Such an optimizer is obviously not easy to write but I think it's quite worth the effort. Most of it would be independent on the target language: it would e.g. replace operations with still more precise operations describing the usage of the structs, like llheapalloc('C struct name') llheapget_int(x, 'field', 'C struct name') Also, we should look closely for related work. It looks like it *has to* have been done somewhere before... But the kind of global optimizations that we need might be different from the ones any reasonable language would need, because no one would reasonably write code like the rpython_add() above when he actually means to do a simple integer addition :-) What's essential here, after constant propagation, is to do liveness and alias analysis for the heap structures, so that they can be not allocated in the heap at all as often as possible. Also, if we want to do explicit memory management (i.e. we would have to use a 'llfree' operation) with reference counters, then the above example of rpython_add() becomes even longer -- but of course it should still be optimized away. A bientôt, Armin.

4 18

C keywords as parameters
by Anders Lehmann 21 Oct '04

21 Oct '04

I had some trouble getting /goal/translate_pypy to work for me today. The compiler barfed over : struct __pyx_obj_11entry_point_W_Root__8329f0 { PyObject_HEAD PyObject *typedef; }; The errors disapear if I insert the following pyxcode=pyxcode.replace('typedef','type_def') into line 219 of translalor.py (in the function compile). I wont check in the change as I dont think this is the right solution. Hope you will have a good sprint in Vilnius (I wont be able to make alas) Anders Lehmann

2 1

Re: [pypy-sprint] vilnius sprint planning progress
by Armin Rigo 18 Oct '04

18 Oct '04

Hi Holger, On Fri, Oct 08, 2004 at 12:18:44PM +0200, holger krekel wrote: > Armin, Samuele, it would be great if especially the two of you > (or at least one of you) think hard about how to split up work > for the translate-pypy goal so that multiple teams can work on > it. If you have ideas please post them to pypy-dev. As far as I can see now, this goal can be divided in three relatively independent tasks: A. Obtain the complete flow graph of the RPython part of PyPy. B. Perform type inference and optimizations on the flow graph. C. Generate C code from the flow graph. A. PyPy -> FlowGraph ==================== See goal/translate_pypy.py; run it and fix things until this script processes the whole of PyPy. This requires updating things in PyPy when they are not RPythonic enough. In particular, there are some more efforts to be done on "caches": lazily built objects. Generally, the code to build such objects is not RPython; so for RPython, the objects must be built in advance. The flow space must force these caches to be completely built. This part can be done independently. The goal would be to get a complete graph, which the existing genc.py can (mostly) already translate to C and run, for testing. (This will not be extremely fast because genc.py doesn't use type information now, but it should be a bit faster than the pure Python py.py.) B. FlowGraph -> Optimized FlowGraph =================================== Still open to discussion. What exactly should be done here, and how? An idea is that we could provide a set of rules that transform some operations according to the type inferred for their arguments. This would introduce new operations that work on individual fields of PyObjects instead of just calling PyObject_Xxx() on them. Global analysis can further discover when PyObject can be inlined into a parent, when they don't need reference counters, etc. C. Optimized FlowGraph -> C code ================================ See genc.py. This not-too-long piece of code translates a regular flow graph into C code, and it seems to work fine, mostly. There are a few missing RPython constructions (e.g. exception handling) that A will generate. In parallel, the optimizations introduced in B will produce flow graphs with new kinds of operations whose support must then be added to genc.py. So if you'd like to work on genc.py, people working on A and B will keep throwing at you new kind of flow graphs and optimization-related data to support. Armin

3 3

PyPy Vilnius Sprint 15-23 nov 2004
by hpk＠trillke.net 14 Oct '04

14 Oct '04

Hi Pythonistas and interested developers, PyPy, the python-in-python implementation, is steadily moving on. The next coding sprint will take place in Vilnius, Lithunia, from 15th to 23rd of November, 2004 and is organized by the nice Programmers of Vilnius (POV) company. See http://codespeak.net/pypy/index.cgi?doc for more in-depth information about PyPy. Again, we will be heading towards a first generated C version of our already pretty compliant Python interpreter and types implementation. Last time, before the EuroPython 2004 conference, we actually had a similar goal (a PyPy/C-version) but discovered we had to largely refactor the basic model for attribute accesses. We are now closely mirroring the marvelous "descriptor"-mechanism of CPython. If you are interested to participate in our fun and somewhat mind-altering python sprint event then please subscribe at http://codespeak.net/moin/pypy/moin.cgi/VilniusSprintAttendants and look around for more information. You'll find that most of the core PyPy developers are already determined to come. There are also many areas that need attention so that we should have tasks suited for different levels of expertise. At http://codespeak.net/moin/pypy/moin.cgi/VilniusSprintTasks you'll find our sprint planning task list which will probably grow in the next weeks. Note that our EU funding efforts are at the final stage now. In the next weeks, quite likely before the Vilnius sprint, we _hope_ to get a "go!" from the european commission. One side effect would be that coders - probably restricted to european citizens - may generally apply for getting travel and accomodation costs refunded for PyPy sprints. This would reduce the barrier of entry to the question if you like to spend your time with a pypy sprint. However, we probably need some time to work out the details after when we get more information from the EU. If you have any questions don't hesitate to contact pypy-sprint(a)codespeak.net or one of us personally. cheers & a bientot, Holger Krekel, Armin Rigo

1 0

important codespeak news
by hpk＠trillke.net 09 Oct '04

09 Oct '04

Hi everyone using services on codespeak, first of all, please don't simply reply to this mail because you may reply to a lot of mailing lists because this is a crossposting. However, it's probably the last crossposting. Why? See below. Jens-Uwe Mager and me are currently sitting on the new codespeak hardware and software setup. It will run on subversion 1.1 among other things and we plan to introduce proper SSL support, a trac based project environment, and more redundancy. Our first step, however, will be migrating the hardware. Now we recommend that everybody who works with codespeak services subscribes to our new announce mailing list: http://codespeak.net/mailman/listinfo/codespeak-ann This will from now on be the sole place where we announce news - including downtimes and important upgrades or problems - regarding codespeak. It is expected to be a very low volume list and it's readonly. For developing and discussing infrastructure enhancements please subscribe to our development list, located here: http://codespeak.net/mailman/listinfo/codespeak-dev This serves as the place to discuss introducing the trac project management environment, and other general improvements. We also welcome further help and suggestions and we can rather easily grant sysadmin-privs because we have full version control of the codespeak configuration files. For example, if you want to know more details regarding this announcements, you should subscribe to codespeak-dev and post your question. cheers, Holger Krekel & Jens-Uwe Mager

1 0

[Fwd: CfP: Bytecode05, Edinburgh, Scotland, UK, April 9, 2005]
by Grégoire Dooms 05 Oct '04

05 Oct '04

Call for Papers Bytecode05 The first Workshop on Bytecode Semantics, Verification, Analysis and Transformation April 9, 2005, Edinburgh, Scotland (co-located with ETAPS'05) www.sci.univr.it/~spoto/Bytecode05 - Aims and Scope of the Workshop: Bytecode, such as produced by e.g. Java and .NET compilers, has become a topic of interest, both for industry and academia. The industrial interest mainly stems from the fact that bytecode is typically used in critical environments, such as the Internet and smart cards. Moreover, an important characteristic for bytecode is that it is device-independent and allows dynamic loading of classes. For researchers that wish to apply formal methods to bytecode, this dynamic nature of bytecode provides an extra challenge. In addition, also the unstructuredness of the code and the pervasive presence of the stack provide extra challenges for the analysis of bytecode. This workshop will focus on the latest developments in the semantics, verification, analysis and transformation of bytecode. Both new theoretical results and tool demonstrations are welcome. - Program Committee: * Frederic Besson, IRISA, France * Etienne Gagnon, Universite du Quebec a Montreal, Canada * Marieke Huisman, INRIA Sophia Antipolis, France * Fausto Spoto, Universita di Verona, Italy (chair) * Don Syme, Microsoft Research, UK - Invited Speaker: * Xavier Leroy, INRIA Rocquencourt & Trusted Logic, France - Important Dates: * December 19, 2004: Paper submissions * January 16, 2005: Notifications to authors * January 30, 2005: Camera-ready * April 9, 2005: Workshop - Paper Submissions: Submissions will be evaluated by the Program Committee for inclusion in the proceedings, which will be available at the time of the workshop. Papers should be no longer than 15 pages. They must contain original contributions, be written in English and be unpublished and not submitted simultaneously for publication elsewhere. They should be submitted electronically, preferably as postscript or PDF files, to fausto.spoto(a)univr.it providing also a text-only abstract, and detailed contact information of the corresponding author. Proceedings are intended to be published in the ENTCS series (information for authors can be found at www1.elsevier.com/gej-ng/31/29/23/show/Products/notes/ENTCS/guide.htt). Thus, adhering to that style already in the submission phase is strongly encouraged. - Venue: The workshop will be held in Edinburgh, Scotland, UK, and co-located with ETAPS'05. _________________________________________________________________________________ mozart-users mailing list mozart-users(a)ps.uni-sb.de http://www.mozart-oz.org/mailman/listinfo/mozart-users You are invited to The 2nd International Mozart/Oz Conference (MOZ 2004) Charleroi, Belgium, Oct. 7-8, 2004 http://www.cetic.be/moz2004

1 0