
My long posts were intended, in part, to expose my assumptions for correction if needed. Here are what I conceive to be the key questions about psyco:

1. How often and under what circumstances does psyco_compatible get called? My _guess_ is that it gets called once per every invocation of every "psycotic" function (function optimized by psyco). Is this correct?

2. True or false: the call to psyco_compatible would be equivalent to runtime code that discovers special values of certain particular variables.

3. True or false: adding more state information to psyco (in order to discover more runtime values) will slow down psyco_compatible.

4. Are these the most important questions to ask about psyco? If not, what _are_ the key questions?

Thanks very much.

Edward
--------------------------------------------------------------------
Edward K. Ream   email: edream@tds.net
Leo: Literate Editor with Outlines
Leo: http://personalpages.tds.net/~edream/front.html
--------------------------------------------------------------------

Hello Edward,

On Fri, Jan 17, 2003 at 01:08:28PM -0600, Edward K. Ream wrote:
No: psyco_compatible() is only called at compile-time. When a "psycotic" function is called by regular Python code, we just jump to machine code that starts at the beginning of the function with no particular assumption about the arguments; it just receives PyObject* pointers. Only when something more about a given argument is needed (say its type) will this extra information be asked for. The corresponding machine code is very fast in the common case: it loads the type, compares it with the most common type found at this place, and if it matches, runs on. So in the common case, we only have one type check per needed argument.

Given

    def my_function(a, b, c):
        return a + b + c

the emitted machine code looks like what you would obtain by compiling this:

    PyObject* my_function(PyObject* a, PyObject* b, PyObject* c)
    {
        int r1, r2, r3;
        if (a->ob_type != &PyInt_Type) goto uncommon_case;
        if (b->ob_type != &PyInt_Type) goto uncommon_case;
        if (c->ob_type != &PyInt_Type) goto uncommon_case;
        r1 = ((PyIntObject*) a)->ob_ival;
        r2 = ((PyIntObject*) b)->ob_ival;
        r3 = ((PyIntObject*) c)->ob_ival;
        return PyInt_FromLong(r1+r2+r3);
    }

Only when a new, not-already-seen type appears does it follow the "uncommon_case" branch. This triggers more compilation, i.e. emission of more machine code. During this emission, we make numerous calls to psyco_compatible() to see if we have reached a state that we have already seen, one that thus corresponds to already-emitted machine code; if it does, we emit a jump to this old code. This is the purpose of psyco_compatible().

I must mention that in the above example, the nice-looking C version is only arrived at after several steps of execution mixed with further compilation. The first version is:

    PyObject* my_function(PyObject* a, PyObject* b, PyObject* c)
    {
        goto uncommon_case;   /* need a->ob_type */
    }

Then when the function is first called with an integer in 'a', it becomes:

    PyObject* my_function(PyObject* a, PyObject* b, PyObject* c)
    {
        if (a->ob_type != &PyInt_Type) goto uncommon_case;
        goto uncommon_case;   /* need b->ob_type */
    }

and so on.
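[Editor's note: to make the role of psyco_compatible() concrete, here is a toy pure-Python model of the bookkeeping described above. This is not Psyco's actual code; every name in it is made up for illustration. Specialized versions are kept in a cache keyed by the "state" seen so far (here, just the argument types), and a new version is only built when no compatible one exists.]

    # Toy model of specialization-by-state: a cache of per-type versions.
    # All names here are hypothetical; real Psyco emits machine code,
    # not Python closures.
    _versions = {}

    def toy_compatible(func, state):
        """Return an already-built version for this state, or None."""
        return _versions.get((func, state))

    def toy_specialize(func):
        def dispatcher(*args):
            state = tuple(type(a) for a in args)      # the "discovered" info
            version = toy_compatible(func, state)
            if version is None:                       # the uncommon_case branch
                # "Compile" a new version for this state (here: just record it).
                version = func
                _versions[(func, state)] = version
            return version(*args)
        return dispatcher

    def my_function(a, b, c):
        return a + b + c

    my_function = toy_specialize(my_function)
    print(my_function(1, 2, 3))        # builds the (int, int, int) version
    print(my_function(4, 5, 6))        # reuses it: no new "compilation"
    print(my_function('x', 'y', 'z'))  # new state: builds a second version

The only point of the model is the bookkeeping: the lookup happens when a new state is encountered (compile-time), not on every call of already-specialized code.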
2. True or false: the call to psyco_compatible would be equivalent to runtime code that discovers special values of certain particular variables.
See above.
3. True or false: adding more state information to psyco (in order to discover more runtime values) will slow down psyco_compatible.
This is true. The more run-time values you want to "discover" (I say "promote to compile-time"), the more versions of the same code you will get, and the slower psyco_compatible() will be (further slowing down compilation, but not execution proper, as seen above).
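[Editor's note: a back-of-the-envelope illustration of that growth, with purely hypothetical numbers and none of Psyco's real data structures: if each of k promoted values has been observed with n different possibilities, up to n**k specialized versions can exist, and a psyco_compatible()-style lookup has that many candidates to consider.]

    # Count how many specialized versions can accumulate when several
    # run-time values are promoted to compile-time.  Illustrative only.
    from itertools import product

    observed = [int, float, str]   # possibilities seen for each promoted value
    k = 3                          # number of values promoted to compile-time

    versions = {key: "<code specialized for %r>" % (key,)
                for key in product(observed, repeat=k)}

    print(len(versions))           # 27 == 3**3: growth is exponential in k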
4. Are these the most important questions to ask about psyco? If not, what _are_ the key questions?
Hard to say! I like to mention the "lazy" values ("virtual-time"). These are the key to high-level optimizations in Psyco. In the above example you might have noticed that the Python interpreter must build and free an intermediate integer object for "a+b" when computing "a+b+c", while the C version I showed does not. Psyco does this by considering the intermediate PyObject* pointer as lazy. As long as it is not needed, no call to PyInt_FromLong() is written; only the value "r1+r2" is computed.

Similarly, in "a+b", if both operands are strings, the result is a lazy string which is implemented as a lazy list "[a,b]". Concatenating more strings turns the list into a real Python list, but the resulting string itself is still lazy. This is how Psyco ends up automatically translating things like

    s = ''
    for t in xxx:
        s += t

into something like

    lst = []
    for t in xxx:
        lst.append(t)
    s = ''.join(lst)

I hope that these examples cast some light on Psyco. I realize that this could distract people from the current goals of this project, and I apologize for that. We should discuss e.g. "how restricted" the language we use for Python-in-Python should be...

A bientot,

Armin.
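[Editor's note: for readers who want to play with the "lazy string" idea, here is a toy pure-Python sketch of it. This is not how Psyco implements virtual-time values internally, only the shape of the optimization: concatenation merely appends to a list, and the real string is built once, when the value is actually needed.]

    # A toy "lazy string": '+' stays virtual; forcing it does one ''.join().
    class LazyStr:
        def __init__(self, pieces=None):
            self.pieces = pieces if pieces is not None else []

        def __add__(self, other):
            # No new string object is built here, only a bigger list.
            extra = other.pieces if isinstance(other, LazyStr) else [other]
            return LazyStr(self.pieces + extra)

        def force(self):
            # The only place a real Python string is created.
            return ''.join(self.pieces)

    s = LazyStr()
    for t in ['a', 'b', 'c']:
        s = s + t
    print(s.force())   # 'abc', built with a single join at the end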

From: "Armin Rigo" <arigo@tunes.org>
at least for method dispatching/lookup, some kind of sampling in the spirit of polymorphic inline caches can help distinguish whether there are one to a few relevant cases worth specializing for, or whether implementing dispatch with a simple monomorphic inline cache comes at a more reasonable price (especially in space).
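[Editor's note: as a rough illustration of what a monomorphic inline cache buys at a call site, here is a toy Python sketch, not anyone's proposed implementation. The cache remembers the single receiver type last seen at that site and the method it resolved to; the slow path only runs when the type changes.]

    # Toy monomorphic inline cache for one call site.
    class CallSiteCache:
        def __init__(self, method_name):
            self.method_name = method_name
            self.cached_type = None
            self.cached_method = None

        def call(self, receiver, *args):
            if type(receiver) is self.cached_type:      # fast path: one type check
                return self.cached_method(receiver, *args)
            # Slow path: full lookup, then re-prime the cache with this type.
            method = getattr(type(receiver), self.method_name)
            self.cached_type = type(receiver)
            self.cached_method = method
            return method(receiver, *args)

    site = CallSiteCache('upper')
    print(site.call('abc'))    # slow path once, caches str.upper
    print(site.call('def'))    # fast path: same receiver type as last time

A polymorphic inline cache would keep a short list of (type, method) pairs instead of a single entry; sampling which cases actually occur tells you whether the extra space is worth it.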

Armin Rigo wrote: ...
[snipped all the good rest]

Just a little comment. The above is what I like so much about the Psyco ideas. Now consider the huge eval_code function, with its specializations in order to make operations on integers very fast, for example. With Psyco, these are no longer necessary, since Psyco will find them by itself and create code like the above on its own.

As another point, when re-implementing the Python core objects in Python, there are many internal functions which are called only by the interpreter. The datatypes passed to those functions will almost always be the same, and since the functions aren't exposed otherwise, the first time they are called will create their final version, and the uncommon_case can be dropped completely. We just need to "seed" them with appropriate primitive data types, and the whole rest can be deduced with ease. That's what I eagerly want to try and to see happen :-)
Sorry, I couldn't resist it. I will start to ask some questions in a different thread.

ciao - chris
--
Christian Tismer :^) <mailto:tismer@tismer.com>
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/
14109 Berlin : PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/

Any interest in getting rid of eval() altogether?

phone:306.653.4747 fax:306.653.4774 http://www.zu.com

I mean the builtin. I didn't realise there was an eval() in C (I thought the function name(s) was different). On Monday, January 20, 2003, at 01:29 PM, holger krekel wrote:
-- Nathan Heagy phone:306.653.4747 fax:306.653.4774 http://www.zu.com

The only reason I bring it up, and I'm not really in any position to be bringing things up, is that it seems to me that this dynamic metacompiling stuff could be a real pain for making things fast and optimized. Most fast languages don't have eval() and that may be part of the reason they are fast. I'm sure Guido could jump in with a wonderful reason why eval() is great but I think if it disappeared no one would miss it, especially if it let Python compile to machine code. In fact if that was the price of enabling Python to compile to machine code I *guarantee* no one would miss it. On Monday, January 20, 2003, at 02:03 PM, holger krekel wrote:
-- Nathan Heagy phone:306.653.4747 fax:306.653.4774 http://www.zu.com

Hello Armin,
Many thanks for this most interesting and informative reply. It clears up a lot of my questions. I feel much more free to focus on the big picture.
Yes. I have been focusing on an "accounting" question: how often does the compiler run? If the compiler starts from scratch every time a program is run, then I gather from your example that the compiler will be called once for every type of every argument for every executed function _every time the program runs_. Perhaps you are assuming that the gains from compiling will be so large that it doesn't matter how often the compiler runs.

Yesterday I realized that it _doesn't matter_ whether this assumption is true or not. Indeed, suppose that we expand the notion of what "byte code" is to include information generated by the compiler: compiled machine code, statistics, requests for further optimizations, whatever. The compiler could rewrite the byte code in order to avoid work the next time the program runs. Now the compiler runs less often: using exactly the same scheme as before, the compiler will run once for every type of every argument for every executed function _every time the source code changes_. This means that no matter how slowly the compiler runs, the _amortized_ runtime cost of the compiler can be made asymptotically zero! This is an important theoretical result: the project can never fail due to the cost of compilation.

This result might also allow us to expand our notion of what is possible in the implementation. You are free to consider any kind of algorithm at all, no matter how expensive. For instance, there was some discussion on another thread of a minimal VM for portability. Maybe that vm could be the intermediate code list of gcc? If compilation speed isn't important, the "compiler" would simply be the front end for gcc. We would only need to modify the actual emitters in gcc to output to the "byte code". We get all the good work of the gcc code generators for free. Retargeting psyco would be trivial.

These are not proposals for implementation, and certainly not requests that you modify what you are planning to do in any way. Rather, they are "safety proofs" that we need not be concerned about compilation speed _at all_, provided that you (or rather Guido) is willing to expand the notion of the "byte code". This could be done whenever convenient, or never. The point is that my worries about the cost of compilation were unfounded. Compilation cost can never be a "gotcha"; a pressure-relief valve is always available. Perhaps this has always been obvious to you; it wasn't at all clear to me until yesterday.

Edward
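[Editor's note: a very rough sketch of the amortization argument. Nothing here is the proposed "extended byte code" format; the file name and helper names are invented. The expensive work is keyed by a hash of the source, saved to disk, and reloaded on later runs, so it is redone only when the source changes.]

    # Persisting "compilation" results across runs, keyed by source hash.
    # Illustrative only: real extended byte code would also store machine
    # code, statistics, pending optimization requests, and so on.
    import hashlib, marshal, os

    CACHE_FILE = 'extended_bytecode.cache'

    def load_cache():
        if os.path.exists(CACHE_FILE):
            with open(CACHE_FILE, 'rb') as f:
                return marshal.load(f)
        return {}

    def expensive_compile(source):
        print('compiling...')     # only printed when the source has changed
        return compile(source, '<cached>', 'exec')

    def get_code(source, cache):
        key = hashlib.sha1(source.encode()).hexdigest()
        if key not in cache:                  # first run, or the source changed
            cache[key] = expensive_compile(source)
        return cache[key]

    cache = load_cache()
    exec(get_code("x = 6 * 7\nprint(x)", cache))
    with open(CACHE_FILE, 'wb') as f:
        marshal.dump(cache, f)                # marshal handles code objects

Run this twice: the second run loads the cached code object and never calls expensive_compile(), which is the whole point of making the compilation cost amortize toward zero.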

At 22:40 2003-01-18 -0600, Edward K. Ream wrote:
Well, if most everything is written in Python, with all the libraries etc., I think there still is some accounting to do. I.e., library modules won't change much after having been exercised a bit. Will this keep updating cached info in .pyc (or maybe new .pyp) files? (BTW, IWT that makes for eventual permissions issues, if it's shared libraries. Do you just get per-use caches, etc.?) IMO there has to be some way of not rebuilding the world a lot, even if it's fast ;-)

I'm also picking this place to re-introduce the related "checkpointing" idea, namely some call into a builtin that can act like a yield and save all the state that the compiler/psyco etc. have worked up. Perhaps some kind of .pyk (for "python checkpoint") file that could resume from where the checkpoint call was. I believe it can be done if the interpreter stack(s) is/are able to be encapsulated, a little restart info can be stored statically, and the machine stack can unwind totally out of main and the C runtime exit, so that coming back into C main everything can be picked up again.

I don't want to belabor it, but just mention it as something to consider, in case it becomes easy when you are redesigning the VM and its environment. Obviously there have to be some restrictions on state, but I think if we could wind up with fast-load application images in the future because you kept this in mind in the beginning, it could be a benefit.

Regards,
Bengt Richter

BTW[OT], sorry about the justification. Now I can't get it back without retyping or writing a re-spacing ragged wrapper.

Good question. Here is something I sent privately to Christian:

[starts]

To run a program: [using .pyp files]

1. Load the byte code, performing any queued requests for optimizations using stored data. In general, there will be no such requests after the first few runs. Obviously, changing the Python source code throws out some or all of this intermediate data.

2. Run the code, doing some optimizations immediately, possibly requesting other optimizations to be done later, and storing any useful data in the "extended byte code". At first the code will be interpreted/jit'ed. After a few runs there will be nothing left but globally optimized machine code. It doesn't get any better than this. In short, the code gets faster the more it is executed.

This approach should make bootstrapping easy. You could even start with an interp that just queues up optimization requests for the next load time!

[ends]

So I am thinking that the only time this machine code in the byte code (.pyp file) changes is if the jit/interpreter/psyco/whatever-you-call-it sees an object with a type that it has never seen before. After a very few runs (of any particular .pyp file) the cached data becomes nothing but machine code (with branches to uncommon_case that never get taken). I see the situation as being similar to a peephole optimizer: it takes only 2 or 3 iterations to perform all possible optimizations. Since the Python code of libraries and other "system" code never changes (or hardly ever changes), we should be ok.

Edward
--------------------------------------------------------------------
Edward K. Ream   email: edream@tds.net
Leo: Literate Editor with Outlines
Leo: http://personalpages.tds.net/~edream/front.html
--------------------------------------------------------------------

Hello Bengt,

On Sat, Jan 18, 2003 at 11:25:52PM -0800, Bengt Richter wrote:
This could probably be done without Psyco, and would certainly be a nice thing to have. Note that a good Psyco could remove any need for it: most initialization code could theoretically be specialized into something that just creates the necessary data structures without executing any code at all. Sometimes I like to point out that if our OSes were written in a high-level language with built-in specializers, they would boot in no more than the time it takes to do the actual I/O that occurs when booting (mainly displaying the login screen and waiting for mouse, keyboard and network input) --- everything else is internal state and can be done lazily.

A bientot,

Armin.

Hello Armin,

At 10:50 2003-01-19 -0800, Armin Rigo wrote:
There seems to be something I missed. Could you clarify how such specialized versions persist so they don't have to be redone? I.e., how do you get from an original .py source-only representation to the specialized form, and how does the latter come to exist? I.e., is this a new form of incrementally updated .pyc?
If this means dynamic incremental revisions of system files, it must be a whole new class of security issues to nail down, or am I misconstruing?

Regards,
Bengt Richter

Hello Bengt,

On Sun, Jan 19, 2003 at 04:26:24PM -0800, Bengt Richter wrote:
Yes, you must have this data persist somewhere. In a .pyc-like file or any variant of the idea (like having one "global" database as Edward proposed, which would be user-specific to avoid security issues).
Yes and no. There are tons of issues that must be carefully planned for such a thing to be possible and secure, and it is probably not possible in a Unix-style OS (which is essentially C). I'll just drop the link http://tunes.org as an example of what I mean by re-planning an OS. A bientôt, Armin.

Hello Edward,

On Sat, Jan 18, 2003 at 10:40:33PM -0600, Edward K. Ream wrote:
This is an important theoretical result: the project can never fail due to the cost of compilation.
Yes, with good accounting algorithms we can save and restore some of the already-done work. The current Psyco is far from being able to do this cleanly, so I never really thought about it in depth, but it is certainly a desirable feature for a cleaner Psyco (like what I want to do in this project). In all cases I still think that there is some use for a fast-and-dirty compiler mode; for example, when compiling dynamically-constructed code that will change all the time.
Maybe that vm could be the intermediate code list of gcc? If compilation speed isn't important, the "compiler" would simply be the front end for gcc.
I am a bit afraid of what would have to be done to interface the "core" of GCC with Psyco, but this is mainly because I never dug too deeply into GCC. I am sure it can be done, and it would certainly be a great thing. I am sure that your experience in this domain would be most profitable :-)
provided that you (or rather Guido) is willing to expand the notion of the "byte code".
Yes, I think that we should try to unify the idea of function object vs code object in Python with other similar ideas used internally in CPython. For example, built-in function objects have a pointer to a PyMethodDef structure --- we find again the distinction between the "callable front-end" object and the "implementation-description" structure. Ideally, we should have only one all-purpose "function" object type, which holds things like argument names and default values, and any number of "implementation" object types, of which Python code objects would be one example, and PyMethodDef-like objects another. This would let us add other ways to implement functions, like Psyco-emitted machine code objects.

Sometimes I wonder whether I should raise the question in python-dev. It seems to me that it helps in various places, e.g. in the help() mechanism which currently cannot guess the argument list for built-in functions. Well, I cannot see how to make it 100% compatible with existing code...

Armin
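[Editor's note: a small pure-Python sketch of the split Armin describes. This is only an illustration of the idea, not CPython internals or a concrete proposal, and all class and attribute names are invented: one front-end "function" object carries the signature metadata, while the implementation behind it is a separate, swappable object.]

    # One callable front-end, many interchangeable implementations.
    class Implementation:
        def invoke(self, args):
            raise NotImplementedError

    class PythonCodeImpl(Implementation):
        def __init__(self, pyfunc):
            self.pyfunc = pyfunc              # stands in for a code object
        def invoke(self, args):
            return self.pyfunc(*args)

    class SpecializedImpl(Implementation):
        """Stands in for e.g. Psyco-emitted machine code for one type state."""
        def __init__(self, fast_path, fallback):
            self.fast_path = fast_path
            self.fallback = fallback
        def invoke(self, args):
            if all(isinstance(a, int) for a in args):   # toy guard
                return self.fast_path(*args)
            return self.fallback.invoke(args)

    class Function:
        def __init__(self, name, argnames, impl):
            self.name = name
            self.argnames = argnames          # help() could use this
            self.impl = impl                  # swappable at any time
        def __call__(self, *args):
            return self.impl.invoke(args)

    generic = PythonCodeImpl(lambda a, b, c: a + b + c)
    f = Function('my_function', ('a', 'b', 'c'), generic)
    print(f(1, 2, 3))
    f.impl = SpecializedImpl(lambda a, b, c: a + b + c, generic)
    print(f(1, 2, 3))          # now goes through the "specialized" path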

At 10:50 2003-01-19 -0800, Armin Rigo wrote:
[...]
ISTM there is a general concept of dynamic representation management (Python can give DRM a new meaning ;-) coming out of the mist. In C, type vs representation is almost 1:1 (i.e., type names identify memory layouts with bits and words etc.), but with Python and psyco there are multiple ways of physically representing the same abstract entity. I'd like to push for separating the concepts of type and representation better in discussion. What I'm getting at is separating "representation-type" from "abstraction-type". E.g., a Python object pointer in C may implicitly encode an abstract tuple of (type, id, value) or

    class PyPtr:
        __slots__ = ['oType', 'oId', 'oValue']

and we can discuss separately how to pack the info of the abstraction into a 32-bit word with huffman tricks and addressing of type-implying allocation arenas etc., or whatever.

But maybe there's another level. I'm wondering whether the most primitive object representation should have a slot for an indication of what kind of representation is being used. E.g.,

    class Primo:
        __slots__ = [
            'abstraction_type',    # might say integer, but not int vs long vs bignum
            'entity_id',           # identifies abstract instance being represented
            'representation_type', # implies a representation_interpreter, maybe CPU
            'representation_data'  # suitable for representation_interpreter to find it
        ]

In other words, multiple Primo instances with the same entity_id could be specifying multiple abstractly equivalent but concretely different representations of the same object, e.g., a particular integer being represented in various ways, maybe even as machine code to move a particular representation from one place to another. Entity_id might be encoded as a chain of pointers through sibling Primo instances representing the same abstract instance entity.

I think there can also be representation_types that are partial representations, something like a database view, or a C++ pointer cast to refer to data members of a base class part of an instance representation. This brings up relationships of multiple representations when they diverge from 1:1 representations of the full abstract info. Any full and valid representation is abstractly equivalent to another, but if one representation is updated, siblings must be invalidated or re-validated (some "view" might not be affected, another representation might be easy and worthwhile to update, like a small part of a complex object, but others might be cheaper to mark for disposal or lazy update).

ISTM Python involves multiple concrete representations of types while also trying to unify the abstract aspects, and Psyco only adds to the need for a clear way to speak of Dynamic Representation Management (not Digital Rights Management ;-) issues. I hope I have triggered some useful thoughts, even though so far I only know of Psyco indirectly from these discussions (will try to correct that sometime soon ;-)

Regards,
Bengt Richter
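[Editor's note: to make Bengt's sketch a bit more tangible, here is a hypothetical use of the Primo class; the representation-type names and data layouts are invented for illustration. Two instances share an entity_id, so they stand for the same abstract integer, but one carries a boxed-object representation and the other a raw machine-word representation.]

    # Hypothetical: two concretely different representations of one abstract entity.
    class Primo:
        __slots__ = ['abstraction_type', 'entity_id',
                     'representation_type', 'representation_data']
        def __init__(self, abstraction_type, entity_id,
                     representation_type, representation_data):
            self.abstraction_type = abstraction_type
            self.entity_id = entity_id
            self.representation_type = representation_type
            self.representation_data = representation_data

    boxed = Primo('integer', entity_id=42,
                  representation_type='pyobject', representation_data=1234)
    raw   = Primo('integer', entity_id=42,
                  representation_type='machine_word', representation_data=0x4D2)

    # Abstractly equivalent: same entity, same value, different concrete form.
    assert boxed.entity_id == raw.entity_id
    assert boxed.representation_data == raw.representation_data == 1234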

Hello Bengt,

On Sun, Jan 19, 2003 at 04:02:05PM -0800, Bengt Richter wrote:
Yes, I also think it is an important point. I'm not sure we should already tackle the issue of having multiple representations of the *same* object, though. I was rather thinking about having several available implementations, but each object only implemented with one of them at a time. Occasionally switching to another representation is the next step. Managing several concurrent representations is yet another, more difficult step I guess :-) A bientôt, Armin.

participants (9):
- Armin Rigo
- Bengt Richter
- Christian Tismer
- David Ascher
- Edward K. Ream
- holger krekel
- Nathan Heagy
- Samuele Pedroni
- tanzer@swing.co.at