Re: [pypy-dev] Re: Base Object library (was: stdobjspace status)

Hello Stephan, On Thu, Feb 27, 2003 at 05:23:44PM +0100, Stephan Diehl wrote:
Damn. I don't know. If we don't include any 'long' type among our allowed RPython types (and we probably won't), then we will have to rethink the way code objects are currently kept entirely at the interpreter level. That cannot work if such a code object contains constants of unknown types! So it seems that even code objects must be object-space dependent. They would be compiled by the application-level 'compiler' package inside a given object space, and would not be 'extractable' from there. The interpreter would only unwrap the bytecode string, but not the tuple of constants.
If, on the other hand, we prefer completely portable code objects, then they must contain "portable constant literals", including longs, and wrap() must support at least these constants. That could be nice too, because we could then have object-space-independent bytecode repositories (like .pyc files). But then we must include in RPython the notion of longs -- at least their existence, not necessarily any specific operations on them. The same goes for complex numbers. Tough choice. Armin
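A minimal Python 2-style sketch of the second alternative, assuming invented names (PORTABLE_CONSTANT_TYPES, W_Constant) rather than actual PyPy code: wrap() only has to recognise the portable literal types, it needs no operations on them.

    # Hypothetical sketch: wrap() only has to know that these literal
    # types exist; it does not need any operations on them.
    PORTABLE_CONSTANT_TYPES = (int, long, float, str, unicode, type(None))

    class W_Constant(object):
        """Stand-in for a wrapped object in some object space."""
        def __init__(self, value):
            self.value = value

    def wrap(value):
        if isinstance(value, PORTABLE_CONSTANT_TYPES):
            return W_Constant(value)
        if isinstance(value, tuple):        # e.g. a code object's co_consts
            return W_Constant(tuple([wrap(item) for item in value]))
        raise TypeError("not a portable constant literal: %r" % (value,))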

Hello Armin, On Thursday 27 February 2003 18:01, you wrote:
Maybe RPython was not such a good idea in the first place, then.
Would it help if the bytecode just held the string representation of an object? I guess the lexer passes down the string plus type information, maybe something like ('long', '12345L'). If this were taken as-is into the bytecode, the interpreter could then run the instantiation on the fly. But then the question is whether the ObjSpaces we are talking about could live in application space. The interpreter just requires some object types to be present in order to function. In order to be used as an internal type, the XXXObjects just have to comply with some interface and know how to create themselves from a string (and give the compiler some regexp so they can be recognized). (Hmm, this is probably a little bit off target.) Anyway, this stuff doesn't have to be decided right now.
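A rough sketch of this idea, assuming a hypothetical make_constant() helper and a small registry of constructors (Python 2 style; nothing here is existing PyPy or CPython API):

    # Constants are stored in the bytecode as (typename, literal) pairs and
    # instantiated on the fly the first time they are needed.
    _MAKERS = {
        'int':   int,
        'float': float,
        'long':  lambda s: long(s.rstrip('Ll')),   # drop the trailing L suffix
        'str':   str,
    }

    def make_constant(typename, literal):
        if typename not in _MAKERS:
            raise ValueError("no constructor registered for %r" % typename)
        return _MAKERS[typename](literal)

    # e.g. make_constant('long', '12345L') -> 12345L; the interpreter would
    # then hand the result to the current object space's wrap().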
Cheers Stephan

Hi Armin,
Maybe I'm ignorant -- I haven't read all of the implementation yet -- but why is this a problem? I don't think that RPython has anything to do with Python constants and Python code objects, and of course I think that creating code objects should happen in user space. Then the problem vanishes.
I can't follow. Maybe the physical layout of structures differs depending on the object space, but from the Python view they are all the same. If you marshal them, they are the same. Now think of marshalled code objects, which you unmarshal from the application level. All the marshalled constants will of course be turned into objects in the current application space, however that is implemented. Do we really use code objects from the C-level Python? That was a temporary hack, as I understood it. Of course you can use them as a template, but I'm assuming we would transform them for the "upper" world before executing.
I still see no reason why a long needs to exist in RPython at all. Please give me a hint :-) ciao - chris

[Christian Tismer Thu, Feb 27, 2003 at 08:22:31PM +0100]
Agreed. Creating code objects at application-level needs some tweaks, though (see my other post, same thread).
Marshaled (pypy-) code objects would need to serialize objectspace-dependent types. Maybe we should require all objectspaces to implement interoperable serialization. This would also provide a means to transfer objects from one objectspace to another (and to CPython). However, I am a bit uneasy about tying pypy too deeply to CPython's code-object layout. We are likely to want to extend/modify it in the future anyway. So maybe we shouldn't take CPython compatibility down to the code object. It's the "heart of gold", produced by the compiler and driving the interpreter. These are areas where we want maximum flexibility and thus shouldn't think too much in terms of CPython's code object, which really is more of an implementation detail than anything else. just my 2 cents, holger
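A sketch of what such an interoperable-serialization requirement could look like, assuming invented method names (serialize/deserialize) and a (typename, text) wire format; this is not an existing PyPy interface:

    class InteropSerialization(object):
        """Mixin for an object space; assumes the space has wrap()/unwrap()."""

        # plain builders for a handful of portable constant types
        _builders = {'int': int, 'long': long, 'float': float, 'str': str}

        def serialize(self, w_obj):
            # space-independent (typename, text) form of a wrapped object
            value = self.unwrap(w_obj)
            return (type(value).__name__, str(value))

        def deserialize(self, pair):
            # rebuild a wrapped object in *this* space from (typename, text)
            typename, text = pair
            return self.wrap(self._builders[typename](text))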

holger krekel wrote: [tismer]
Saw that post -- sure, I do think the same. ...
Marshaled (pypy-) code objects would need to serialize objectspace dependent types.
No, I don't think so. Why should they serialize objectspace-dependent types? All the necessary types are available through userspace, however they are implemented in objectspace. So they should serialize their userspace equivalents, nothing else. Exactly that can be recreated from userspace, no matter what the objectspace looks like.
The common external interface is how objects are pickled/marshalled. This is certainly independent of the implementation, and it is also the reason why I took it as an example to clarify matters.
If we want to borrow code objects and run them in our interpreter, as stated from the beginning, we have to be able to swallow them. That is an absolute must. Requiring compatibility with our objectspace is an absolute must-not. We need to read the stored structure and create our equivalent object, whatever that is.
Please have a look at the code object from the implementation's point of view. There is nothing specific in it that should not be in our implementation. If we have trouble marshalling our code objects because they are object-space dependent, then we have a very wrong concept of object space. "Object space" is an implementation detail, IMHO. Python's code serialization does not have this problem; why should we? ciao - chris

holger krekel wrote: ...
I think this makes very much sense, as long as we stick with the opcodes as well. It is possible to change that later, and there are situations where we compile everything away and don't have such a thing as code objects at all. But for now, having them compatible is essential.
There is one simple interface: the fields of the code object that get marshalled. This is not necessarily related to the internal layout of the object. Think of how CORBA marshals objects so that you can transfer them between, say, C and Java.
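For illustration, a possible way to read that interface in Python 2, using CPython's co_* field names; externalize()/internalize() are invented helpers, not existing code:

    CODE_FIELDS = ('argcount', 'nlocals', 'stacksize', 'flags', 'code',
                   'consts', 'names', 'varnames', 'freevars', 'cellvars',
                   'filename', 'name', 'firstlineno', 'lnotab')

    def externalize(code_obj):
        # collect the marshalled fields from any code-object implementation
        return dict([(field, getattr(code_obj, 'co_' + field))
                     for field in CODE_FIELDS])

    def internalize(make_code, fields):
        # rebuild *our* code object, whatever its internal layout, from fields
        return make_code(**fields)

    # e.g. externalize(compile('a + b', '<example>', 'eval')) gives a plain
    # dict that any object space could turn back into its own code object.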
I think it is worthwhile to be able to run a .pyc file like in CPython, at least until we can compile everything on our own. I agree that we might do lots of extensions later, but why should it be a problem to produce a compatible format for the moment? I don't see that we lose anything. As said, code objects are like bytecode: they have a given implementation and a defined external representation. How we transform them internally is a completely different story. ciao - chris

[stephan]
It's the compiler and not the parser that turns "12345L" into a Python object. It executes "eval" on atom_number and atom_string symbols (see compiler/transformer.py in the standard lib). [armin]
In CPython, the compiler package just uses the C-level "eval" to implicitly get ready-made "constant" Python objects. Thus it avoids the problem of computing the constants by effectively handing it off to its C equivalent, compile.c. If we want to have a pure-Python compiler, then we need to break the cycle (computing a constant requires computing a constant). So for PyPy, the compiler package cannot use eval. It should do a try/except dance, trying to apply "int", "float", "long" to atom_numbers and "str", "unicode" to atom_strings. These are builtin types we have to implement anyway, and they will dispatch to the appropriate objectspace internally. Thus the compiler could steer clear of infinite recursion and provide already-wrapped "constant" objects. So all in all, the compiler package is the right place to fix the "interpreter doesn't want to know the types of constants" problem. Maybe the fix sketched above might even find its way into Python 2.3, as it also speeds up and un-tricks the compiling process :-) greetings, holger
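A rough Python 2 sketch of that try/except dance; build_number and build_string are assumed helper names, not the actual compiler-package code, and radix prefixes, raw strings and triple quotes are ignored for brevity:

    def build_number(text):
        # '42' -> 42, '1.5' -> 1.5, '12345L' -> 12345L, all without eval()
        for convert in (int, float):
            try:
                return convert(text)
            except ValueError:
                pass
        if text and text[-1] in 'lL':
            return long(text[:-1])
        raise SyntaxError("invalid numeric literal: %r" % (text,))

    def build_string(text):
        # strip the quotes and decode escapes for simple literals only
        if text[:1] in ('u', 'U'):
            return unicode(text[2:-1], 'unicode_escape')
        return text[1:-1].decode('string_escape')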

participants (4):
- Armin Rigo
- Christian Tismer
- holger krekel
- Stephan Diehl