Re: [pypy-dev] Objects and types in the stdobjspace
Armin Rigo <arigo@tunes.org> wrote:
On Sat, Jun 07, 2003 at 08:51:51AM +0200, holger krekel wrote:
- how the new test-machinery works so i can make sure that i don't break big stuff while trying to fix/play around
pypy/testall.py runs all the tests, using the object space specified in the environment variable OBJSPACE -- which must be exactly pypy.objspace.std.StdObjSpace to use the standard object space; the trivial object space is used otherwise.
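The OBJSPACE selection rule described above can be sketched in a few lines. This is an illustrative helper, not the actual testall.py code; the function name and return values are my own:

```python
import os

# Sketch of the OBJSPACE selection rule: the standard object space is
# used only on an exact match, the trivial object space otherwise.
def choose_objspace(env=None):
    env = os.environ if env is None else env
    if env.get("OBJSPACE") == "pypy.objspace.std.StdObjSpace":
        return "std"       # exact match -> standard object space
    return "trivial"       # anything else -> trivial object space
```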
pypy/testwice.py is a hack around the previous one to run all the tests in both object spaces.
("previous" being the pypy/testall.py file, not any pre-sprint test hook.) ... and pypy/testcts.py is a rather clever piece of code to run tests on StdObjSpace, keep track of the results, and report any difference from run to run.
The individual .../test/test_xxx.py files are still runnable. There is a testsupport.py file in all test directories for glueing purposes; running it directly should execute all the tests in that directory.
I've been playing around with it over the weekend, and I feel things (testing-wise) need to be clarified slightly. First, I think we should move all pypy-as-a-whole testing-related stuff into an appropriate subdirectory ("pypy/testing" or some such). As it is now, pieces are scattered in pypy and pypy/interpreter. Secondly, the testing framework as written will only run tests at the interpreter level. Some tests (such as test_exceptcomp and test_exec) should be run at the application level. Additionally, there is no provision for running the CPython regression tests automatically within the current testing framework. It occurred to me that it might be easiest to "appropriate" the CPython regrtest.py framework for our purposes. As best I can tell, it's written in a general fashion already, so "all" we would need to do is get it to recognize when to run tests at the interpreter level and when to run them at the application level. (Which could potentially be accomplished by a naming convention.)
- any "entry points" other than interactive.py?
There is main.py in the same directory. Its purpose is that 'python main.py script-and-options' should be the same as 'pypy script-and-options' if we had a working 'pypy' program. I guess that main.py should invoke interactive.py when started with no argument (it doesn't right now).
Other problems, I assume (from code inspection, not from direct running), are that the command-line arguments are not passed to sys.argv, sys.path may or may not contain the directory of the script, and sys.modules['__main__'] is not set to the script module (only a problem, I think, when running unittest.main()). And in keeping with my rearranging mood, I think main.py should probably be placed in the pypy/ directory. -- I also think someone should give me <Dr. Evil> ONE BILLION DOLLARS </Dr. Evil> and melons should taste more like currants, so that shows you what *I* know.
- a recap of the current StdObjSpace registration/multimethod mechanisms.
The W_XxxObject classes of the standard object space are, precisely, *implementations* of objects. This is not the same as the *type* of the object. It is possible to provide several implementations for the same type (the user should not see the difference, though). Typical uses of this would be to hide the int/long distinction from the user altogether, or more interestingly to provide more efficient versions of data structures like string, list or dict when they become large.
FANTASTIC! An excellent way of looking at things. But I'm still a little hazy. Could you walk us through how this scheme would work with multiple implementations of the same type, with specifics as to how this differs from the current CPython approach, where the type is the same as the implementation? Congrats to the spriters - it looks like quite a few good things have been accomplished.

-Rocco
roccomoretti@netscape.net (Rocco Moretti) writes:
Armin Rigo <arigo@tunes.org> wrote:
On Sat, Jun 07, 2003 at 08:51:51AM +0200, holger krekel wrote:
- how the new test-machinery works so i can make sure that i don't break big stuff while trying to fix/play around
pypy/testall.py runs all the tests, using the object space specified in the environment variable OBJSPACE -- which must be exactly pypy.objspace.std.StdObjSpace to use the standard object space; the trivial object space is used otherwise.
pypy/testwice.py is a hack around the previous one to run all the tests in both object spaces.
("previous" being the pypy/testall.py file, not any pre-sprint test hook.)
... and pypy/testcts.py is a rather clever piece of code to run tests on StdObjSpace, keep track of the results, and report any difference from run to run.
'clever' is a strange way of spelling 'unfinished' :-) Ideally, this would run after every checkin (or so) and hassle the checker-in (or this list) if any tests broke. It would be nice to get this in place before or early on in the next sprint.
The individual .../test/test_xxx.py files are still runnable. There is a testsupport.py file in all test directories for glueing purposes; running it directly should execute all the tests in that directory.
I've been playing around with it over the weekend, and I feel things (testing-wise) need to be clarified slightly. First, I think we should move all pypy-as-a-whole testing-related stuff into an appropriate subdirectory ("pypy/testing" or some such). As it is now, pieces are scattered in pypy and pypy/interpreter.
Probably, yes.
Secondly, the testing framework as written will only run tests at the interpreter level. Some tests (such as test_exceptcomp and test_exec) should be run at the application level.
Yes. I don't see how to easily and gracefully allow this.
Additionally, there is no provision for running the CPython regression tests automatically within the current testing framework.
It occurred to me that it might be easiest to "appropriate" the CPython regrtest.py framework for our purposes. As best I can tell, it's written in a general fashion already, so "all" we would need to do is get it to recognize when to run tests at the interpreter level and when to run them at the application level. (Which could potentially be accomplished by a naming convention.)
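The naming convention floated above could look something like the sketch below. The `test_app_` prefix and the classifier itself are purely my assumption, to illustrate the idea; they are not anything regrtest.py or the PyPy tree actually does:

```python
import os

# Hypothetical convention: files named test_app_*.py run at application
# level, everything else runs at interpreter level.
def classify_test(path):
    name = os.path.basename(path)
    if name.startswith("test_app_"):
        return "application"
    return "interpreter"
```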
Hmm, maybe. regrtest has become somewhat crufty over the years, I don't think we need all of it. In a way, I'd prefer to keep things within the unittest framework if possible (although working closely with it for the first time has made its Java heritage unpleasantly clear...). We also need a LOT more tests!
- any "entry points" other than interactive.py?
There is main.py in the same directory. Its purpose is that 'python main.py script-and-options' should be the same as 'pypy script-and-options' if we had a working 'pypy' program. I guess that main.py should invoke interactive.py when started with no argument (it doesn't right now).
Other problems, I assume (from code inspection, not from direct running), are that the command-line arguments are not passed to sys.argv, sys.path may or may not contain the directory of the script, and sys.modules['__main__'] is not set to the script module (only a problem, I think, when running unittest.main()).
Yes. It's very much unfinished -- and actually it would make a good project for someone who doesn't get the ins and outs of the standard object space to move this further to completion...
And in keeping with my rearranging mood, I think main.py should probably be placed in the pypy/ directory.
Hmm. Maybe, or maybe the executable that people actually run will look like this:

    #!/usr/bin/python
    from pypy.interpreter import main
    main.main()

[...]
Congrats to the spriters - it looks like quite a few good things have been accomplished.
Thanks, I think we really moved things forward. Are you able to come to the next sprint?

Cheers,
M.

--
48. The best book on programming for the layman is "Alice in Wonderland"; but that's because it's the best book on anything for the layman.
    -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html
Hello Rocco,

On Sun, Jun 08, 2003 at 11:10:52PM -0400, Rocco Moretti wrote:
But I'm still a little hazy. Could you walk us through how this scheme would work with multiple implementations of the same type, with specifics as to how this differs from the current CPython approach, where the type is the same as the implementation?
Ok. In CPython the type completely defines the implementation, via the PyTypeObject structure (tp_size and all the function pointers). All objects have a type pointer (ob_type), which is a very fast way to retrieve both the type of the object and its implementation (as they are the same thing). Ultimately this PyObject structure is something that we will want to have in *some* versions of PyPy, so that we can be compatible with CPython's extension modules, at least. I'm confident that we'll be able to do that automatically based on the current (different) way the stdobjspace works. This is what we did with multimethods: during the latest sprint we could tie them back to Python's __add__&co methods, but internally we use multimethods all the way, simply because they are more natural in our context.

The "natural" way to look at the object implementation of the stdobjspace is that there is only a finite and reasonably small number of different implementations defined (unlike types, of which the user can create as many as he wishes). The translator-to-C program can thus implement a wrapped object with a struct that has some (arbitrary) tag to distinguish between implementations. In CPython, the tag is the ob_type field. In our case we are free to choose whatever we want. For example it could be a small integer: 0=W_IntObject, 1=W_ListObject, 2=W_TupleObject, and so on. It can be completely arbitrary. It could also be a pointer to some data describing the implementation. But using a small integer makes multimethod dispatch extremely fast: to dispatch on objects 'a' and 'b', read 'a->tag' and 'b->tag' and use them as indices into an (N by N) table of function pointers. It is much simpler and faster than CPython's way of playing around with each object's type's tp_number->nb_add.

That's the motivation. Now, several implementations for the same type can coexist, provided we give some heuristics to select between them.
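The tag-and-table dispatch Armin describes can be modelled in plain Python (in the translated C the tags would be struct fields and the table would hold function pointers). All class and function names here are illustrative, not the actual stdobjspace code:

```python
# Small-integer implementation tags, one per wrapped-object implementation.
W_INT, W_LIST = 0, 1
N = 2  # number of implementations

class W_IntObject:
    tag = W_INT
    def __init__(self, value):
        self.value = value

class W_ListObject:
    tag = W_LIST
    def __init__(self, items):
        self.items = items

def add_int_int(a, b):
    return W_IntObject(a.value + b.value)

def add_list_list(a, b):
    return W_ListObject(a.items + b.items)

# N-by-N dispatch table for the 'add' multimethod; None means no
# implementation pair handles this combination.
ADD_TABLE = [[None] * N for _ in range(N)]
ADD_TABLE[W_INT][W_INT] = add_int_int
ADD_TABLE[W_LIST][W_LIST] = add_list_list

def add(a, b):
    # Dispatch: read both tags, index straight into the table.
    impl = ADD_TABLE[a.tag][b.tag]
    if impl is None:
        raise TypeError("unsupported operand implementations")
    return impl(a, b)
```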
For example, suppose we have two string implementations, a W_StringObject and a W_ConcatenatedStringObject. Then the concatenation of two W_StringObjects should check whether the resulting string would grow larger than some threshold; if so, instead of building and returning a W_StringObject it builds and returns a W_ConcatenatedStringObject, enabling its new algorithms to carry on with the manipulation of what the user still sees as a plain string.

Choice of implementation can also be made when the object is initially built. For example, imagine we had a W_SmallIntObject implementation that can store an integer of up to 30 bits by abusing a pointer field (e.g. by setting the last two bits to an odd value, and assuming real pointers are never odd). There should be (there isn't yet) an implementation of inttype_new() in inttype.py (for calls to 'int(obj)'). This one should examine the numeric value it is asked to build and, depending on whether it is small enough to fit in 30 bits or not, build a W_SmallIntObject or a W_IntObject. Actually, 'space.wrap(someinteger)' could also invoke this mechanism.

A bientôt,

Armin.
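The construction-time selection Armin sketches (the hypothetical inttype_new()) could look roughly like this in plain Python. The tagged-pointer trick itself only exists in translated code, so here only the 30-bit size check is modelled; all names are assumptions, not existing stdobjspace code:

```python
class W_IntObject:
    """Plain boxed integer implementation (sketch)."""
    def __init__(self, value):
        self.value = value

class W_SmallIntObject:
    """Compact implementation: in translated C this would abuse odd
    pointer bits; here we only model the 30-bit size constraint."""
    def __init__(self, value):
        self.value = value

def inttype_new(value):
    # Signed 30-bit range: choose the compact implementation when it fits.
    if -(1 << 29) <= value < (1 << 29):
        return W_SmallIntObject(value)
    return W_IntObject(value)
```

A 'space.wrap(someinteger)' helper could simply delegate to this same selection routine, so both user-level 'int(obj)' calls and internal wrapping pick the implementation consistently.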
participants (3)
- Armin Rigo
- Michael Hudson
- roccomoretti@netscape.net