the new shining 'builtinrefactor' branch :-)
Hello everybody !

The 'builtinrefactor' branch that Holger and Armin(*) have been working
on is finally stable again. We are merging it back with the trunk.
(This only affects the src/pypy subdirectory.) Take yourself a cup of
tea and read on :-)

(*) this is a joint e-mail written in a common 'screen' session :-)

The main change is a rewrite of the interactions between interpreter-
and app-level code, with the following design goals:

- allow app-level and interp-level code to be freely mixed in one file
  (thus no need for xxx_app.py files anymore)

- implement functions and modules uniformly for all object spaces

- provide mechanisms (interpreter/gateway.py) for transparently

  - calling app-level code from interp-level code
  - calling interp-level code from app-level code
  - accessing interp-level defined attributes from app-level

- argument parsing is now done at interpreter level, to once and for
  all get rid of bootstrapping and debugging nightmares :-)

- reorganized the Code and Frame classes and introduced subclasses of
  these to reduce redundant code dealing with them.

Here is a more detailed description of the changes.


Application-Level and Interpreter-Level Interaction
---------------------------------------------------

Part one: invoke app-level code from interpreter level
------------------------------------------------------

We no longer need 'xxx_app.py' files for helpers for 'xxx.py'. An
app-level helper function can be written in-line in the source. See
for example the function normalize_exception() in
interpreter/pyframe.py:

    def app_normalize_exception(etype, evalue):
        ...plain Python app-level code...
    normalize_exception = gateway.app2interp(app_normalize_exception)

This makes 'normalize_exception' callable from interpreter level as if
it had been defined at interpreter level with the following signature:

    def normalize_exception(space, w_etype, w_evalue):
        ...

App-level helpers can also be used as methods. In pyopcode.py,
app_prepare_exec() is the app-level definition of prepare_exec(), a
method of the new class PyInterpFrame (more about it below).

All these helpers can be called normally from the rest of the
interpreter code. Global functions must be called with an extra first
argument, 'space'. For methods, the space is read from 'self.space'.
All other arguments must be wrapped objects, and so is the result.

If you have many 'app_*' functions you can register them "en masse"
via a call to

    gateway.importall(globals())   # app_xxx() -> xxx()

For other examples, see objspace/std/dicttype.py, which contains the
code that used to be in the separate dictobject_app.py. Note how all
the functions are defined with app_ prefixes; gateway.importall() then
creates the non-app_ interpreter-level-callable gateways, and finally
register_all() registers the latter into the multimethod tables.


Part two: make interpreter-level objects accessible from app-level
------------------------------------------------------------------

Conversely, 'gateway' contains code to make interpreter-level
functions visible from app-level. This can be done manually with
gateway.interp2app(), but most of the time it is done for you by the
ExtModule base class. See module/builtin.py:

    class __builtin__(ExtModule):
        ...

Although such extension modules are defined as classes, there should
only be one instance per object space. We are not only reusing the
class statement to provide a regular way of defining modules that
contain code and attributes defined both at app-level and interp-level;
we actually need them to be classes because several object spaces may
be alive at the same time.

The new module/builtin.py contains interp-level code like globals(),
locals(), __import__()..., and app-level code like app_execfile(),
app_range()... The ExtModule parent class makes sure all these appear
on the instance, both at app-level and at interp-level.

The app-level will see an ExtModule instance as a plain module. It is
better to start with the interpreter-level view to see how this works.
You can call all these methods from *interpreter level*; for example,
in 'def __import__(self, ...)' we do:

    self.execfile(space.wrap(filename), w_globals, w_locals)

but execfile() is actually defined like so:

    def app_execfile(self, filename, glob=None, loc=None):
        ...

and because this is an app-level definition, all the arguments are of
course wrapped (we are executing app_execfile() at app-level and the
body of app_execfile refers to the parameters). Please note that app_*
functions must always be called with wrapped arguments even though you
don't see a 'w_' prefix. Of course you can also call 'execfile' from
outside the instance methods, e.g. you can do

    space.builtin.execfile(space.wrap('somefile.py'),
                           w_globals, w_locals)

One more thing is interesting to discuss here. Methods like
app_execfile() have a 'self' argument, but what does it mean in the
context of a module? The bottom line is that this provides an explicit
way to access objects in the same module (instance):

    class somemodule(ExtModule):
        def app_y(self):
            ...
        def app_x(self):
            r = self.y()

Calling 'y()' directly wouldn't usually work, because this would try
to access the globals of the CPython module where 'somemodule' is
defined. This is unrelated to the "globals namespace" of the app-level
module, which is (behind the scenes) implemented as the content of the
'self' instance. So app_x() above is in all respects a method of a
class. The important thing to remember here is that app-level modules
are implemented as class instances at interpreter level -- and the
usual scoping rules apply.

From the user's point of view (i.e. app-level, outside the definition
of the module), an expression like 'sys.displayhook' actually gets you
a *bound method* of the 'sys' instance. This is true even when
accessing 'displayhook' with other syntaxes like 'from sys import
displayhook'. This is also true for the __builtin__ module, so e.g.
when you type 'len' into the pypy interpreter you see

    <pypy.interpreter.function.Method object at 0x4032c424>

A bit weird, but it works as expected, and we could later change the
representation string :-)

    >>>> import sys
    >>>> sys.displayhook
    <pypy.interpreter.function.Method object at 0xe44f7cc>
    >>>> def f():
    ....     pass
    ....
    >>>> sys.f = f
    >>>> sys.f
    <pypy.interpreter.function.Function object at 0xe4ed3cc>
    >>>>

The above should start to make sense if you really think of 'sys' as
an instance :-) And this brings us to the next change.


Functions (and friends) and Modules moved off objectspaces
-----------------------------------------------------------

The Function, Method, Generator and Module classes (and probably more)
are now part of the interpreter; an object space is no longer
responsible for providing them. This makes sense because these classes
are straightforward structures anyway, and are almost only created and
used by the interpreter (like code objects, which were already in an
interpreter-level class of their own). See interpreter/function.py and
interpreter/module.py.

An object space now needs to call back into interpreter-level objects
to carry out operations on them. This allows the interpreter level to
control its own internal classes. For example, in
interpreter/pyframe.py the following method controls the visibility of
attributes of a Frame instance:

    def pypy_getattr(self, w_attr):
        attr = self.space.unwrap(w_attr)
        if attr == 'f_locals':
            return self.w_locals
        if attr == 'f_globals':
            return self.w_globals
        if attr == 'f_builtins':
            return self.w_builtins
        if attr == 'f_code':
            return self.space.wrap(self.code)
        raise OperationError(self.space.w_AttributeError, w_attr)

(For reading attributes we'll probably design some better interface
later :-)

Note the preliminary interface that tells the object space how
interpreter-level objects should react to operations: if an
interpreter-level class defines methods like e.g.
pypy_getattr(), pypy_call(), pypy_iter(), etc., then every object
space is required to call those methods when it encounters a wrap()ed
interpreter-level object for the getattr(), call() and iter()
operations, respectively. When the object space must wrap() one of
these objects, it does so in a special structure (for stdobjspace it
is CPythonObject) which knows that it should look for these pypy_*
method names. For example, Function objects have a pypy_call() method
that is called whenever the space.call() operation is issued with a
wrapped Function as first argument.

At some point all these wrappable classes should have a pypy_getattr()
as well, and we might eventually allow only subclasses of
baseobjspace.Wrappable to be wrap()ed. The new interface to wrap()&co
that we are discussing on pypy-dev is not implemented yet, but
essentially the one remaining usage for wrap() would be exactly that:
make a "black-box" proxy that just calls back to the pypy_* methods.
It is clearly better to require that an object space honours the
'pypy_*' methods than to require that every object space implements
modules and functions on its own.


Code and Frame classes reorganized
----------------------------------

The classes involved in code execution have been reorganized a bit.
There are now two abstract base classes, Code and Frame, plus a few
concrete classes: Function, Method, and Gateway.

A Function object, like CPython's, is essentially a container for a
code object with references to a globals dict and default arguments,
and possibly other closure stuff. As has already been the case since
the Gothenburg sprint, in PyPy there is only one Function class for
built-in and user-defined functions, the difference being in the Code
object. A Method object is just the same as in CPython. And Gateway
objects are what app2interp() and interp2app() return: essentially a
Function that isn't bound to any particular object space yet.
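As a toy illustration of that Gateway idea (these are hypothetical names and simplified bodies, not the real gateway.py code), a space-less callable can be turned into an ordinary space-aware function once it is handed a particular object space:

```python
# Toy sketch: a Gateway holds an interp-level implementation and only
# becomes a usable "function" when associated with an object space.
# Hypothetical names, NOT the actual PyPy classes.

class Space:
    """Trivial stand-in for an object space: wrap/unwrap are identity."""
    def wrap(self, obj):
        return obj
    def unwrap(self, w_obj):
        return w_obj

class Gateway:
    def __init__(self, func):
        self.func = func          # the interp-level implementation

    def get_function(self, space):
        # Once bound to a space, the gateway yields an ordinary
        # function that unwraps its arguments and wraps its result.
        def function(*args_w):
            args = [space.unwrap(w) for w in args_w]
            return space.wrap(self.func(*args))
        return function

def _double(x):
    return x * 2

space = Space()
double = Gateway(_double).get_function(space)   # bound to this space
print(double(space.wrap(21)))                   # -> 42
```

The point of the sketch is only the two-step life cycle: a Gateway is created once at import time, and each object space that needs it gets its own bound function.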
A Gateway isn't supposed to show up at app-level: whenever it gets
associated with a space, it becomes a Function.

A Code object is a structure that knows its 'signature' (what
arguments it expects), and also knows how to build a Frame to run
itself in. A Frame represents the execution of a Code. It has an
abstract run() method, and a few methods to get and set its locals as
a dictionary or as a plain list ("fast locals").

To see how this fits together, see the method call() of Function
objects in function.py:

    def call(self, w_args, w_kwds=None):
        scope_w = self.parse_args(w_args, w_kwds)
        frame = self.func_code.create_frame(self.space, self.w_globals,
                                            self.closure)
        frame.setfastscope(scope_w)
        return frame.run()

Argument parsing is done in parse_args() according to
self.func_code.signature(); then the Code is asked to create a Frame;
the Frame is given the decoded list of arguments to initialize its
locals; and then it is run.

The other interface for running code is the method exec_code() of Code
in eval.py, which creates a frame and sets its locals as a dictionary
before running it:

    def exec_code(self, space, w_globals, w_locals):
        "Implements the 'exec' statement."
        frame = self.create_frame(space, w_globals)
        frame.setdictscope(w_locals)
        return frame.run()

PyCode is a subclass of Code with all the co_xxx attributes of regular
CPython code objects. Quite expectedly, it runs in a PyFrame, which is
like a regular CPython frame. Unexpectedly, however, there are now
several subclasses of PyFrame, depending on which particular opcodes
we need:

* The regular one is PyInterpFrame in pyopcode.py. All opcode
  definitions are now methods of this class instead of being global
  functions.

* There is also PyNestedScopeFrame in nestedscope.py, which adds a few
  more opcodes, namely the ones related to nested scopes.

* And there is GeneratorFrame in generator.py, which overrides
  RETURN_VALUE and defines YIELD_VALUE to provide generator-like
  behavior.
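To recap the pipeline described above (parse_args, create_frame, setfastscope, run), here is a self-contained toy model. The class and method names follow the description, but the bodies are deliberately simplified guesses, not the real PyPy code:

```python
# Toy model of the Code/Frame/Function collaboration described above.
# NOT the actual PyPy classes -- just the shape of the pipeline.

class Code:
    def signature(self):            # names of the expected arguments
        raise NotImplementedError
    def create_frame(self):
        raise NotImplementedError

class Frame:
    def setfastscope(self, scope):  # locals as a plain list
        self.fastlocals = scope
    def run(self):
        raise NotImplementedError

class AddCode(Code):
    """A 'built-in code' object that adds its two arguments."""
    def signature(self):
        return ['a', 'b']
    def create_frame(self):
        return AddFrame(self)

class AddFrame(Frame):
    def __init__(self, code):
        self.code = code
    def run(self):
        a, b = self.fastlocals
        return a + b

class Function:
    def __init__(self, code):
        self.func_code = code
    def parse_args(self, args, kwds):
        # decode positional + keyword arguments into "fast locals"
        sig = self.func_code.signature()
        return list(args) + [kwds[name] for name in sig[len(args):]]
    def call(self, args, kwds=None):
        scope = self.parse_args(args, kwds or {})
        frame = self.func_code.create_frame()
        frame.setfastscope(scope)
        return frame.run()

add = Function(AddCode())
print(add.call((1, 2)))             # -> 3
print(add.call((1,), {'b': 41}))    # -> 42
```

Note how keyword arguments work "for free" for the built-in AddCode: argument decoding lives entirely in Function/parse_args, and the Code subclass only declares its signature.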
The PyCode class knows which frame to create by inspecting its co_xxx
attributes (co_flags tells whether we are a generator, and
co_cellvars/co_freevars are empty unless we need nested scopes). The
nice thing is that apart from these checks, everything about nested
scopes is in nestedscope.py, and everything about generators is in
generator.py. Contrast this with CPython's huge ceval.c file, which
mixes all these features throughout the code. Later on we plan to
extend this model even further, to allow more flexibility such as
custom opcodes.

There are other, simpler subclasses of Code with their corresponding
subclasses of Frame:

* In gateway.py, BuiltinCode and BuiltinFrame just call back to an
  interpreter-level function. This is the code object found in the
  Functions created out of the Gateways built by gateway.interp2app().
  Such Functions are the equivalent of CPython's built-in functions.
  Note that BuiltinCode, like all Code subclasses, exposes a signature
  for Function to be able to decode arguments. Thus, when a built-in
  function is called: (1) the arguments are parsed in the normal way
  (e.g. passing arguments by keyword is always allowed) and turned
  into a "fast locals" list; (2) a BuiltinFrame is created and
  assigned the "fast locals"; (3) the BuiltinFrame.run() method is
  called, which just calls the interpreter-level function using the
  "fast locals" list as arguments.

* In objspace/std/typeobject.py, MultimethodCode and a few XxxMmFrame
  classes implement multimethod calls. This is the type of the Code
  objects that you would see by typing '[].append.im_func.func_code'
  (except that 'im_func' isn't implemented right now).

Enjoy !

Armin & Holger
Holger & Armin
whatever
participants (1): Armin Rigo