Re: [pypy-dev] bug with vars() in a nested function

In a message of Mon, 22 Dec 2003 22:23:24 +0100, Alex Martelli writes:
Please put them someplace and have goals refer to them. I wonder if this is what we should be using the issue tracker for.... As per your real question -- fast2locals strikes me as a hack that needs to be more general, and our understanding of what a scope would be has been significantly mangled since getdictscope was written. Something in me says that we want to do scoping in some cleaner way, but I cannot quite envision what it will be after builtins changes _again_ to be more like a regular module. I am now looking at our current crop of builtins, and thinking ... 'the only reason you lot are there is because CPython had you there'. Now that we have hacked the architecture one more time (grin), we have something a lot cleaner (for now at any rate) ... I am pushing for a more elegant definition of builtin, based on a pragmatic idea of 'you have to be built in because we cannot make you any other way'. Which of ours make it? Which cannot be made in app space and why, given that interp space is dead? I think that we have gloriously moved to a place where most of our builtins really do not have to be. Some sort of execfile sort of thing is all we need. This is probably wrong, and indicates that my vision is too idealised, and I miss very real practical problems. I await more enlightenment. But I still think that most of our 'builtins' are more 'reimplementations of CPython builtins, done because CPython doesn't have them so we couldn't have a working Python without them'. And we could implement them as a module. When we have leftovers, real problems in converting one object space to another, then things like getdictscope -- or a more general thing that does this and more actually -- strike me as things that belong __there__. Scoping rules strike me as object space 'required things to leave a hook out for'. But perhaps I am just confused again, and oversimplifying in my mind. I await enlightenment. Laura

Laura Creighton wrote: To be honest, I'm not quite catching your entire meaning ... So I'll just babble and hope something of what I say strikes close to the mark.
Clarity is strained by the two connotations of "builtin" (i.e. 'always present') that can be meant. That is (for the CPython interpreter):
* Written in C and statically linked to the interpreter. (Always present in the interpreter.)
* Available in Python without having to import anything. (Always present in the language.)
I'm under the impression that the __builtin__ module is so named for the second point, as there are a number of modules which meet the first point. (There happen to be 39 on my copy of CPython - len(sys.builtin_module_names)). PyPy's __builtin__ *has* to meet point #2 - otherwise it wouldn't be Python. But I agree with you on point #1 -- We should push to application level everything we can, and have the interpreter level be the absolute minimum needed in order to make it run.
Let me clarify - are you just referring to the __builtin__ module, or are you advocating a more expansive redesign where ObjSpace and interpreter core get lifted to App level?
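The two senses of "builtin" can be checked from a running interpreter. The following is a minimal sketch, assuming a current CPython 3, where the module this thread calls __builtin__ is named builtins; the count of compiled-in modules varies between builds, so no particular number is expected.

```python
import builtins
import sys

# Sense 1: modules compiled into the interpreter binary itself.
compiled_in = sys.builtin_module_names
print(len(compiled_in), "modules are compiled into this interpreter")

# Sense 2: names usable without importing anything. They live in the
# builtins module (named __builtin__ in Python 2).
always_visible = [name for name in dir(builtins) if not name.startswith("_")]

# "len" is reachable with no import at all, while "sys" is compiled in
# but still has to be imported before it can be used.
len_is_always_visible = "len" in always_visible
sys_is_compiled_in = "sys" in compiled_in
sys_is_always_visible = "sys" in always_visible
```

Only the second sense is part of the language definition; the first is an implementation choice that differs between interpreters.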
But perhaps I am just confused again, and oversimplifying in my mind.
I definitely agree with Laura. We should strive to push as much as possible to application level, if for no other reason than that it will make doing annotations, etc. easier. But there needs to be a good mechanism to provide interpreter level hooks for the app level functions. Take __import__. There is currently a commented out application level version in the builtin module. The __import__ functionality would work fine at application level, except for a few minor issues. You can't get sys.modules from app level, as that would require you to 'import sys', which leads to obvious recursion. Same goes for accessing the filesystem tools in os and os.path. The way the app level function works now is that it defines a set of interpreter level helpers which are able to access the functionality and pass it back. The problem with the way they are implemented now is that all those helpers pollute the __builtin__ namespace. If there were a good way to define interpreter level helpers which were visible from *within* the module, but invisible from the outside, then I feel this approach would work well, and we can extend it to pare the interpreter level functionality down to the bare minimum. -Rocco
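One way to get helpers that are visible inside the module but not outside is to capture them in a closure. This is only a sketch of the idea in plain Python, not how PyPy does it: build_builtin_namespace, cache and load_source are hypothetical names, and load_source stands in for whatever interpreter-level code would really locate and load a module.

```python
import types


def build_builtin_namespace():
    """Return the public names of a toy builtin module.

    The helpers below play the role of interpreter-level hooks. They are
    reachable from app_import through the closure, but they are never
    placed in the returned namespace.
    """
    cache = {}  # stand-in for sys.modules (hypothetical helper)

    def load_source(name):
        # Stand-in for filesystem access and compilation.
        return types.ModuleType(name)

    def app_import(name):
        # App-level logic that relies on the hidden helpers.
        if name not in cache:
            cache[name] = load_source(name)
        return cache[name]

    return {"__import__": app_import}


namespace = build_builtin_namespace()
demo = namespace["__import__"]("demo")
```

The returned namespace contains only __import__, so nothing extra leaks into the toy builtin module, yet repeated imports still share the same hidden cache.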

Hi Rocco, hi Laura, [Rocco Moretti Mon, Dec 22, 2003 at 05:16:00PM -0600]
Sure, that has been our goal almost all of the time. However, code implemented at application level goes through the interpretation indirection and is not only slower now but will probably remain slower even after translation. Anyway, our new approaches at implementing builtin modules surely improve the simplicity of implementing app-level code and weaving it into interpreter level.
Well, interpreter level code is far from dead but we might be able to reduce it to a minimum level following our original "minimal python" idea. I think that we are not doing so badly as the number of interpreter-level builtins is not all that large. The problem so far has been that the builtin module concept was kind of complicated but this should be fixed soon, now: builtin modules are to be defined at application level but can access/interact very dynamically with interpreter level code at initialization time. I guess Armin will write a few more sentences when he gets to check in the new stuff.
Yes, the main point here is that we probably want to avoid duplicate or redundant state, for example calling on interpreter-level space.builtin.execfile(...) and on app-level __builtin__.execfile(...) should do the same thing but what happens if someone overrides __builtin__.execfile from app level? Do we want the interpreter-level to go through this new implementation or should it keep the "real" reference? It seems tricky to decide this on a case-by-case basis. When doing our recent "implement builtin at app-level and invoke interp-level hooks" hack we had a similar consideration with "sys.modules" which in CPython can be overridden at applevel but it doesn't affect interpreter-level implementations. Otherwise you could get into a state that makes it impossible to import anything anymore (e.g. consider 'sys.modules = "no dict"'). So i am not sure what we want to do about this "duplicate state" issue as there apparently is a flexibility versus security tradeoff involved. I tend to lean towards "flexibility", though :-)
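The two policies can be put side by side in a toy model. This is a sketch under loud assumptions: ToySpace, execfile_static and execfile_dynamic are hypothetical names, and a SimpleNamespace stands in for the app-level __builtin__ module so that the real builtins are left alone.

```python
from types import SimpleNamespace


def real_execfile(path):
    return "real:" + path


# Stand-in for the app-level __builtin__ module.
app_builtin = SimpleNamespace(execfile=real_execfile)


class ToySpace:
    def __init__(self, builtin):
        self._builtin = builtin
        # Policy A: keep the "real" reference, captured once at startup.
        self._execfile = builtin.execfile

    def execfile_static(self, path):
        return self._execfile(path)

    def execfile_dynamic(self, path):
        # Policy B: look the name up through the app-level module on
        # every call, so an override is honoured.
        return self._builtin.execfile(path)


space = ToySpace(app_builtin)

# Someone overrides the builtin from app level.
app_builtin.execfile = lambda path: "override:" + path

static_result = space.execfile_static("a.py")    # "real:a.py"
dynamic_result = space.execfile_dynamic("a.py")  # "override:a.py"
```

Policy A is the safer one, since app-level code cannot break the interpreter's own calls; policy B is the flexible one, and it is also the one with a single piece of state.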
Hmmm, maybe exposing some general '_pypy_' builtin hook would allow defining __import__ at app-level because we could provide a '_pypy_.sys' attribute or maybe better "_pypy_.modules['sys']". I also thought about exposing parts of the objectspace directly, and some builtins could just be bound methods of the space, e.g.
    len = _pypy_.space.len
    delattr = _pypy_.space.delattr
as the object space and their corresponding builtin implementations share the same signature. Some space-methods would probably be exposed as readonly-attributes unless we want to provide ways to seriously mess up your interpreter quickly :-) I think it's worth a try to see what can of worms suddenly opens if we did this ... cheers, holger

holger krekel wrote:
Fooling around with sys.modules and import on Python2.3 I come away with the idea that Python's idea that variables are just names helps us in certain cases. I.e.:
So it is the particular dictionary that sys.modules points to immediately after startup that is used by the CPython import mechanism, not the object that sys.modules is pointing to when the import is called. This is less help for cases like __import__(), where it's what the name is pointing to that matters. Although, I suppose we could possibly handle that through a property-like interface with transparent getters and setters. -Rocco
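The name-versus-object distinction can be reproduced without touching the real sys.modules. This is a toy model, not a transcript of CPython's behaviour: toy_sys and ToyImporter are hypothetical stand-ins, and the importer simply keeps the dict it saw at startup.

```python
import types

# Stand-in for the sys module, with its modules dict.
toy_sys = types.SimpleNamespace(modules={})


class ToyImporter:
    def __init__(self, sys_module):
        # Capture the dict object itself, not the name that refers to it.
        self._modules = sys_module.modules

    def import_(self, name):
        if name not in self._modules:
            self._modules[name] = types.ModuleType(name)
        return self._modules[name]


importer = ToyImporter(toy_sys)
original = toy_sys.modules

# Rebinding the name does not disturb the importer ...
toy_sys.modules = "no dict"
demo = importer.import_("demo")

# ... but mutating the original dict is visible to it.
marker = types.ModuleType("other")
original["other"] = marker
other = importer.import_("other")
```

After the rebinding, the app-level name and the importer disagree about where modules live, which is the duplicate state discussed earlier in the thread.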

[Rocco Moretti Tue, Dec 23, 2003 at 09:20:46AM -0600]
Yes, but is it what we want to mimic? Somehow i think the idea is that sys.modules is the one place where modulepath-moduleobject mappings should be kept and the interpreter level should consult this object. I guess that CPython's keeping a reference to the original dict object is more a performance hack and also shields from stupid errors ...
We can always special case but i'd prefer a general solution like "interp-level has to go through the applevel hooks/names" but maybe this is not feasible. holger
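The property-like interface suggested earlier is one way to let the interpreter level go through the app-level name while still refusing nonsense values. This is a sketch with hypothetical names (ToySys, ToyInterpreter, lookup); the type check in the setter is an assumption about what a sensible guard might be, not something proposed in the thread.

```python
class ToySys:
    """Toy sys module whose modules attribute is a guarded property."""

    def __init__(self):
        self._modules = {}

    @property
    def modules(self):
        return self._modules

    @modules.setter
    def modules(self, value):
        # Guard against states that would make imports impossible.
        if not isinstance(value, dict):
            raise TypeError("modules must be a dict")
        self._modules = value


class ToyInterpreter:
    def __init__(self, sys_module):
        self._sys = sys_module

    def lookup(self, name):
        # Always go through the app-level name, never a cached dict.
        return self._sys.modules.get(name)


toy_sys = ToySys()
interp = ToyInterpreter(toy_sys)

# A legitimate rebinding is seen by the interpreter immediately.
toy_sys.modules = {"demo": "demo module object"}
seen = interp.lookup("demo")

# A broken rebinding is rejected and the old state survives.
try:
    toy_sys.modules = "no dict"
except TypeError:
    rejected = True
else:
    rejected = False
```

There is only one piece of state here, so the flexibility is kept, and the setter is the single place where a safety policy can be enforced.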

At 19:05 2003-12-23 +0100, you (holger krekel) wrote:
Maybe the original binding could be preserved as sys.__modules__ analogously to sys.__stdout__ ?
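For comparison, sys.__stdout__ is a real attribute that keeps the interpreter's original stdout object even after sys.stdout has been rebound. The sketch below shows that, and then the same pattern for a module table; the __modules__ attribute is hypothetical and exists only on the toy_sys stand-in.

```python
import io
import sys
import types

# Real behaviour: rebinding sys.stdout leaves sys.__stdout__ alone.
saved_stdout = sys.stdout
buffer = io.StringIO()
sys.stdout = buffer
try:
    print("captured")
finally:
    sys.stdout = saved_stdout
original_untouched = sys.__stdout__ is not buffer

# Hypothetical analogue for the module table.
toy_sys = types.SimpleNamespace()
toy_sys.modules = {}
toy_sys.__modules__ = toy_sys.modules  # preserved original binding

toy_sys.modules = "no dict"            # app level breaks the name
toy_sys.modules = toy_sys.__modules__  # and can recover from it
recovered = isinstance(toy_sys.modules, dict)
```

As with sys.__stdout__, the preserved binding would be a convention for recovery rather than a protection: nothing here stops app-level code from rebinding __modules__ as well.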
I guess that CPython's keeping reference to the original dict object is more a performance hack and also shields from stupid errors ...
I don't know. Isn't it normal to get a binding through a name and then ignore the name? Mutating the referenced object is a different matter though, e.g.,
Just a couple of thoughts.
This is less help for cases like __import__(), where it's what the name is pointing to that matters.
Doesn't __import__ look for the name in the same original sys.modules?
If the interpreter has to maintain some objects to survive, maybe apps should only get access via readonly/proxy mechanisms of some kind? Disclaimer: I'm only reacting in the context of this one email, so please ignore if it doesn't make sense ;-) Bengt
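A read-only view of this kind already exists in the standard library of current Python 3 as types.MappingProxyType. The sketch below uses it on a toy module table; interp_modules and app_view are hypothetical names, and the example makes no claim about what PyPy should expose.

```python
from types import MappingProxyType

# The dict the interpreter level owns and is free to mutate.
interp_modules = {"sys": "sys module object"}

# What app level would be handed: a live, read-only view of that dict.
app_view = MappingProxyType(interp_modules)

# App level cannot write through the view ...
try:
    app_view["evil"] = "oops"
except TypeError:
    write_blocked = True
else:
    write_blocked = False

# ... but it sees changes made by the interpreter level.
interp_modules["os"] = "os module object"
sees_update = "os" in app_view
```

The cost is the flexibility holger leans towards: with a proxy like this, app-level code could no longer add entries to or remove entries from the module table on its own.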

participants (4): Bengt Richter, holger krekel, Laura Creighton, Rocco Moretti