Changing existing class instances
Currently, when you replace a class definition with an updated version, it's really difficult to change existing class instances; you'd have to essentially sweep every Python object and check if it's an instance, starting at roots such as __main__ and sys.modules. This makes developing code in a long-running process difficult, Zope being the best example of this. When you modify a class definition used by Zope code, you can't update existing instances floating around in memory.

Over dinner, a friend and I were discussing this, and we thought it probably isn't difficult to add an extra level of indirection to allow fixing this. The only other options we could think of were either the complete scan of all objects, or inserting a forwarding pointer into PyClassObjects that points to the replacing class if !NULL, and then chasing pointers when accessing PyInstanceObject->in_class.

A quick hack to implement the extra indirection took about half an hour. It does these things:

* Defines a PyClassHandle type:

    struct _PyClassHandle {
        PyClassHandle *next;   /* ptr to next PyClassHandle in linked list */
        PyClassObject *klass;  /* The class object */
    };

* The in_class attribute of PyInstanceObject becomes a PyClassHandle* instead of a PyClassObject*, and all code such as inst->in_class becomes inst->in_class->klass.

* As a quick hack to allow changing the class object referenced by a handle, I added a .forward( <newclassobject> ) method to class objects. This basically does self.handle->klass = <newclassobject>.

The end result is that obj.__class__.forward(newclass) changes obj to be an instance of newclass, and all other instances of obj.__class__ also mutate to become newclass instances.

Making this purely automatic seems hard; you'd have to catch things like 'import ftplib; ftplib.FTP = myclass', which would require automatically calling ftplib.FTP.forward( myclass ) to make all existing FTP instances mutate. Would it be worthwhile to export some hook for doing this in 1.6? The cost is adding an extra pointer deref to all access to PyInstanceObject->in_class.

(This could probably also be added to ExtensionClass, and probably doesn't need to be added to core Python to help out Zope. Just a thought...)

--
A.M. Kuchling http://starship.python.net/crew/amk/
Here the skull of a consumptive child becomes part of a great machine for calculating the motions of the stars. Here, a yellow bird frets within the ribcage of an unjust man.
    -- Welcome to Orqwith, in DOOM PATROL #22
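The handle indirection described above can be modelled in a few lines of Python. This is only a toy sketch of the idea, not the C patch itself: ClassHandle, Instance and call() are hypothetical names invented here, and the real hack changed PyInstanceObject inside the interpreter.

```python
class ClassHandle:
    """Toy stand-in for PyClassHandle: one level of indirection to a class."""

    def __init__(self, klass):
        self.klass = klass

    def forward(self, new_klass):
        # Equivalent in spirit to self.handle->klass = <newclassobject>.
        self.klass = new_klass


class Instance:
    """Toy stand-in for PyInstanceObject: holds a handle, not a class."""

    def __init__(self, handle):
        self.handle = handle

    def call(self, name, *args):
        # Every method lookup pays one extra dereference: handle -> klass.
        method = getattr(self.handle.klass, name)
        return method(self, *args)


class Old:
    def describe(self):
        return "old"


class New:
    def describe(self):
        return "new"


handle = ClassHandle(Old)
a = Instance(handle)
b = Instance(handle)
before = a.call("describe")

handle.forward(New)  # every instance sharing the handle changes class at once
after = (a.call("describe"), b.call("describe"))
```

Because both instances share one handle, a single forward() call is enough; the price is the extra lookup inside call().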
There might be another solution. When you reload a module, the module object and its dictionary are reused. Perhaps class and function objects could similarly be reused? It would mean that a class or def statement looks for an existing object with the same name and type, and overwrites that. Voila, all references are automatically updated.

This is more work (e.g. for classes, a new bytecode may have to be invented because the class creation process must be done differently) but it's much less of a hack, and I think it would be more reliable. (Even though it alters borderline semantics a bit.)

(Your extra indirection also slows things down, although I don't know by how much -- not just the extra memory reference but also less locality of reference so more cache misses.)

--Guido van Rossum (home page: http://www.python.org/~guido/)
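The reload behaviour mentioned here can be checked directly in current Python. The sketch below assumes Python 3's importlib.reload (the thread predates it and used the reload builtin) and picks the stdlib module colorsys only because it is small and pure Python.

```python
import colorsys
import importlib

module_before = colorsys
dict_before = vars(colorsys)
func_before = colorsys.rgb_to_hsv

importlib.reload(colorsys)

# The module object and its dictionary survive the reload...
same_module = colorsys is module_before
same_dict = vars(colorsys) is dict_before
# ...but the def statements ran again, so the function object is a new one.
same_func = colorsys.rgb_to_hsv is func_before
```

The last comparison is the problem under discussion: anything still holding func_before keeps the old function.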
[Guido, on Andrew's idea for automagically updating classes]
Too dangerous, I think. While uncommon in general, I've certainly seen (even written) functions that e.g. return a contained def or class. The intent in such cases is very much to create distinct defs or classes (despite having the same names). In this case I assume "the same name" wouldn't *usually* be found, since the "contained def or class"'s name is local to the containing function. But if there ever happened to be a module-level function or class of the same name, brrrr. Modules differ because their namespace "search path" consists solely of the more-global-than-global <wink> sys.modules.
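A small illustration of the pattern Tim describes, written for current Python; make_point_class is a made-up example, not code from the thread.

```python
def make_point_class(scale):
    # Each call executes the class statement again and must yield a NEW class.
    class Point:
        factor = scale

    return Point


PointA = make_point_class(1)
PointB = make_point_class(10)

distinct = PointA is not PointB
same_name = PointA.__name__ == PointB.__name__ == "Point"
same_module = PointA.__module__ == PointB.__module__
```

Both classes carry the same name and the same __module__, so a scheme that overwrites "the existing object with the same name" would have to be careful not to merge them.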
How about an explicit function in the "new" module,

    new.update(class_or_def_old, class_or_def_new)

which overwrites old's guts with new's guts (in analogy with dict.update)? Then no semantics change and you don't need new bytecodes. In return, a user who wants to e.g. replace an existing class C would need to do

    oldC = C
    do whatever they do to get the new C
    new.update(oldC, C)

Building on that, a short Python loop could do the magic for every class and function in a module; and building on *that*, a short "updating import" function could be written in Python. View it as providing mechanism instead of policy <0.9 wink>.
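For readers who want to try the idea, here is a minimal sketch of such an update function in current Python. It is an assumption-laden illustration: new.update never existed under that name, the attribute names (__code__, __defaults__ and so on) are the Python 3 spellings of the func_* fields, and corner cases such as closures, __slots__ and metaclasses are ignored.

```python
import types


def update(old, new):
    """Overwrite old's guts with new's guts, in place (hypothetical helper)."""
    if isinstance(old, types.FunctionType):
        # Raises ValueError if the two functions close over a different
        # number of free variables; module-level functions are fine.
        old.__code__ = new.__code__
        old.__defaults__ = new.__defaults__
        old.__kwdefaults__ = new.__kwdefaults__
        old.__doc__ = new.__doc__
        old.__dict__.update(new.__dict__)
    elif isinstance(old, type):
        skip = {"__dict__", "__weakref__"}  # per-class descriptors, not copyable
        for name in list(vars(old)):
            if name not in skip and name not in vars(new):
                delattr(old, name)
        for name, value in vars(new).items():
            if name not in skip:
                setattr(old, name, value)
    else:
        raise TypeError("update() expects two functions or two classes")


class C:
    def greet(self):
        return "old greeting"


inst = C()
oldC = C


class C:  # "whatever they do to get the new C"
    def greet(self):
        return "new greeting"


update(oldC, C)
result = inst.greet()  # inst is still an instance of oldC
```

Copying the class dictionary also replaces the methods, which is one possible answer to the question of whether updating a class should update its methods.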
Across the universe of all Python programs on all platforms, weighted by importance, it was a slowdown of nearly 4.317%.

if-i-had-used-only-one-digit-everyone-would-have-known-i-was-making-it-up<wink>-ly y'rs - tim
Oh man, oh man... I think this is where I get to say something akin to "I told you so." :-)

I already described Tim's proposal in my type proposal paper, as a way to deal with incomplete classes. Essentially, a class object is created "empty" and is later "updated" with the correct bits. The empty class allows two classes to refer to each other in the "recursive type" scenario.

In other words, I definitely would support a new class object behavior that allows us to update a class' set of bases and dictionary on the fly. This could then be used to support my solution for the recursive type scenario (which, in turn, means that we don't have to introduce Yet Another Namespace into Python to hold type names).

Note: I would agree with Guido, however, on the "look for a class object with the same name", but with the restriction that the name is only replaced in the *target* namespace. i.e. a "class Foo" in a function will only look for Foo in the function's local namespace; it would not overwrite a class in the global space, nor would it overwrite class objects returned by a prior invocation of the function.

Cheers,
-g

On Thu, 20 Jan 2000, Tim Peters wrote:
-- Greg Stein, http://www.lyra.org/
[Greg Stein]
Parenthetically, I never grasped the appeal of the parenthetical comment. Yet Another Namespace for Yet Another Entirely New Purpose seems highly *desirable* to me! Trying to overload the current namespace set makes it so much harder to see that these are compile-time gimmicks, and users need to be acutely aware of that if they're to use it effectively. Note that I understand (& wholly agree with) the need for runtime introspection. different-things-different-rules-ly y'rs - tim
On Fri, 21 Jan 2000, Tim Peters wrote:
And that is the crux of the issue: I think the names that are assigned to these classes, interfaces, typedefs, or whatever, can follow the standard Python semantics and be plopped into the appropriate namespace. There is no overloading.

The compile-time behavior certainly understands what names have what types; in this case, if a name is a "typedecl", then it can remember the *value*, too. When the name is used later, it knows the corresponding value to use. For instance:

    IntOrString = typedef int|str

    def foo(x: IntOrString):
        ...

In this example, the type-checker knows that IntOrString is a typedecl. It also knows the *value* of "int|str" so the name IntOrString now has two items associated with it at type-check time:

    # not "real" syntax, but you get the idea...
    namespace["IntOrString"] = (TypeDeclarator, int|str)

With the above information in hand, the type-checker knows what IntOrString means in the declaration for foo().

The cool benefit is that the runtime semantics are exactly as you would expect: a typedecl object is created and assigned to IntOrString. That object is also associated with the "x" argument in the function object referred to by the name "foo".

There is no "overloading" of namespaces. We are using Python namespaces just like they should be, and the type-checker doesn't even have to be all that smart to track this stuff.

To get back to the recursive class problem, consider the following code:

    decl incomplete class Foo
    decl incomplete class Bar

    class Foo:
        decl a: Bar

    class Bar:
        decl b: Foo

The "decl" statements would create an empty class object and store that into the "current" namespace. There is no need to shove that off into another namespace. When the "class Foo" comes along, the class object is updated with the class definition for Foo.

It is conceivable to remove the need for "decl" if you allow "class" and "def" to omit the ": suite" portion of their grammar:

    class Foo
    class Bar

    class Foo:
        decl a: Bar

    ...

    def some_function(x: some_type, y: another_type) -> third_type

    ... lots o' code ...

    def some_function(x, y):
        ...

Guido suggested that it may be possible to omit "decl" altogether. Certainly, it can work for member declarations such as:

    class Foo:
        a: Bar

Anyhow... my point is that a new namespace is not needed. Assuming we want objects for reflection at runtime, then the above proposal states *how* those objects are realized at runtime. Further, the type-checker can easily follow that information and perform the appropriate compile-time checks.

No New Namespaces! (lather, rinse, repeat)

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
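As a rough present-day analogy only (the "typedef" and "decl" syntax above was never adopted), current Python's typing module behaves much as Greg describes at runtime: the type alias is an ordinary object bound to an ordinary name, and the same object is attached to the function. The sketch assumes Python 3 with the standard typing module.

```python
from typing import Union

# An ordinary assignment in the ordinary module namespace.
IntOrString = Union[int, str]


def foo(x: IntOrString) -> str:
    return str(x)


# The runtime object stored for the annotation of "x".
annotation_for_x = foo.__annotations__["x"]
```

Static checkers read the same assignment at check time, so no separate namespace is involved in this particular design.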
Greg Stein wrote:
This is indeed the crux of the issue. For those that missed it last time, it became very clear to me that we are working with radically different design aesthetics when we discussed the idea of having an optional keyword that said: "this thing is usually handled at compile time but I want to handle it at runtime. I know what I am doing." Greg complained that that would require the programmer to understand too much about what was being done at compile time and what at runtime.
From my point of view this is *exactly* what a programmer *needs* to know and if we make it too hard for them to know it then we have failed.
There is an overloading of namespaces because we will separately specify the *compile time semantics* of these names. We need to separately specify these semantics because we need all compile time type checkers to behave identically. Yes, it seems elegant to make type objects seem as if they are "just like" Python objects. Unfortunately they aren't. Type objects are evaluated -- and accepted or rejected -- at compile time. Every programmer needs to understand that and it should be blatantly obvious in the syntax, just as everything else in Python syntax is blatantly obvious. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Earth will soon support only survivor species -- dandelions, roaches, lizards, thistles, crows, rats. Not to mention 10 billion humans. - Planet of the Weeds, Harper's Magazine, October 1998
Paul Prescod wrote:
Yes, it seems elegant to make type objects seem as if they are "just like" Python objects. Unfortunately they aren't.
Yes they are. They even have a type, TypeType. -- John (Max) Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850 homepage: http://www.maxtal.com.au/~skaller download: ftp://ftp.cs.usyd.edu/au/jskaller
Agreed that that would be bad. But I wouldn't search outer scopes -- I would only look for a class/def that I was about to stomp on.
Modules differ because their namespace "search path" consists solely of the more-global-than-global <wink> sys.modules.
"The search path doesn't enter into it."
Only a slight semantics change (which my full proposal would require too): function objects would become mutable -- their func_code, func_defaults, func_doc and func_globals fields (and, why not, func_name too) should be changeable. If you make all these assignable, it doesn't even have to be a privileged function.
That's certainly a reasonable compromise. Note that the update on a class should imply an update on its methods, right? --Guido van Rossum (home page: http://www.python.org/~guido/)
[Tim worries about stomping on unintended classes/defs] [Guido]
Maybe I just don't grasp what that means, exactly. Fair enough, since I'm not expressing myself clearly either!

Suppose someone does

    from Tkinter import *

in my.py, and later in my.py just *happens* to define, at module level,

    class Misc:
        blah blah blah

Now Misc was already in my.py's global namespace because Tkinter.py just happens to export a class of that name too (more by accident than design -- but accidents are what I'm most worried about here).

At the time my.py defines Misc, does Misc count as a class we're "about to stomp on"? If so-- & I've assumed so --it would wreak havoc.

But if not, I don't see how this case can be reliably distinguished "by magic" from the cases where update is desired (if people are doing dynamic updates to a long-running program, a new version of a class can come from anywhere, so nothing like original file name or line number can distinguish correctly either).
"The search path doesn't enter into it."
I agree, but am at a loss to describe what's happening in the case above using other terminology <wink>. In a sense, you need a system-wide "unique handle" to support bulletproof updating, and while sys.modules has supplied that all along for module objects (in the form of the module name), I don't believe there's anything analogous to key off of for function or class objects.
[suggesting] new.update(class_or_def_old, class_or_def_new)
Of course I meant "no new semantics" in the sense of "won't cause current exception-free code to alter behavior in any way".
If you make all these assignable, it doesn't even have to be a privileged function.
I'm all for that!
[sketching a Python approach to "updating import/reload" building on the hypothetical new.update]
Hadn't considered that! Of course you're right. So make it a pair of nested loops <wink>. so-long-as-it-can-be-written-in-python-it's-easy-ly y'rs - tim
For a second I thought you got me there!
Fortunately, there's magic available: recently, all classes have a __module__ attribute that is set to the full name of the module that defined it (its key in sys.modules). For functions, we would have to invent something similar.
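A quick way to see the "unique handle" this gives is to look an object up again from its module name and its own name. The helper below is a hypothetical illustration for current Python, where functions have since gained a __module__ attribute as well.

```python
import collections
import json
import sys


def find_by_qualified_name(obj):
    """Look obj up again via sys.modules[obj.__module__] and obj.__name__."""
    module = sys.modules.get(obj.__module__)
    return getattr(module, obj.__name__, None)


class_found = find_by_qualified_name(collections.OrderedDict)
func_found = find_by_qualified_name(json.dumps)
```

This only identifies module-level objects; a class created inside a function has the same __module__ but is not reachable by name this way.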
--Guido van Rossum (home page: http://www.python.org/~guido/)
[Tim, still worried about stomping on unintended classes/defs] [example abusing Tkinter.Misc]
For a second I thought you got me there!
That's twice as long as I thought you'd think that, so I win after all <wink>.
OK! I didn't know about class.__module__ -- I hope you realize that relying on your time machine is making you lazy <wink>.

I remain uncomfortable with automagic updating, but not as much so. Both kinds of errors still seem possible to me:

1. Automagically updating when it wasn't wanted. Examples of this are getting harder to come by <wink>. Off the top of my head I'm reduced to stuff like this:

"That kind of thing" has got to be rare, but can't be non-existent either (well, isn't -- I've done it).

2. Failing to automagically update when it was wanted. Implicit in the discussion so far is that long-running systems want to update code at a granularity no finer than module level. Is that realistic? I'm unsure. It's certainly easy to *imagine* the app running an updater server thread, accepting new source for functions and classes, and offering to compile and install the objects.

Under the explicit new.update scheme, such a service needn't bother clients with communicating the full name of the original module; heck, in a *truly* long-running app, over time the source tree will change, and classes and functions will migrate across modules. That will be a problem for the explicit scheme too (how does it know *which* "class Misc" to update) -- but at least it's an explicit problem then, and not a "mysterious failure" of hidden magic.

I could live with both of those (#1 is more worrisome); but think it easier all around to give the users some tools and tell them to solve the problems however they see fit.

or-maybe-we-already-agreed-about-that-ly y'rs - tim
"TP" == Tim Peters <tim_one@email.msn.com> writes:
    TP> Under the explicit new.update scheme, such a service needn't
    TP> bother clients with communicating the full name of the
    TP> original module; heck, in a *truly* long-running app, over
    TP> time the source tree will change, and classes and functions
    TP> will migrate across modules. That will be a problem for the
    TP> explicit scheme too (how does it know *which* "class Misc" to
    TP> update) -- but at least it's an explicit problem then, and not
    TP> a "mysterious failure" of hidden magic.

I completely agree. I think in general, such long running apps are rare, and in those cases you probably want to be explicit about when and how the updates occur anyway.

The one place where automatic updates would be convenient would be at the interactive prompt, so it might be nice to add a module that could be imported by PYTHONSTARTUP, and play hook games to enable automatic updates.

-Barry
[Barry A. Warsaw]
I completely agree.
That's no fun <wink>.
I think in general, such long running apps are rare,
By definition, they're non-existent under Windows <0.7 wink>. But it depends on which field you're working in. The closer you get to being part of a business or consumer service, the more important it gets; e.g., I've seen serious RFQs for software systems guaranteed to suffer no more than 5 minutes of downtime per *year* (& stiff penalties for failure to meet that). I've never been on the winning end of such an RFQ, so am not sure what it takes to meet it. It's interesting to ponder. Psion has published a little about the software techniques they use in their PDAs (my Psion 3a's remarkably capable "Agenda" app has been running non-stop for a bit over 3 years!).
and in those cases you probably want to be explicit about when and how the updates occur anyway.
My guess is you'd want to be *paranoidly* explicit, leaving nothing to chance.
Returning the favor, I completely agree. The single thing people at work gripe most about is how to do development under IDLE in such a way that their package-laden systems exhibit the hoped-for changes in response to editing a module deep in the bowels of the system. I don't have a *good* answer to that now; reduced to stuff like writing custom scripts to selectively clear out sys.modules. non-stop-ly y'rs - tim
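A script of the kind Tim mentions might look roughly like the sketch below. It is a guess at the general shape, not his actual script: forget_package and the demo_pkg names are hypothetical, and the demo registers fake modules so that nothing real is unloaded.

```python
import sys
import types


def forget_package(prefix):
    """Drop prefix and all its submodules from sys.modules; return their names."""
    doomed = [
        name
        for name in list(sys.modules)
        if name == prefix or name.startswith(prefix + ".")
    ]
    for name in doomed:
        del sys.modules[name]
    return doomed


# Demo with fake modules (hypothetical names) rather than real packages.
for fake in ("demo_pkg", "demo_pkg.bowels.deep", "demo_pkg_unrelated"):
    sys.modules[fake] = types.ModuleType(fake)

removed = sorted(forget_package("demo_pkg"))
unrelated_kept = "demo_pkg_unrelated" in sys.modules
del sys.modules["demo_pkg_unrelated"]  # tidy up the demo entry
```

The next import of a forgotten module re-executes its source, but objects that already hold references to the old classes and functions keep them, which is the limitation this thread is about.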
On Thu, 20 Jan 2000, Guido van Rossum wrote:
func.func_globals

__module__ and func_globals can prevent *other* modules from redefining something accidentally, but it doesn't prevent Badness from within the module. [ Tim just posted an example of this: his "def adder()" example... ]

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
"A.M. Kuchling" wrote:
In the case of Zope, if the objects that you care about happen to be persistent objects, then it's relatively easy to arrange to get the objects flushed from memory and reloaded with the new classes. (There are some subtle issues to deal with, like worrying about multiple threads, but in a development environment, you can deal with these, for example, by limiting the server to one thread.)

Note that this is really only a special case of a much larger problem. Reloading a module redefines the global variables in a module. It doesn't update any references to those global references from other places, such as instances or *other* modules.

For example, imports like:

    from foo import spam

are not updated when foo is reloaded.

Maybe you are expecting too much from reload.

Jim

--
Jim Fulton           mailto:jim@digicool.com
Technical Director   (888) 344-4332              Python Powered!
Digital Creations    http://www.digicool.com     http://www.python.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
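The stale-import effect can be reproduced without any files. In this sketch the module foo is built by hand and the "reload" is simulated by re-executing source in the same module dictionary, which is an approximation of what a real reload does.

```python
import types

# Build a throwaway module named foo (hypothetical, in-memory only).
foo = types.ModuleType("foo")
exec("spam = 1", foo.__dict__)

# What "from foo import spam" amounts to: copy the reference into our namespace.
spam = foo.spam

# Simulated reload: new source runs in the SAME module dictionary.
exec("spam = 2", foo.__dict__)

seen_via_module = foo.spam  # the module's binding was updated
seen_via_copy = spam        # our copied binding was not
```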
Jim Fulton wrote:
A change to the way that namespaces are handled could make this work and have a number of other benefits, like global name usage without namespace lookups. I've suggested this to Guido in the past. His reasonable response is that this would be too big a change for Python 1. Maybe this is something to consider for Python 2?

The basic idea (borrowed from Smalltalk) is to have a kind of dictionary that is a collection of "association" objects. An association object is simply a pairing of a name with a value. Association objects can be shared among multiple namespaces. An import like:

    from foo import spam

would copy the association between the name 'spam' and its value from module 'foo' into the current module. If foo is reloaded or if the name is reassigned in foo, the association is modified and the change is seen in any namespaces that imported spam.

Similarly if a function uses a global variable:

    spam=1

    def bar():
        global spam
        return spam*2

the compiled function contains the association between spam and its value. This means that:

- When spam is used in the function, it doesn't have to be looked up,

- The function object no longer needs to keep a reference to its globals. This eliminates an annoying circular reference.

(I would not replace existing dictionaries with this new kind. I'd have both kinds available.)

I think that this would be a really nice change for Python 2.

Jim
Note: from now on the new name for Python 2 is Python 3000. :-)
I've never liked this very much, mostly because it breaks simplicity: the idea that a namespace is a mapping from names to values (e.g. {"limit": 100, "doit": <function...>, ...}) is beautifully simple, while the idea of inserting an extra level of indirection, no matter how powerful, is much murkier.

There's also the huge change in semantics, as you point out; currently,

    from foo import bar

has the same effect (on bar anyway) as

    import foo
    bar = foo.bar    # i.e. copying an object reference
    del foo

while under your proposal it would be more akin to changing all references to bar to become references to foo.bar.

Of course that's what the moral equivalent of "from ... import ..." does in most other languages anyway, so we might consider this for Python 3000; however it would break a considerable amount of old code, I think. (Not to mention brain and book breakage. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
I like it.
How so? It doesn't change the mapping semantics.
Cool. Again, it would also make function global variable access faster and cleaner in some ways.
however it would break a considerable amount of old code, I think.
Really? I wonder. I bet it would break a lot less old code than other recent changes.
(Not to mention brain
It makes my brain feel much better. :)
and book breakage. :-)
Hey, all of the books will have to be rewritten for Python 3000.

Jim
[me]
[Jim F]
How so? It doesn't change the mapping semantics.
My assumption is that in your version, the dictionary would contain special <object binding> objects which then would contain the referenced objects. E.g. {"limit": <binding: 100>, "doit": <binding: <function ...>>}. Thus, d["limit"] would be that <binding> object, while previously it would return 100.
Again, it would also make function global variable access faster and cleaner in some ways.
But I have other plans for that (if the optional static typing stuff ever gets implemented).
Oh? Name some changes that broke a lot of code? --Guido van Rossum (home page: http://www.python.org/~guido/)
<meta-comment> Eek, I didn't realize this thread had continued until I happened to notice Christian's post today. <waaa>I get too much email</waaa> </meta-comment>

Guido van Rossum wrote:
No. The idea is to have "association" objects. We can create these directly if we want:

    a=Association('limit',100)
    print a.key, a.value # whatever

The association value is mutable, but the key is not.

A namespace object is a collection of association objects such that no two items have the same key. Internally, this would be very much like the current dictionary except that instead of an array of dictentries, you'd have an array of association object pointers. Effectively, associations are exposed dictentries.

Externally, a namespace acts more or less like any mapping object. For example, when someone does a getitem, the namespace object will find the association with the desired key and return its value. In addition, a namespace object would provide methods along the lines of:

    associations()
        Return a sequence of the associations in the namespace

    addAssociation(assoc)
        Add the given association to the namespace. This creates another reference to the association. Changing the association's value also changes the value in the namespace.

    getAssociation(key)
        Get the association associated with the key.

A setitem on a namespace modifies an existing association if there is already an association for the given key.

For example:

    n1=namespace()
    n1['limit']=100
    n2=namespace()
    n2.addAssociation(n1.getAssociation('limit'))
    print n2['limit'] # prints 100
    n1['limit']=200
    print n2['limit'] # prints 200

When a function is compiled that refers to a global variable, we get the association from the global namespace and store it. The function doesn't need to store the global namespace itself, so we don't create a circular reference.

Note that circular references are bad even if we have a more powerful gc. For example, by not storing the global namespace in a function, we don't have to worry about the global namespace being blown away before a destructor is run during process exit.

When we use the global variable in the function, we simply get the current value from the association. We don't have to look it up.

Namespaces would have other benefits:

- improve the semantics of:

      from spam import foo

  in that you'd be importing a name binding, not a value

- Be useful in any application where it's desirable to share a name binding.
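The interface sketched above is small enough to prototype in pure Python. The following is one possible reading of the description, not Jim's implementation: it stores associations in an ordinary dict, whereas the proposal was for a new C-level mapping type, and the class names are spelled in the usual capitalised style.

```python
class Association:
    """A name/value pair; the key is fixed, the value may change."""

    def __init__(self, key, value):
        self._key = key
        self.value = value

    @property
    def key(self):
        return self._key


class Namespace:
    """Mapping-like collection of associations, at most one per key."""

    def __init__(self):
        self._assocs = {}  # key -> Association (a plain dict, for simplicity)

    def __getitem__(self, key):
        return self._assocs[key].value

    def __setitem__(self, key, value):
        assoc = self._assocs.get(key)
        if assoc is None:
            self._assocs[key] = Association(key, value)
        else:
            assoc.value = value  # modify in place so sharers see the change

    def associations(self):
        return list(self._assocs.values())

    def addAssociation(self, assoc):
        self._assocs[assoc.key] = assoc

    def getAssociation(self, key):
        return self._assocs[key]


n1 = Namespace()
n1["limit"] = 100
n2 = Namespace()
n2.addAssociation(n1.getAssociation("limit"))
first = n2["limit"]
n1["limit"] = 200
second = n2["limit"]
```

With this reading, first is 100 and second is 200, matching the printed values in the example above.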
Well, OK, but I argue that the namespace idea is much simpler and more foolproof.
The move to class-based exceptions broke a lot of our code.

Maybe we can drop this point. Do you still think that the namespace idea would break a lot of code?

Jim
I presume __setitem__() creates a new association if there isn't one. I also presume that if an association's value is NULL, it doesn't show up in keys(), values() and items() and it doesn't exist for has_key() or __getitem__(). What does a delitem do? Delete the association or set the value to NULL? I suppose the latter.
For this to work we would have to change the division of labor between the function object and the code object. The code object is immutable and contains no references to mutable objects; this means that it can easily be marshalled and unmarshalled. (Also, when a code object is compiled or unmarshalled, the globals in which its function will be defined may not exist yet.) The function object currently contains a pointer to the code object and a pointer to the dictionary with the globals. (It also contains the default arg values.)

It seems that for associations to work, they need to be placed in the function object, and the code object somehow needs to reference them through the function object.

To make this concrete: if a function references globals a, b, and c, these need to be numbered, and the bytecodes should look like this:

    LOAD_GLOBAL 0     # a
    STORE_GLOBAL 1    # b
    DEL_GLOBAL 2      # c

(This could be compiled from ``b = a; del c''.)

The code object should also contain a list of global names, ordered by their ordinals, e.g. ("a", "b", "c").

Then when the function object is created, it looks in that list and creates a corresponding list of associations, e.g.:

    L = []
    for name in code.co_global_names:
        L.append(globals.getAssociation(name))

The VM then sticks a pointer to this list into the frame, whenever the function is called (instead of the globals dict which it sticks there now), and the LOAD/STORE/DEL_GLOBAL opcodes reference the associations through this list.

Some complications left as exercises:

- The built-in functions (and exceptions, etc.) should also be referenced via associations; the loop above would become a bit trickier since it needs to look in two dicts. (We're assuming that the code generator doesn't know which names are globals and which are built-ins.)

- If the association for a name doesn't yet exist, it should be created.

Note that the semantics are slightly different than currently: the decision whether a name refers to a global or to a built-in is made when the function is defined rather than each time when the name is referenced. This is a bit cleaner -- in the type-sig we're making similar assumptions but the decision is made even earlier.

But, overall the necessary changes to the implementation and to the semantics (e.g. of the 'for' statement) seem prohibitive to me.

I also think that the namespace implementation will be quite a bit less efficient than a regular dictionary: currently, a dictionary entry is a struct of 12 bytes, and the dictionary has an array of these tightly packed. Your association objects will be "real" objects, which means they have a reference count, a type pointer, a key, and a value, i.e. 16 bytes, without counting the malloc overhead; this probably comes in addition to the 12 bytes in the dict entry. (If you want to have the association objects directly in the hash table, they can't be shared between namespaces, and a namespace couldn't grow -- when a dict grows its hash table is reallocated.)
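The binding step, including both "exercises", can be sketched in Python as follows. Everything here is a stand-in chosen for illustration: namespaces are plain dicts mapping names to Association objects, bind_globals is a hypothetical name, and None plays the role of the NULL "not yet bound" value.

```python
class Association:
    """Shared name/value pair (hypothetical, after the proposal above)."""

    def __init__(self, key, value=None):  # None stands in for "unbound"
        self.key = key
        self.value = value


def bind_globals(global_names, module_globals, builtins):
    """Build the per-function list that LOAD/STORE/DEL_GLOBAL would index."""
    slots = []
    for name in global_names:
        if name in module_globals:
            assoc = module_globals[name]
        elif name in builtins:
            assoc = builtins[name]  # chosen once, at function definition
        else:
            assoc = module_globals[name] = Association(name)  # create on demand
        slots.append(assoc)
    return slots


module_globals = {"a": Association("a", 1)}
builtins = {"len": Association("len", len)}
slots = bind_globals(("a", "b", "len"), module_globals, builtins)

slots[1].value = slots[0].value  # LOAD_GLOBAL 0; STORE_GLOBAL 1  (b = a)
module_globals["a"].value = 5    # rebinding a elsewhere is seen via slot 0

# A module-level "len" defined later does not affect the bound slot.
module_globals["len"] = Association("len", "shadow")
```

The last line shows the semantic shift noted above: whether a name means a global or a built-in is fixed when the function is created, not each time the name is used.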
Note that circular references are bad even if we have a more powerful gc.
I don't understand or believe this statement.
If we had more powerful gc the global namespace wouldn't have to be blown away at all (it would gently dissolve when __main__ was deleted from the interpreter).
But its semantics will be harder to explain, because they will no longer be equivalent to

    import spam # assume there's no spam already
    foo = spam.foo
    del spam

Also, we currently *explain* that only objects are shared and name bindings are unique per namespace; this would no longer be true so we would have to explain a much harder rule. ("If you got your foo through an import from another module, assigning to it will affect foo in that other module too; but if you got it through a local assignment, the effect will be local.") All in all, I think these semantics are messy and unacceptable. True, object sharing is hard to explain too (see diagram on Learning Python page 60), but you'll still have to explain that anyway because it still exists within a namespace; but now in addition we'd have to explain that there is an exception to object sharing... Messy, messy.
- Be useful in any application where it's desirable to share a name binding.
I think it's better to explicitly share the namespace -- "foo.bar = 1" makes it clear that whoever else has a reference to foo will see bar similarly changed.
I claim that it's not foolproof at all -- on the contrary, it creates something that hides in the dark and will bite us in the behind by surprise, long after we thought we knew there were no monsters under the bed. (Yes, I've been re-reading Calvin and Hobbes. :-)
It must have been very traumatic that you're still sore over that; it was introduced in 1.5, over two years ago.
Maybe we can drop this point. Do you still think that the namespace idea would break a lot of code?
Yes. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Yes.
Right.
What does a delitem do? Delete the association or set the value to NULL? I suppose the latter.
Good question. I'm inclined to think the former. That is, deleting an item from the namespace would delete the name association. I can see arguments both ways.
Looks good to me. :)
Yup.
Really? Even for Py3K?
I also think that the namespace implementation will be quite a bit less efficient than a regular dictionary:
Spacewise yes. They'd be much faster in use. This is a space/speed tradeoff.
Why not replace the key and value pointers with the association pointer? Then you'd get back a little of the space.
This was discussed at length a year or two ago. You added code to print to stderr when an error occurred in a destructor. People noticed that they were getting errors when Python exited. The problem occurred when a destructor was called after its globals had been deallocated. You subsequently added a lot of extra rules on shutdown to make this much less likely. I don't think you made the problem go away completely. I find circular references to be bad in other ways. For example, they are a pain with deep copy. You can make deep copy do something in the presence of circular references, but the things it does can be quite surprising.
Uh, OK, then we wouldn't have to worry about the global namespace being gently dissolved before a destructor is run during process exit.
Will they really be harder to explain? Why not explain them a different way? "The statement: from spam import foo copies a name binding for foo from module spam to the current module." Eh, I guess I can see why someone would find this harder....
Good point. Perhaps assigning in the client module should break the connection to the other module. This would require some extra magic.
Well, I don't have a problem with object sharing, so the notion of sharing namespaces doesn't bother me. I understand that some folks have a problem with object sharing and I agree that they'd have problems with name sharing. OTOH, I don't think you'd consider the fact that some people have difficulty with object sharing to be sufficient justification for removing the feature from the language.
How so?
I'm not sore. But it was a bigger (IMO) backward incompatibility. Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
Really? Even for Py3K?
The implementation wouldn't be a problem for Py3K; I was under the impression that you thought this could be put in earlier. But the change in semantics is very hard for me to swallow. It definitely seems to become murkier.
Agreed; though the speedup comes from circumventing the dictionary altogether.
Why not replace the key and value pointers with the association pointer? Then you'd get back a little of the space.
Yes; assuming that a speedy getitem is not an issue, the hash table could become an array of pointers to associations.
Yes -- the concept of a name binding as an object in itself is hard; and it's hard to understand why it is needed.
More murkiness.
Object sharing is something you have to learn very early on; something like "objects drop under gravity". Name binding sharing is something that can effectively be skirted initially, but at some later point it bites you (sort of like mutable default arguments do); this is more comparable to discovering Einstein's relativity.
Because the tendency of tutorials will be to avoid mentioning namespaces at all until you get to the appendix at the end titled "Implementation Details."
I'm not sore. But it was a bigger (IMO) backward incompatibility.
Sometimes a bigger incompatibility that is easy to explain is more acceptable than a very subtle one that breaks code in very subtle ways. Anyway, let's drop this comparison; you can't objectively measure how backwards incompatible something is. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Jim proposed adding namespaces to Python 2000. This will be, for my understanding, a complete rewrite and redesign that is allowed to break existing code. It would even be run in parallel to Python 1.6++ for a while, right?
I do not even believe in the space ineffectiveness. The namespace concept works fine without a dictionary and hashes at all. We can implement this as a linear list of pointers to namespace objects, since they are looked up only once, usually. But even if we would keep a dictionary-like structure as well, it is possible to implement it as an array of pointers, and you get things smaller than now, not bigger. In your analysis, you forgot to take into account that the average dictionary slot overhead gives a factor of about two.

    1 dict slot = 3 words
    n dict entries = average 2n dict slots = 6n words

versus

    1 asso object = <ref, type, key, value> = 4 words
    n asso dict entries = average 2n words + n asso objects

This gives 6n words for the proposed solution, actually as effective as today's 6n solution with dicts. Ahem :-)

It can of course be that we also need a hash field, which can be stored in the asso object. This is another word per element, so we'd have a cost increase of 1/6.

As said, the dictionary is not necessary and could be created on demand (that is, if globals are really used like a dict). Without it, I count just n + 4n = 5n, actually a saving.

This idea bears a lot of potential to also speed up classes and instances. Further analysis is needed, please let us not drop this idea too early.

ciao - chris -- Christian Tismer :^) <mailto:tismer@appliedbiometrics.com> Applied Biometrics GmbH : Have a break! Take a ride on Python's Düppelstr. 31 : *Starship* http://starship.python.net 12163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home
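The word counts in the message above can be checked with a little arithmetic. The helper names and the default parameters below are assumptions made for this sketch; they simply encode the figures quoted in the message (3 words per dict slot, 4 words per association object, a table about twice as large as the number of entries).

```python
def words_plain_dict(n, fill_factor=2, slot_words=3):
    """Today's layout: about 2n slots of 3 words each."""
    return fill_factor * n * slot_words


def words_asso_dict(n, fill_factor=2, asso_words=4, with_hash=False):
    """Table of pointers (about 2n words) plus n association objects."""
    per_object = asso_words + (1 if with_hash else 0)
    return fill_factor * n + per_object * n


def words_asso_list(n, asso_words=4):
    """No hash table at all: n pointers plus n association objects."""
    return n + asso_words * n


n = 100
plain = words_plain_dict(n)                    # 6n
asso = words_asso_dict(n)                      # 2n + 4n = 6n
asso_hashed = words_asso_dict(n, with_hash=True)  # 2n + 5n = 7n
linear = words_asso_list(n)                    # n + 4n = 5n
```

With the stored hash the association layout costs 7n words against 6n, which is the 1/6 increase mentioned in the message; the estimate ignores malloc overhead per object, which the earlier objection explicitly counted.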
(Math about space savings gracefully accepted.)
I will gladly use it as an implementation strategy "under the hood" if it makes sense -- I think that with some code analysis (e.g. "what are the globals here") it could be made to work well. But I don't think that changing the "from M import v" semantics so that local assignment to v changes the binding of M.v as well is defensible. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Pleased to hear this. Thank you for keeping it alive.
Of course this may be the weakest point yet; while it was what started the reasoning in the first place, it is less important now. :-) ciao - chris
Christian Tismer wrote:
(snip)
Actually, I've been talking about this in various venues for about 4 years. :) This thread is the first time I've mentioned it in the context of import. Jim
Guido van Rossum wrote:
(snip)
I agree; however, I think that having:

    from M import v

causing a name binding that is broken by local assignment to v *is* defensible and reasonably implementable. Changes to 'v' in M (including by reload of M) would be reflected locally unless someone did:

    v=something

locally. Local assignment would negate an import, as it does now. Jim
[me]
Hm, but it still wouldn't have the same semantics as currently, and that's still a monster hiding under the bed until you're nearly asleep. Consider this example:

    # in M:
    verbose = 1

    # in __main__:
    from M import verbose

    # somewhere else:
    M.verbose = 0

Under the current semantics, that would have no effect on verbose in __main__; but with your semantics it would. I think that is very hard to explain; even more so if you say that assigning a different value to __main__.verbose does not change M.verbose and furthermore breaks the connection. This means that if I add

    verbose = verbose

to the __main__ code the semantics are different! I don't understand why you wanted these semantics in the first place. --Guido van Rossum (home page: http://www.python.org/~guido/)
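The current behaviour in the example above is easy to confirm. The following sketch builds a throwaway module object named `M` in memory instead of using a file on disk; registering it in `sys.modules` and the `main_globals` dictionary standing in for `__main__` are conveniences chosen for this illustration.

```python
import sys
import types

# Build module M in memory and register it so "from M import ..." finds it.
M = types.ModuleType("M")
M.verbose = 1
sys.modules["M"] = M

# Stand-in for the __main__ namespace.
main_globals = {}
exec("from M import verbose", main_globals)

# "somewhere else": rebind the name inside M.
M.verbose = 0

# The importing namespace still holds its own binding to the old object.
seen_in_main = main_globals["verbose"]

del sys.modules["M"]  # clean up the throwaway registration
```

Here `seen_in_main` stays `1`: `from M import verbose` copied the binding once, so rebinding `M.verbose` afterwards does not reach the importer.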
Guido van Rossum wrote:
Agreed. Think Python 3000. I think that the semantics differ in boundary cases though.
Yup.
I'm suggesting a model where "from M import x" has a different meaning than it does now. I think the notion of sharing a name is useful. I'll admit that using "M.x" achieves the same thing, although at a higher performance cost (and, OK, typing cost ;).
I don't understand why you wanted these semantics in the first place.
First, let me say that this isn't super important to me. It does solve a problem with reload, which is the context in which I brought it up. Now, consider:

    from M import x
    .....
    use(x)

Many people would (wrongly) consider this to be equivalent to:

    import M
    .....
    use(M.x)

In fact, I'd *prefer* these to be equivalent even in the face of changes to M (e.g. reload). I'd prefer different semantics. Note that if I had:

    from M import x
    .....
    x=y
    .....
    use(x)

I'd no longer expect x to have any connection to M. Of course:

    x=x

or

    x=M.x

would be a bit more puzzling, but then they're meant to be. ;) They are addressed by a simple rule, which is that assignment in a module overrides imported name definition.

Hm...ooh ooh A better solution would be to disallow assignments to imported names, as they are very likely to be errors. This could be detected without any fancy type inferencing. In fact, we could also decide to disallow an import-from to override an existing name binding. Ahhhhhh. :)

In any case, I'd feel comfortable explaining a system in which

    from M import x # reference semantics wrt name

had a different meaning from:

    import M
    x=M.x # copy semantics

since I expect an attribute access to give me a value, not a name, whereas:

    from M import x

seems more to me like it's talking about names. Jim
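The rule proposed in this message (an imported name shares the exporter's binding until the importer assigns to it) can be modelled with a toy namespace. Everything below is hypothetical: `Cell`, `SharedNamespace`, `import_from` and `assign` are names made up for the sketch, and real module namespaces do not behave this way.

```python
class Cell:
    """A name binding that several namespaces may share."""

    def __init__(self, value):
        self.value = value


class SharedNamespace:
    def __init__(self):
        self._cells = {}
        self._imported = set()

    def import_from(self, other, name):
        # "from other import name": share the binding, not just the object.
        self._cells[name] = other._cells[name]
        self._imported.add(name)

    def assign(self, name, value):
        cell = self._cells.get(name)
        if cell is None or name in self._imported:
            # Local assignment overrides the import and breaks the link.
            self._cells[name] = Cell(value)
            self._imported.discard(name)
        else:
            cell.value = value

    def lookup(self, name):
        return self._cells[name].value


M = SharedNamespace()
main = SharedNamespace()

M.assign("x", 1)
main.import_from(M, "x")

M.assign("x", 2)                 # e.g. a reload of M
after_reload = main.lookup("x")  # the importer sees the new value

main.assign("x", 99)             # local assignment negates the import
M.assign("x", 3)
after_local_assign = main.lookup("x")
```

In this model `after_reload` is `2` and `after_local_assign` is `99`, while `M.lookup("x")` is `3`. Note that `main.assign("x", main.lookup("x"))` would silently cut the link, which is the `verbose = verbose` surprise raised earlier in the thread.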
"JF" == Jim Fulton <jim@digicool.com> writes:
JF> I'm suggesting a model where "from M import x" has a different
JF> meaning than it does now. I think the notion of sharing a name
JF> is useful. I'll admit that using "M.x" achieves the same thing,
JF> although at a higher performance cost (and, OK, typing cost ;).

This seems to contradict the 2nd Pythonic principle: Explicit is better than implicit. I don't literally mean to argue that "The Python Way" should be used to make design decisions, but it captures exactly what makes me uncomfortable with the proposed change.

[someone else, who could have been channeling me, wrote:]
I don't understand why you wanted these semantics in the first place.
JF> First, let me say that this isn't super important to me. It

Glad to hear it <0.3 wink>!

JF> does solve a problem with reload, which is the context in which
JF> I brought it up.

I don't think the reload problem is important enough to justify a change to name binding rules.

[much omitted]

JF> In any case, I'd feel comfortable explaining a system in which
JF>     from M import x # reference semantics wrt name
JF> had a different meaning from:
JF>     import M
JF>     x=M.x # copy semantics
JF> since I expect an attribute access to give me a value, not a
JF> name, whereas:
JF>     from M import x
JF> seems more to me like it's talking about names.

I think the proposed change muddies the semantics of assignment, and I would not feel comfortable trying to explain it. I don't have the same impression vis a vis import and names; I think this is why I disagree and have heretofore been puzzled about why you want this. Assignment binds a name to an object; import is just a variant of assignment. There is no need for a special case.

One other worry: How would it create a copy of an object in general? How do you copy a class object or a file or a socket? Since (1) you can't restrict which types of objects are exported by a module and (2) there is no clear definition of copy that applies to any type of object, I don't see how these semantics could be defined.

Jeremy
[Jeremy Hylton, on JimF's association objects wrt "from M import x" semantics]
Go ahead & argue it: they were *intended* to be used to guide design decisions! They were my best shot at summarizing what I've learned in a decade of (mostly successful) Guido-channeling. But note that I only listed 19 of the 20 Pythonic Theses: the 20th was left blank, for Guido to fill in however he likes whenever the other 19 suggest a direction he dislikes <wink>. Other relevant theses here include the ones about whether the implementation is, or isn't, easy to explain. I'm suffering an email backlog and haven't yet studied the latest batch on this topic, but a quick skim sure suggests that a concrete implementation isn't self-evident, and its implications perhaps downright subtle regardless. not-a-conclusion-just-a-concern-ly y'rs - tim
[Jim, Chris & Guido discussing a namespace idea] Guys, I'm lost. Please help me understand this idea from the start. After rereading this whole thread, I have only a vague intuition of what Jim has proposed, but I fail to understand it; and believe me, I'm very interested in being in sync with you on the subject. Please filter the concept from the consequences and resubmit it once again (in English, through examples, ascii art, whatever). Thanks. -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252
Howdy, lemme try... Vladimir Marangozov wrote:
Naming it differently than before, I think this formulation hits the nail on the head: Jim proposes a construction that yields early binding of names, while late binding of values. Sample layout of an association object: <type, refcnt, key, value> with the same key semantics as for dicts. The key object is assigned when the association object is created, that is when the name is seen the first time. The value is still NULL until assigned. Now assume a namespace object as a collection of pointers to asso objects, and assume that it is only extended, nothing deleted. Then, a code object can refer to such a namespace object by giving the index of the asso object. Since the construction of these objects will occur in the same order after marshal/pickling, the offsets in a code object will always be correct. This means: There is a construction that allows a name to be settled as soon as it is seen, without necessarily assigning a value. When a function at compile time sees a global, it tries to resolve it by finding an association object in the module's global scope that contains this name. If not found, it is created.
[Chris comes to my rescue on Jim's namespace idea] Christian Tismer wrote:
Ahaa. Got it. Thank you Chris! So naming is the same. Binding and name resolution are different. This is certainly a valuable idea in some foreseeable situations (like the globals pre-binding for a code object you're describing -- sort of a cache/array for globals, with initially invalidated entries). But the problem is that this indirection has so much power in it, that generalizing it to all namespaces seems to hide all kinds of surprises. I'm not in a position even to figure out what the implications could be (it smells "out of bounds"), but it certainly needs more digging. I suspect that if it turns out that these intermediate contexts cannot be generalized, their implementation may be compromised for the few identified cases where they are expected to be useful.
hoping it was clear enough - ciao - chris
Yes, but embracing it all is still a "so-so"... -- Vladimir
[Jim Fulton]
Jim, I've been intrigued by this idea for all the years you've been suggesting it <wink>, but I've never understood what it is you're proposing! This is the Python-Dev list, so feel encouraged to present it in concrete implementation terms instead of ambiguous English. Or maybe an interface?

    interface a_kind_of_dictionary_that_is_a_collection_of_\
              association_objects:
        # ??? beats me ...

Or maybe as a C struct? For example, is "an association object" a (char*, PyObject*) pair? Does this kind of dictionary have keys? If so, of what type? What type are the values? Best I can make sense of the above, the values are "association objects", each of which contains a name and a value, and a key is maybe a duplicate of the name in the association object to which it maps. "A name" may or may not be a string -- I can't tell. Or maybe by "dictionary" you didn't intend Python's current meaning for that word at all. I assume "a value" is a PyObject*. The whole thrust *appears* to be to get names to map to a PyObject** instead of PyObject*, but if that's the ticket I don't know what association objects have to do with it.
Where does the idea that 'spam' is a *module* here come from? It doesn't make sense to me, and I'm so lost I'll spare everyone my further confusions <wink>. suspecting-the-last-actually-doesn't-make-any-sense<wink>-ly y'rs - tim
Tim Peters wrote:
[Jim Fulton]
[association objects]
My guess is: An association object adds another level of indirection to namespaces and makes global variables be more like true variables, i.e. changing them in one place changes them everywhere.
I don't believe that the actual implementation matters too much and is still open to be chosen. Here is my approach: Let an association object be a pair of a key and a value. The restrictions for keys may be the same as for dict keys. We can now either use dicts as they are, inserting asso-objects as values and sharing the key field, or invent new dictionaries which have no key/value pairs at all, but just references to asso-objects. In either case, we have the advantage that further references by global use from a function or by imports will always add to the asso-object, not to its value. This keeps the value changeable, like a list with one element, kind of boxed object. Since the asso-objects stay alive as long as they are referenced, they are never moved, and it is ok to refer to their address. For a function, this means that it can resolve a global at compile time. If the asso-object exists already, it has a fixed memory address and can be placed into the code object. If it does not exist, it can be created in the global dictionary or special asso-dictionary, whatever we'll use. The value will be NULL in this case, and this is perfect. If we do right, a value will have been inserted before the function is called, or we will raise a name error. The idea is simply to generate fixed slots for global names which never move. By mentioning the name, we create such a slot. The slot is alive as long as it is seen, i.e. refcount > 0. There must be a general way to look these things up, either by the per-module dictionary, or by a specialized one. Finally I'd tend to do the latter, since those uninitialized key/value asso-objects would give ambiguity what dict.keys() should be then. For consistency, I would hide all asso-objects in a special asso-collection per module. They could be placed into the module's dict when their value is first assigned. Alternatively, they are not created at compile time but at runtime, when a value is assigned. I'm not sure yet.
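The "fixed slot" idea in the message above, including the worry about what `keys()` should report, can be pictured as follows. `Slot` and `AssoCollection` are hypothetical names for this sketch, and hiding unassigned slots from `keys()` is just one of the two options the message leaves open.

```python
_NULL = object()  # marks a slot whose value has not been assigned yet


class Slot:
    """A boxed value: the box never moves, only its content changes."""

    __slots__ = ("key", "value")

    def __init__(self, key):
        self.key = key
        self.value = _NULL

    def read(self):
        if self.value is _NULL:
            raise NameError(self.key)
        return self.value


class AssoCollection:
    """Per-module collection of slots, kept apart from the visible dict."""

    def __init__(self):
        self._slots = {}

    def mention(self, name):
        # Mentioning a name (e.g. at compile time) creates its slot.
        return self._slots.setdefault(name, Slot(name))

    def keys(self):
        # Only names that already have a value are reported.
        return [k for k, s in self._slots.items() if s.value is not _NULL]


module = AssoCollection()

# "Compile time": a function mentions a global that has no value yet.
helper_slot = module.mention("helper")
visible_before = module.keys()

# Later the module assigns the name; the function's slot is the same object.
module.mention("helper").value = len
same_slot = module.mention("helper") is helper_slot
resolved = helper_slot.read()
```

Calling `helper_slot.read()` before the assignment would raise `NameError`, which corresponds to the "or we will raise a name error" case in the message.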
Now, moving on from globals to all name spaces: If they are all handled by the asso-approach, can we use it to speed up attribute access for classes and instances? I guess we can! But I need more thought.
    def swap_words(str, one, two):
        pieces = string.split(str, one)
        for i in range(len(pieces)):
            pieces[i] = string.replace(pieces[i], two, one)
        return string.join(pieces, two)

    sentence = swap_words(sentence, "'foo'", "'spam'")

ciao - chris
Christian Tismer wrote:
Ugh. Sorry to make you guess....
Right.
Right. Replace dict entries with association object pointers.
In either case, we have the advantage that further references by global use from a function or by imports will always add
to the refcount of
Yup.
exactly. You are a great guesser! :)
Yes, it needs more thought.
Ooh ooh, you've invented a 'Jim translator bot'! Jim
Tim Peters wrote:
I just responded to Guido in a bit more detail. Hopefully, this will be of sufficient clarity. If not, then I'll be happy to work up a Python demonstration.
Does this kind of dictionary have keys?
Yes.
If so, of what type?
Whatever you want. Just like a dictionary.
What type are the values?
ditto.
Sorry. See my reply to Guido and let me know if I'm still being too vague.
There might be another solution. When you reload a module, the module object and its dictionary are reused. Perhaps class and function objects could similarly be reused? It would mean that a class or def statement looks for an existing object with the same name and type, and overwrites that. Voila, all references are automatically updated. This is more work (e.g. for classes, a new bytecode may have to be invented because the class creation process must be done differently) but it's much less of a hack, and I think it would be more reliable. (Even though it alters borderline semantics a bit.) (Your extra indirection also slows things down, although I don't know by how much -- not just the extra memory reference but also less locality of reference so more cache misses.) --Guido van Rossum (home page: http://www.python.org/~guido/)
[Guido, on Andrew's idea for automagically updating classes]
Too dangerous, I think. While uncommon in general, I've certainly seen (even written) functions that e.g. return a contained def or class. The intent in such cases is very much to create distinct defs or classes (despite having the same names). In this case I assume "the same name" wouldn't *usually* be found, since the "contained def or class"'s name is local to the containing function. But if there ever happened to be a module-level function or class of the same name, brrrr. Modules differ because their namespace "search path" consists solely of the more-global-than-global <wink> sys.modules.
How about an explicit function in the "new" module,

    new.update(class_or_def_old, class_or_def_new)

which overwrites old's guts with new's guts (in analogy with dict.update)? Then no semantics change and you don't need new bytecodes. In return, a user who wants to e.g. replace an existing class C would need to do

    oldC = C
    do whatever they do to get the new C
    new.update(oldC, C)

Building on that, a short Python loop could do the magic for every class and function in a module; and building on *that*, a short "updating import" function could be written in Python. View it as providing mechanism instead of policy <0.9 wink>.
Across the universe of all Python programs on all platforms, weighted by importance, it was a slowdown of nearly 4.317%. if-i-had-used-only-one-digit-everyone-would-have-known-i-was-making-it-up<wink>-ly y'rs - tim
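A rough idea of what such an update function could look like in present-day Python is sketched below. The `update` function here is hypothetical: the proposed `new.update` was never part of the standard library as far as this thread shows, and this version assumes plain classes and plain functions without closures (assigning `__code__` raises `ValueError` when the number of free variables differs).

```python
import types

# Descriptors that belong to one particular class object and must not be copied.
_SKIP = {"__dict__", "__weakref__"}


def update(old, new):
    """Overwrite old's guts with new's guts, keeping old's identity."""
    if isinstance(old, types.FunctionType):
        old.__code__ = new.__code__
        old.__defaults__ = new.__defaults__
        old.__kwdefaults__ = new.__kwdefaults__
        old.__dict__.clear()
        old.__dict__.update(new.__dict__)
        return old

    # Classes: drop attributes that disappeared, then copy the new ones over.
    for name in list(vars(old)):
        if name not in _SKIP and name not in vars(new):
            delattr(old, name)
    for name, value in vars(new).items():
        if name not in _SKIP:
            setattr(old, name, value)
    return old


class C:
    def greet(self):
        return "old"


obj = C()        # an instance created before the update
oldC = C


class C:         # "do whatever they do to get the new C"
    def greet(self):
        return "new"

    def extra(self):
        return 42


update(oldC, C)

greeting = obj.greet()   # the existing instance now runs the new method
extra_value = obj.extra()


def f(x):
    return x + 1


alias = f


def f2(x, y=10):
    return x + y


update(alias, f2)
alias_result = alias(1)
```

Existing instances keep pointing at the old class object, so no scan of live objects is needed; only the contents of that object change. Base classes, metaclasses and `__slots__` layouts are deliberately left out of this sketch.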
Oh man, oh man... I think this is where I get to say something akin to "I told you so." :-) I already described Tim's proposal in my type proposal paper, as a way to deal with incomplete classes. Essentially, a class object is created "empty" and is later "updated" with the correct bits. The empty class allows two classes to refer to each other in the "recursive type" scenario. In other words, I definitely would support a new class object behavior that allows us to update a class' set of bases and dictionary on the fly. This could then be used to support my solution for the recursive type scenario (which, in turn, means that we don't have to introduce Yet Another Namespace into Python to hold type names). Note: I would agree with Guido, however, on the "look for a class object with the same name", but with the restriction that the name is only replaced in the *target* namespace. i.e. a "class Foo" in a function will only look for Foo in the function's local namespace; it would not overwrite a class in the global space, nor would it overwrite class objects returned by a prior invocation of the function. Cheers, -g On Thu, 20 Jan 2000, Tim Peters wrote:
-- Greg Stein, http://www.lyra.org/
[Greg Stein]
Parenthetically, I never grasped the appeal of the parenthetical comment. Yet Another Namespace for Yet Another Entirely New Purpose seems highly *desirable* to me! Trying to overload the current namespace set makes it so much harder to see that these are compile-time gimmicks, and users need to be acutely aware of that if they're to use it effectively. Note that I understand (& wholly agree with) the need for runtime introspection. different-things-different-rules-ly y'rs - tim
On Fri, 21 Jan 2000, Tim Peters wrote:
And that is the crux of the issue: I think the names that are assigned to these classes, interfaces, typedefs, or whatever, can follow the standard Python semantics and be plopped into the appropriate namespace. There is no overloading.

The compile-time behavior certainly understands what names have what types; in this case, if a name is a "typedecl", then it can remember the *value*, too. When the name is used later, it knows the corresponding value to use. For instance:

    IntOrString = typedef int|str

    def foo(x: IntOrString):
        ...

In this example, the type-checker knows that IntOrString is a typedecl. It also knows the *value* of "int|str" so the name IntOrString now has two items associated with it at type-check time:

    # not "real" syntax, but you get the idea...
    namespace["IntOrString"] = (TypeDeclarator, int|str)

With the above information in hand, the type-checker knows what IntOrString means in the declaration for foo().

The cool benefit is that the runtime semantics are exactly as you would expect: a typedecl object is created and assigned to IntOrString. That object is also associated with the "x" argument in the function object referred to by the name "foo".

There is no "overloading" of namespaces. We are using Python namespaces just like they should be, and the type-checker doesn't even have to be all that smart to track this stuff.

To get back to the recursive class problem, consider the following code:

    decl incomplete class Foo
    decl incomplete class Bar

    class Foo:
        decl a: Bar

    class Bar:
        decl b: Foo

The "decl" statements would create an empty class object and store that into the "current" namespace. There is no need to shove that off into another namespace. When the "class Foo" comes along, the class object is updated with the class definition for Foo.

It is conceivable to remove the need for "decl" if you allow "class" and "def" to omit the ": suite" portion of their grammar:

    class Foo
    class Bar

    class Foo:
        decl a: Bar
    ...

    def some_function(x: some_type, y: another_type) -> third_type

    ... lots o' code ...

    def some_function(x, y):
        ...

Guido suggested that it may be possible to omit "decl" altogether. Certainly, it can work for member declarations such as:

    class Foo:
        a: Bar

Anyhow... my point is that a new namespace is not needed. Assuming we want objects for reflection at runtime, then the above proposal states *how* those objects are realized at runtime. Further, the type-checker can easily follow that information and perform the appropriate compile-time checks.

No New Namespaces! (lather, rinse, repeat)

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
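For comparison, here is a small sketch of the same idea in present-day Python 3. It uses function annotations and the `typing` module, both of which postdate this thread and are not the `typedef` syntax proposed above; the names `IntOrString` and `foo` are borrowed from the example. The point it illustrates is the one argued here: the type declaration is an ordinary object bound to an ordinary name, and the function object refers to that same object.

```python
from typing import Union

# The alias is a normal runtime object stored in the module namespace.
IntOrString = Union[int, str]


def foo(x: IntOrString) -> None:
    ...


# The annotation attached to the function is the very object the name refers to.
annotation = foo.__annotations__["x"]
same_object = annotation is IntOrString
print(same_object)
```

A static checker can resolve `IntOrString` by following the assignment, while reflection at runtime sees the object through `foo.__annotations__`; no second namespace is involved.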
Greg Stein wrote:
This is indeed the crux of the issue. For those who missed it last time, it became very clear to me that we are working with radically different design aesthetics when we discussed the idea of having an optional keyword that said: "this thing is usually handled at compile time but I want to handle it at runtime. I know what I am doing." Greg complained that that would require the programmer to understand too much about what was being done at compile time and what at runtime.
From my point of view this is *exactly* what a programmer *needs* to know and if we make it too hard for them to know it then we have failed.
There is an overloading of namespaces because we will separately specify the *compile time semantics* of these names. We need to separately specify these semantics because we need all compile time type checkers to behave identically. Yes, it seems elegant to make type objects seem as if they are "just like" Python objects. Unfortunately they aren't. Type objects are evaluated -- and accepted or rejected -- at compile time. Every programmer needs to understand that and it should be blatantly obvious in the syntax, just as everything else in Python syntax is blatantly obvious. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Earth will soon support only survivor species -- dandelions, roaches, lizards, thistles, crows, rats. Not to mention 10 billion humans. - Planet of the Weeds, Harper's Magazine, October 1998
Paul Prescod wrote:
Yes, it seems elegant to make type objects seem as if they are "just like" Python objects. Unfortunately they aren't.
Yes they are. They even have a type, TypeType. -- John (Max) Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850 homepage: http://www.maxtal.com.au/~skaller download: ftp://ftp.cs.usyd.edu/au/jskaller
Agreed that that would be bad. But I wouldn't search outer scopes -- I would only look for a class/def that I was about to stomp on.
Modules differ because their namespace "search path" consists solely of the more-global-than-global <wink> sys.modules.
"The search path doesn't enter into it."
Only a slight semantics change (which my full proposal would require too): function objects would become mutable -- their func_code, func_defaults, func_doc and func_globals fields (and, why not, func_name too) should be changeable. If you make all these assignable, it doesn't even have to be a privileged function.
That's certainly a reasonable compromise. Note that the update on a class should imply an update on its methods, right? --Guido van Rossum (home page: http://www.python.org/~guido/)
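As a rough illustration of what an explicit update of a function could look like once those fields are assignable: in present-day Python 3 the attributes are spelled `__code__`, `__defaults__`, `__kwdefaults__` and `__doc__`, and all of them can be assigned (`__globals__` remains read-only). `update_function` below is a hypothetical helper written for this sketch, not the `new.update` proposed in the thread, and it assumes neither function is a closure.

```python
def update_function(old, new):
    """Copy the implementation of `new` onto the existing function object `old`.

    Every reference to `old` keeps pointing at the same object, but calls
    through it now run the new code.
    """
    old.__code__ = new.__code__
    old.__defaults__ = new.__defaults__
    old.__kwdefaults__ = new.__kwdefaults__
    old.__doc__ = new.__doc__
    old.__dict__.update(new.__dict__)


def greet(name="world"):
    """Original version."""
    return "hello " + name


alias = greet  # a reference held somewhere else in a long-running program


def greet_v2(name="there"):
    """Updated version."""
    return "hi " + name


update_function(greet, greet_v2)
print(alias())  # hi there
```

Because the function object itself is mutated, nothing has to hunt down the references held by other modules or instances.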
[Tim worries about stomping on unintended classes/defs] [Guido]
Maybe I just don't grasp what that means, exactly. Fair enough, since I'm not expressing myself clearly either! Suppose someone does

    from Tkinter import *

in my.py, and later in my.py just *happens* to define, at module level,

    class Misc:
        blah blah blah

Now Misc was already in my.py's global namespace because Tkinter.py just happens to export a class of that name too (more by accident than design -- but accidents are what I'm most worried about here). At the time my.py defines Misc, does Misc count as a class we're "about to stomp on"? If so-- & I've assumed so --it would wreak havoc.

But if not, I don't see how this case can be reliably distinguished "by magic" from the cases where update is desired (if people are doing dynamic updates to a long-running program, a new version of a class can come from anywhere, so nothing like original file name or line number can distinguish correctly either).
"The search path doesn't enter into it."
I agree, but am at a loss to describe what's happening in the case above using other terminology <wink>. In a sense, you need a system-wide "unique handle" to support bulletproof updating, and while sys.modules has supplied that all along for module objects (in the form of the module name), I don't believe there's anything analogous to key off of for function or class objects.
[suggesting] new.update(class_or_def_old, class_or_def_new)
Of course I meant "no new semantics" in the sense of "won't cause current exception-free code to alter behavior in any way".
If you make all these assignable, it doesn't even have to be a privileged function.
I'm all for that!
[sketching a Python approach to "updating import/reload" building on the hypothetical new.update]
Hadn't considered that! Of course you're right. So make it a pair of nested loops <wink>. so-long-as-it-can-be-written-in-python-it's-easy-ly y'rs - tim
For a second I thought you got me there!
Fortunately, there's magic available: recently, all classes have a __module__ attribute that is set to the full name of the module that defined the class (its key in sys.modules). For functions, we would have to invent something similar.
--Guido van Rossum (home page: http://www.python.org/~guido/)
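A small sketch of how `__module__` separates the accidental clash from a genuine redefinition. The module names `toolkit_demo` and `app_demo` and the helpers `make_module` and `is_redefinition` are invented for this illustration; the modules are built in memory so the example is self-contained. (In present-day Python, functions carry a `__module__` attribute as well.)

```python
import sys
import types


def make_module(name, source):
    """Create an in-memory module, register it in sys.modules, run its source."""
    module = types.ModuleType(name)
    sys.modules[name] = module
    exec(source, module.__dict__)
    return module


def is_redefinition(existing, defining_module):
    """True only if `existing` was defined by the module now redefining the name."""
    return getattr(existing, "__module__", None) == defining_module


# toolkit_demo exports a class named Misc; app_demo star-imports it by accident.
toolkit = make_module("toolkit_demo", "class Misc:\n    pass\n")
app = make_module("app_demo", "from toolkit_demo import *\n")
imported_misc = app.Misc

# app_demo now defines its own, unrelated Misc at module level.
exec("class Misc:\n    pass\n", app.__dict__)
own_misc = app.Misc

print(imported_misc.__module__)  # toolkit_demo
print(own_misc.__module__)       # app_demo
```

An automatic updater keyed on `__module__` would leave the imported `Misc` alone, because it was defined in a different module than the one executing the new `class` statement.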
[Tim, still worried about stomping on unintended classes/defs] [example abusing Tkinter.Misc]
For a second I thought you got me there!
That's twice as long as I thought you'd think that, so I win after all <wink>.
OK! I didn't know about class.__module__ -- I hope you realize that relying on your time machine is making you lazy <wink>.

I remain uncomfortable with automagic updating, but not as much so. Both kinds of errors still seem possible to me:

1. Automagically updating when it wasn't wanted. Examples of this are getting harder to come by <wink>. Off the top of my head I'm reduced to stuff like this:

"That kind of thing" has got to be rare, but can't be non-existent either (well, isn't -- I've done it).

2. Failing to automagically update when it was wanted. Implicit in the discussion so far is that long-running systems want to update code at a granularity no finer than module level. Is that realistic? I'm unsure. It's certainly easy to *imagine* the app running an updater server thread, accepting new source for functions and classes, and offering to compile and install the objects.

Under the explicit new.update scheme, such a service needn't bother clients with communicating the full name of the original module; heck, in a *truly* long-running app, over time the source tree will change, and classes and functions will migrate across modules. That will be a problem for the explicit scheme too (how does it know *which* "class Misc" to update) -- but at least it's an explicit problem then, and not a "mysterious failure" of hidden magic.

I could live with both of those (#1 is more worrisome); but think it easier all around to give the users some tools and tell them to solve the problems however they see fit.

or-maybe-we-already-agreed-about-that-ly y'rs - tim
"TP" == Tim Peters <tim_one@email.msn.com> writes:

    TP> Under the explicit new.update scheme, such a service needn't
    TP> bother clients with communicating the full name of the
    TP> original module; heck, in a *truly* long-running app, over
    TP> time the source tree will change, and classes and functions
    TP> will migrate across modules. That will be a problem for the
    TP> explicit scheme too (how does it know *which* "class Misc" to
    TP> update) -- but at least it's an explicit problem then, and not
    TP> a "mysterious failure" of hidden magic.

I completely agree. I think in general, such long running apps are rare, and in those cases you probably want to be explicit about when and how the updates occur anyway.

The one place where automatic updates would be convenient would be at the interactive prompt, so it might be nice to add a module that could be imported by PYTHONSTARTUP, and play hook games to enable automatic updates.

-Barry
[Barry A. Warsaw]
I completely agree.
That's no fun <wink>.
I think in general, such long running apps are rare,
By definition, they're non-existent under Windows <0.7 wink>. But it depends on which field you're working in. The closer you get to being part of a business or consumer service, the more important it gets; e.g., I've seen serious RFQs for software systems guaranteed to suffer no more than 5 minutes of downtime per *year* (& stiff penalties for failure to meet that). I've never been on the winning end of such an RFQ, so am not sure what it takes to meet it. It's interesting to ponder. Psion has published a little about the software techniques they use in their PDAs (my Psion 3a's remarkably capable "Agenda" app has been running non-stop for a bit over 3 years!).
and in those cases you probably want to be explicit about when and how the updates occur anyway.
My guess is you'd want to be *paranoidly* explicit, leaving nothing to chance.
Returning the favor, I completely agree. The single thing people at work gripe most about is how to do development under IDLE in such a way that their package-laden systems exhibit the hoped-for changes in response to editing a module deep in the bowels of the system. I don't have a *good* answer to that now; reduced to stuff like writing custom scripts to selectively clear out sys.modules. non-stop-ly y'rs - tim
On Thu, 20 Jan 2000, Guido van Rossum wrote:
func.func_globals

__module__ and func_globals can prevent *other* modules from redefining something accidentally, but it doesn't prevent Badness from within the module.

[ Tim just posted an example of this: his "def adder()" example... ]

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
"A.M. Kuchling" wrote:
In the case of Zope, if the objects that you care about happen to be persistent objects, then it's relatively easy to arrange to get the objects flushed from memory and reloaded with the new classes. (There are some subtle issues to deal with, like worrying about multiple threads, but in a development environment, you can deal with these, for example, by limiting the server to one thread.)

Note that this is really only a special case of a much larger problem. Reloading a module redefines the global variables in a module. It doesn't update any references to those globals held in other places, such as instances or *other* modules.

For example, imports like:

    from foo import spam

are not updated when foo is reloaded.

Maybe you are expecting too much from reload.

Jim

--
Jim Fulton           mailto:jim@digicool.com
Technical Director   (888) 344-4332            Python Powered!
Digital Creations    http://www.digicool.com   http://www.python.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
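The stale-reference effect is easy to reproduce. The sketch below simulates a reload by re-executing new source in the namespace of an existing in-memory module, which is in essence what reloading does; the helper `load` and the module contents are made up for the demonstration.

```python
import types


def load(module, source):
    """Execute `source` in the module's existing namespace, as a reload would."""
    exec(source, module.__dict__)


foo = types.ModuleType("foo")
load(foo, "def spam():\n    return 'old'\n")

# The effect of `from foo import spam`: copy the current object reference.
spam = foo.spam

# Simulate editing foo.py and reloading it.
load(foo, "def spam():\n    return 'new'\n")

print(foo.spam())  # new
print(spam())      # old
```

The name `spam` in the importing namespace still refers to the function object that existed before the reload; only lookups that go through `foo` see the new definition.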
Jim Fulton wrote:
A change to the way that namespaces are handled could make this work and have a number of other benefits, like global name usage without namespace lookups. I've suggested this to Guido in the past. His reasonable response is that this would be too big a change for Python 1. Maybe this is something to consider for Python 2?

The basic idea (borrowed from Smalltalk) is to have a kind of dictionary that is a collection of "association" objects. An association object is simply a pairing of a name with a value. Association objects can be shared among multiple namespaces. An import like:

    from foo import spam

would copy the association between the name 'spam' and a value from module 'foo' into the current module. If foo is reloaded or if the name is reassigned in foo, the association is modified and the change is seen in any namespaces that imported spam.

Similarly if a function uses a global variable:

    spam=1

    def bar():
        global spam
        return spam*2

the compiled function contains the association between spam and its value. This means that:

- When spam is used in the function, it doesn't have to be looked up,

- The function object no longer needs to keep a reference to its globals. This eliminates an annoying circular reference.

(I would not replace existing dictionaries with this new kind. I'd have both kinds available.)

I think that this would be a really nice change for Python 2.

Jim
Note: from now on the new name for Python 2 is Python 3000. :-)
I've never liked this very much, mostly because it breaks simplicity: the idea that a namespace is a mapping from names to values (e.g. {"limit": 100, "doit": <function...>, ...}) is beautifully simple, while the idea of inserting an extra level of indirection, no matter how powerful, is much murkier.

There's also the huge change in semantics, as you point out; currently,

    from foo import bar

has the same effect (on bar anyway) as

    import foo
    bar = foo.bar  # i.e. copying an object reference
    del foo

while under your proposal it would be more akin to changing all references to bar to become references to foo.bar.

Of course that's what the moral equivalent of "from ... import ..." does in most other languages anyway, so we might consider this for Python 3000; however it would break a considerable amount of old code, I think. (Not to mention brain and book breakage. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
I like it.
How so? It doesn't change the mapping semantics.
Cool. Again, it would also make function global variable access faster and cleaner in some ways.
however it would break a considerable amount of old code, I think.
Really? I wonder. I bet it would break a lot less old code than other recent changes.
(Not to mention brain
It makes my brain feel much better. :)
and book breakage. :-)
Hey, all of the books will have to be rewritten for Python 3000.

Jim
[me]
[Jim F]
How so? It doesn't change the mapping semantics.
My assumption is that in your version, the dictionary would contain special <object binding> objects which then would contain the referenced objects. E.g. {"limit": <binding: 100>, "doit": <binding: <function ...>>}. Thus, d["limit"] would be that <binding> object, while previously it would return 100.
Again, it would also make function global variable access faster and cleaner in some ways.
But I have other plans for that (if the optional static typing stuff ever gets implemented).
Oh? Name some changes that broke a lot of code? --Guido van Rossum (home page: http://www.python.org/~guido/)
<meta-comment> Eek, I didn't realize this thread had continued until I happened to notice Christian's post today. <waaa>I get too much email</waaa> </meta-comment>

Guido van Rossum wrote:
No. The idea is to have "association" objects. We can create these directly if we want:

    a=Association('limit',100)
    print a.key, a.value # whatever

The association value is mutable, but the key is not.

A namespace object is a collection of association objects such that no two items have the same key. Internally, this would be very much like the current dictionary except that instead of an array of dictentries, you'd have an array of association object pointers. Effectively, associations are exposed dictentries.

Externally, a namespace acts more or less like any mapping object. For example, when someone does a getitem, the namespace object will find the association with the desired key and return its value. In addition, a namespace object would provide methods along the lines of:

    associations()
        Return a sequence of the associations in the namespace

    addAssociation(assoc)
        Add the given association to the namespace. This creates another reference to the association. Changing the association's value also changes the value in the namespace.

    getAssociation(key)
        Get the association associated with the key.

A setitem on a namespace modifies an existing association if there is already an association for the given key.

For example:

    n1=namespace()
    n1['limit']=100
    n2=namespace()
    n2.addAssociation(n1.getAssociation('limit'))
    print n2['limit'] # prints 100
    n1['limit']=200
    print n2['limit'] # prints 200

When a function is compiled that refers to a global variable, we get the association from the global namespace and store it. The function doesn't need to store the global namespace itself, so we don't create a circular reference.

Note that circular references are bad even if we have a more powerful gc. For example, by not storing the global namespace in a function, we don't have to worry about the global namespace being blown away before a destructor is run during process exit.

When we use the global variable in the function, we simply get the current value from the association. We don't have to look it up.

Namespaces would have other benefits:

- improve the semantics of:

      from spam import foo

  in that you'd be importing a name binding, not a value

- Be useful in any application where it's desirable to share a name binding.
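The behaviour described above can be prototyped in a few lines of Python 3. This is only a sketch of the semantics: the method names follow the ones listed in this message, but the class bodies are an assumption made for illustration, and the namespace is backed by an ordinary dict rather than the array of association pointers described for the real implementation.

```python
class Association:
    """A name/value pair: the key is fixed, the value may be rebound."""

    __slots__ = ("_key", "value")

    def __init__(self, key, value):
        self._key = key
        self.value = value

    @property
    def key(self):
        return self._key  # read-only, so the key cannot change


class Namespace:
    """A mapping whose entries are shareable Association objects."""

    def __init__(self):
        self._assocs = {}  # key -> Association (a dict, for brevity)

    def associations(self):
        return list(self._assocs.values())

    def addAssociation(self, assoc):
        self._assocs[assoc.key] = assoc  # share it, do not copy it

    def getAssociation(self, key):
        return self._assocs[key]

    def __getitem__(self, key):
        return self._assocs[key].value

    def __setitem__(self, key, value):
        assoc = self._assocs.get(key)
        if assoc is None:
            self._assocs[key] = Association(key, value)
        else:
            assoc.value = value  # modify the existing association in place


n1 = Namespace()
n1["limit"] = 100
n2 = Namespace()
n2.addAssociation(n1.getAssociation("limit"))
first = n2["limit"]
n1["limit"] = 200
second = n2["limit"]
print(first, second)  # 100 200
```

Since `n2` holds the same `Association` object as `n1`, rebinding the name through either namespace is visible through the other, which is exactly the sharing that the rest of the thread debates.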
Well, OK, but I argue that the namespace idea is much simpler and more foolproof.
The move to class-based exceptions broke a lot of our code. Maybe we can drop this point. Do you still think that the namespace idea would break a lot of code?

Jim
I presume __setitem__() creates a new association if there isn't one. I also presume that if an association's value is NULL, it doesn't show up in keys(), values() and items() and it doesn't exist for has_key() or __getitem__(). What does a delitem do? Delete the association or set the value to NULL? I suppose the latter.
For this to work we would have to change the division of labor between the function object and the code object. The code object is immutable and contains no references to mutable objects; this means that it can easily be marshalled and unmarshalled. (Also, when a code object is compiled or unmarshalled, the globals in which its function will be defined may not exist yet.) The function object currently contains a pointer to the code object and a pointer to the dictionary with the globals. (It also contains the default arg values.) It seems that for associations to work, they need to be placed in the function object, and the code object somehow needs to reference them through the function object.

To make this concrete: if a function references globals a, b, and c, these need to be numbered, and the bytecodes should look like this:

    LOAD_GLOBAL 0    # a
    STORE_GLOBAL 1   # b
    DEL_GLOBAL 2     # c

(This could be compiled from ``b = a; del c''.)

The code object should also contain a list of global names, ordered by their ordinals, e.g. ("a", "b", "c"). Then when the function object is created, it looks in that list and creates a corresponding list of associations, e.g.:

    L = []
    for name in code.co_global_names:
        L.append(globals.getAssociation(name))

The VM then sticks a pointer to this list into the frame, whenever the function is called (instead of the globals dict which it sticks there now), and the LOAD/STORE/DEL_GLOBAL opcodes reference the associations through this list.

Some complications left as exercises:

- The built-in functions (and exceptions, etc.) should also be referenced via associations; the loop above would become a bit trickier since it needs to look in two dicts. (We're assuming that the code generator doesn't know which names are globals and which are built-ins.)

- If the association for a name doesn't yet exist, it should be created.

Note that the semantics are slightly different than currently: the decision whether a name refers to a global or to a built-in is made when the function is defined rather than each time when the name is referenced. This is a bit cleaner -- in the type-sig we're making similar assumptions but the decision is made even earlier.

But, overall the necessary changes to the implementation and to the semantics (e.g. of the 'for' statement) seem prohibitive to me.

I also think that the namespace implementation will be quite a bit less efficient than a regular dictionary: currently, a dictionary entry is a struct of 12 bytes, and the dictionary has an array of these tightly packed. Your association objects will be "real" objects, which means they have a reference count, a type pointer, a key, and a value, i.e. 16 bytes, without counting the malloc overhead; this probably comes in addition to the 12 bytes in the dict entry. (If you want to have the association objects directly in the hash table, they can't be shared between namespaces, and a namespace couldn't grow -- when a dict grows its hash table is reallocated.)
Note that circular references are bad even if we have a more powerful gc.
I don't understand or believe this statement.
If we had more powerful gc the global namespace wouldn't have to be blown away at all (it would gently dissolve when __main__ was deleted from the interpreter).
But its semantics will be harder to explain, because they will no longer be equivalent to

    import spam  # assume there's no spam already
    foo = spam.foo
    del spam

Also, we currently *explain* that only objects are shared and name bindings are unique per namespace; this would no longer be true so we would have to explain a much harder rule. ("If you got your foo through an import from another module, assigning to it will affect foo in that other module too; but if you got it through a local assignment, the effect will be local.")

All in all, I think these semantics are messy and unacceptable. True, object sharing is hard to explain too (see diagram on Learning Python page 60), but you'll still have to explain that anyway because it still exists within a namespace; but now in addition we'd have to explain that there is an exception to object sharing... Messy, messy.
- Be useful in any application where it's desireable to share a name binding.
I think it's better to explicitly share the namespace -- "foo.bar = 1" makes it clear that whoever else has a reference to foo will see bar similarly changed.
I claim that it's not foolproof at all -- on the contrary, it creates something that hides in the dark and will bite us in the behind by surprise, long after we thought we knew there were no monsters under the bed. (Yes, I've been re-reading Calvin and Hobbes. :-)
It must have been very traumatic that you're still sore over that; it was introduced in 1.5, over two years ago.
Maybe we can drop this point. Do you still think that the namespace idea would break alot of code?
Yes. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Yes.
Right.
What does a delitem do? Delete the association or set the value to NULL? I suppose the latter.
Good question. I'm inclined to think the former. That is, deleting an item from the namespace would delete the name association. I can see arguments both ways.
Looks good to me. :)
Yup.
Really? Even for Py3K?
I also think that the namespace implementation will be quite a bit less efficient than a regular dictionary:
Spacewise yes. They'd be much faster in use. This is a space/speed tradeoff.
Why not replace the key and value pointers with the association pointer. Then you'd get back a little of the space.
This was discussed at length a year or two ago. You added code to print to stderr when an error occurred in a destructor. People noticed that they were getting errors when Python exited. The problem occurred when a destructor was called after its globals had been deallocated. You subsequently added a lot of extra rules on shutdown to make this much less likely. I don't think you made the problem go away completely.

I find circular references to be bad in other ways. For example, they are a pain with deep copy. You can make deep copy do something in the presence of circular references, but the things it does can be quite surprising.
Uh, OK, then we wouldn't have to worry about the global namespace being gently dissolved before a destructor is run during process exit.
Will they really be harder to explain? Why not explain them a different way? "The statement: from spam import foo copies a name binding for foo from module spam to the current module." Eh, I guess I can see why someone would find this harder....
Good point. Perhaps assigning in the client module should break the connection to the other module. This would require some extra magic.
Well, I don't have a problem with object sharing, so the notion of sharing namespaces doesn't bother me. I understand that some folks have a problem with object sharing and I agree that they'd have problems with name sharing. OTOH, I don't think you'd consider the fact that some people have difficulty with object sharing to be sufficient justification for removing the feature from the language.
How so?
I'm not sore. But it was a bigger (IMO) backward incompatibility.

Jim
Really? Even for Py3K?
The implementation wouldn't be a problem for Py3K; I was under the impression that you thought this could be put in earlier. But the change in semantics is very hard for me to swallow. It definitely seems to become murkier.
Agreed; though the speedup comes from circumventing the dictionary altogether.
Why not replace the key and value pointers with the association pointer. Then you'd get back a little of the space.
Yes; assuming that a speedy getitem is not an issue, the hash table could become an array of pointers to associations.
Yes -- the concept of a name binding as an object in itself is hard; and it's hard to understand why it is needed.
More murkiness.
Object sharing is something you have to learn very early on; something like "objects drop under gravity". Name binding sharing is something that can effectively be skirted initially, but at some later point it bites you (sort of like mutable default arguments do); this is more comparable to discovering Einstein's relativity.
Because the tendency of tutorials will be to avoid mentioning namespaces at all until you get to the appendix at the end titled "Implementation Details."
I'm not sore. But it was a bigger (IMO) backward incompatibility.
Sometimes a bigger incompatibility that is easy to explain is more acceptable than a very subtle one that breaks code in very subtle ways. Anyway, let's drop this comparison; you can't objectively measure how backwards incompatible something is. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Jim proposed adding namespaces to Python 2000. This will be, to my understanding, a complete rewrite and redesign that is allowed to break existing code. It would even be run in parallel to Python 1.6++ for a while, right?
I do not even believe in the space ineffectiveness. The namespace concept works fine without a dictionary and hashes at all. We can implement this as a linear list of pointers to namespace objects, since they are looked up only once, usually.

But even if we would keep a dictionary-like structure as well, it is possible to implement it as an array of pointers, and you get things smaller than now, not bigger.

In your analysis, you forgot to take into account that the average dictionary slot overhead gives a factor of about two.

    1 dict slot = 3 words
    n dict entries = average 2n dict slots = 6n words

versus

    1 asso object = <ref, type, key, value> = 4 words
    n asso dict entries = average 2n words + n asso objects

This gives 6n words for the proposed solution, actually as effective as today's 6n solution with dicts. Ahem :-)

It can of course be that we also need a hash field, which can be stored in the asso object. This is another word per element, so we'd have a cost increase of 1/6.

As said, the dictionary is not necessary and could be created on demand (that is, if globals are really used like a dict). Without it, I count just n + 4n = 5n, actually a saving.

This idea bears a lot of potential to also speed up classes and instances. Further analysis is needed, please let us not drop this idea too early.

ciao - chris

--
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Düppelstr. 31                :    *Starship* http://starship.python.net
12163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home
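The word counts above can be checked mechanically. The functions below simply encode the assumptions stated in this message (3 words per dict slot, about two slots per entry, 4 words per association object, an optional fifth word for a stored hash); the function and parameter names are made up for this sketch, and the numbers are the message's estimates, not measurements.

```python
def dict_words(n, slots_per_entry=2, words_per_slot=3):
    """Words used by a classic dict holding n entries."""
    return n * slots_per_entry * words_per_slot


def association_words(n, hashed_table=True, store_hash=False):
    """Words used by n association objects plus the table of pointers to them.

    hashed_table=True  -> about 2n one-word pointer slots (dictionary-like)
    hashed_table=False -> a plain linear list of n pointers
    """
    pointer_slots = 2 * n if hashed_table else n
    words_per_assoc = 5 if store_hash else 4  # <ref, type, key, value> [+ hash]
    return pointer_slots + n * words_per_assoc


n = 1000
print(dict_words(n))                             # 6000
print(association_words(n))                      # 6000
print(association_words(n, store_hash=True))     # 7000
print(association_words(n, hashed_table=False))  # 5000
```

With these inputs the hashed association table costs the same as a dict, the stored hash adds one sixth, and dropping the hash table altogether saves one sixth, matching the figures quoted above.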
(Math about space savings gracefully accepted.)
I will gladly use it as an implementation strategy "under the hood" if it makes sense -- I think that with some code analysis (e.g. "what are the globals here") it could be made to work well. But I don't think that changing the "from M import v" semantics so that local assignment to v changes the binding of M.v as well is defensible. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Pleased to hear this. Thank you for keeping it alive.
Of course this may be the weakest point yet; while it was the reason for this line of reasoning in the first place, it is less important now. :-)

ciao - chris
Christian Tismer wrote:
(snip)
Actually, I've been talking about this in various venues for about 4 years. :) This thread is the first time I've mentioned it in the context of import.

Jim

--
Jim Fulton mailto:jim@digicool.com Python Powered!
Technical Director (888) 344-4332 http://www.python.org
Digital Creations http://www.digicool.com http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
Guido van Rossum wrote:
(snip)
I agree. However, I think that having:

    from M import v

cause a name binding that is broken by local assignment to v *is* defensible and reasonably implementable.

Changes to 'v' in M (including by reload of M) would be reflected locally unless someone did:

    v=something

locally. Local assignment would negate an import, as it does now.

Jim
[me]
Hm, but it still wouldn't have the same semantics as currently, and that's still a monster hiding under the bed until you're nearly asleep. Consider this example:

    # in M:
    verbose = 1

    # in __main__:
    from M import verbose

    # somewhere else:
    M.verbose = 0

Under the current semantics, that would have no effect on verbose in __main__; but with your semantics it would. I think that is very hard to explain; even more so if you say that assigning a different value to __main__.verbose does not change M.verbose and furthermore breaks the connection. This means that if I add

    verbose = verbose

to the __main__ code the semantics are different!

I don't understand why you wanted these semantics in the first place.

--Guido van Rossum (home page: http://www.python.org/~guido/)
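For reference, the current semantics described in the example above can be reproduced in a single script. This is a sketch only: the module M is built in memory with types.ModuleType as a stand-in for a real M.py file, which is an assumption made to keep the example self-contained.

```python
import sys
import types

# Stand-in for a file M.py containing "verbose = 1" (hypothetical setup).
M = types.ModuleType("M")
M.verbose = 1
sys.modules["M"] = M

# in __main__: bind the local name to the object M.verbose refers to now
from M import verbose

# somewhere else: rebind the name inside M
M.verbose = 0

# The local name still refers to the old object; only M's binding changed.
print(verbose, M.verbose)

# Under current semantics this self-assignment changes nothing.
verbose = verbose
print(verbose)
```

Under the proposed shared-name semantics the first print would instead show the new value twice, and the self-assignment would silently break the link, which is the asymmetry the message objects to.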
Guido van Rossum wrote:
Agreed. Think Python 3000. I think that the semantics differ in boundary cases though.
Yup.
I'm suggesting a model where "from M import x" has a different meaning than it does now. I think the notion of sharing a name is useful. I'll admit that using "M.x" achieves the same thing, although at a higher performance cost (and, OK, typing cost ;).
I don't understand why you wanted these semantics in the first place.
First, let me say that this isn't super important to me. It does solve a problem with reload, which is the context in which I brought it up.

Now, consider:

    from M import x
    .....
    use(x)

Many people would (wrongly) consider this to be equivalent to:

    import M
    .....
    use(M.x)

In fact, I'd *prefer* these to be equivalent even in the face of changes to M (e.g. reload). I'd prefer different semantics. Note that if I had:

    from M import x
    .....
    x=y
    .....
    use(x)

I'd no longer expect x to have any connection to M. Of course:

    x=x

or

    x=M.x

would be a bit more puzzling, but then they're meant to be. ;) They are addressed by a simple rule, which is that assignment in a module overrides imported name definition.

Hm...ooh ooh

A better solution would be to disallow assignments to imported names, as they are very likely to be errors. This could be detected without any fancy type inferencing. In fact, we could also decide to disallow an import-from to override an existing name binding. Ahhhhhh. :)

In any case, I'd feel comfortable explaining a system in which

    from M import x   # reference semantics wrt name

had a different meaning from:

    import M
    x=M.x             # copy semantics

since I expect an attribute access to give me a value, not a name, whereas:

    from M import x

seems more to me like it's talking about names.

Jim
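The reload problem mentioned above can be shown in a few lines. This is a hedged sketch: instead of a real file and reload(), the module's source is re-executed in the same module namespace with exec, which is an assumption standing in for what reload does to an existing module object.

```python
import sys
import types

# Stand-in for M.py; the first exec plays the role of the initial import.
M = types.ModuleType("M")
sys.modules["M"] = M
exec("x = 'old'", M.__dict__)

from M import x          # copies the current binding into this namespace

# Stand-in for reload(M): re-run (changed) source in the same module dict.
exec("x = 'new'", M.__dict__)

print(x)      # the from-imported name is stale
print(M.x)    # attribute access sees the reloaded value
```

With the current rules, only code that spells the access as M.x picks up the reloaded definition; the proposal would make the from-imported name follow along as well.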
"JF" == Jim Fulton <jim@digicool.com> writes:
JF> I'm suggesting a model where from "M import x" has a different
JF> meaning than it does now. I think the notion of sharing a name
JF> is useful. I'll admit that using "M.x" achieves the same thing,
JF> although at a higher performance cost (and, OK, typing cost ;).

This seems to contradict the 2nd Pythonic principle: Explicit is better than implicit.

I don't literally mean to argue that "The Python Way" should be used to make design decisions, but it captures exactly what makes me uncomfortable with the proposed change.

[someone else, who could have been channeling me, wrote:]
I don't understand why you wanted these semantics in the first place.
JF> First, let me say that this isn't super important to me.

Glad to hear it <0.3 wink>!

JF> It does solve a problem with reload, which is the context in which
JF> I brought it up.

I don't think the reload problem is important enough to justify a change to name binding rules.

[much omitted]

JF> In any case, I'd feel comfortable explaining a system in which
JF>     from M import x   # reference semantics wrt name
JF> had a different meaning from:
JF>     import M
JF>     x=M.x             # copy semantics
JF> since I expect an attribute access to give me a value, not a
JF> name, whereas:
JF>     from M import x
JF> seems more to me like it's talking about names.

I think the proposed change muddies the semantics of assignment, and I would not feel comfortable trying to explain it. I don't have the same impression vis a vis import and names; I think this is why I disagree and have heretofore been puzzled about why you want this. Assignment binds a name to an object; import is just a variant of assignment. There is no need for a special case.

One other worry: How would it create a copy of an object in general? How do you copy a class object or a file or a socket? Since (1) you can't restrict which types of objects are exported by a module and (2) there is no clear definition of copy that applies to any type of object, I don't see how these semantics could be defined.

Jeremy
[Jeremy Hylton, on JimF's association objects wrt "from M import x" semantics]
Go ahead & argue it: they were *intended* to be used to guide design decisions! They were my best shot at summarizing what I've learned in a decade of (mostly successful) Guido-channeling. But note that I only listed 19 of the 20 Pythonic Theses: the 20th was left blank, for Guido to fill in however he likes whenever the other 19 suggest a direction he dislikes <wink>. Other relevant theses here include the ones about whether the implementation is, or isn't, easy to explain. I'm suffering an email backlog and haven't yet studied the latest batch on this topic, but a quick skim sure suggests that a concrete implementation isn't self-evident, and its implications perhaps downright subtle regardless. not-a-conclusion-just-a-concern-ly y'rs - tim
[Jim, Chris & Guido discussing a namespace idea]

Guys, I'm lost. Please help me understand this idea from the start.

After rereading this whole thread, I have only a vague intuition of what Jim has proposed, but I fail to understand it; and believe me, I'm very interested in being in sync with you on the subject.

Please filter the concept from the consequences and resubmit it once again (in English, through examples, ascii art, whatever). Thanks.

--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252
Howdy, lemme try... Vladimir Marangozov wrote:
Naming it differently than before, I think this formulation hits the nail on the head: Jim proposes a construction that yields early binding of names, but late binding of values.

Sample layout of an association object:

    <type, refcnt, key, value>

with the same key semantics as for dicts. The key object is assigned when the association object is created, that is, when the name is seen for the first time. The value is still NULL until assigned.

Now assume a namespace object as a collection of pointers to asso objects, and assume that it is only extended, nothing deleted. Then, a code object can refer to such a namespace object by giving the index of the asso object. Since the construction of these objects will occur in the same order after marshal/pickling, the offsets in a code object will always be correct.

This means: There is a construction that allows settling a name as soon as it is seen, without necessarily assigning a value. When a function at compile time sees a global, it tries to resolve it by finding an association object in the module's global scope that contains this name. If not found, it is created.
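The construction described above can be modelled in pure Python. This is a minimal sketch under stated assumptions, not Jim's or Chris's implementation: the class names Association and Namespace, the _UNSET sentinel standing in for NULL, and the method names are all invented for illustration, and Python itself supplies the type and refcount fields.

```python
class Association:
    """Hypothetical <key, value> box; value starts out unassigned."""
    _UNSET = object()          # plays the role of a NULL value

    def __init__(self, key):
        self.key = key
        self.value = Association._UNSET


class Namespace:
    """Append-only collection of associations, so indices stay valid."""

    def __init__(self):
        self._assocs = []      # position in this list is the stable index
        self._index = {}       # key -> position, used only to find a name

    def slot_for(self, key):
        """Return the index for key, creating an empty association if new."""
        if key not in self._index:
            self._index[key] = len(self._assocs)
            self._assocs.append(Association(key))
        return self._index[key]

    def store(self, index, value):
        self._assocs[index].value = value

    def load(self, index):
        assoc = self._assocs[index]
        if assoc.value is Association._UNSET:
            raise NameError(assoc.key)
        return assoc.value


module_globals = Namespace()
spam_index = module_globals.slot_for("spam")   # "compile time": name is seen
try:
    module_globals.load(spam_index)            # no value assigned yet
except NameError as exc:
    print("NameError:", exc)
module_globals.store(spam_index, 42)           # "run time": value is bound
print(module_globals.load(spam_index))
```

The point of the sketch is that the name is settled (it has an index) before any value exists, and that later lookups go through the index rather than through a hash lookup.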
[Chris comes to my rescue on Jim's namespace idea] Christian Tismer wrote:
Ahaa. Got it. Thank you Chris! So naming is the same. Binding and name resolution are different. This is certainly a valuable idea in some foreseeable situations (like the globals pre-binding for a code object you're describing -- sort of a cache/array for globals, with initially invalidated entries).

But the problem is that this indirection has so much power in it, that generalizing it to all namespaces seems to hide all kinds of surprises. I'm not in a position even to figure out what the implications could be (it smells "out of bounds"), but it certainly needs more digging. I suspect that if it turns out that these intermediate contexts cannot be generalized, their implementation may be compromised for the few identified cases where they are expected to be useful.
hoping it was clear enough - ciao - chris
Yes, but embracing it all is still a "so-so"...
[Jim Fulton]
Jim, I've been intrigued by this idea for all the years you've been suggesting it <wink>, but I've never understood what it is you're proposing! This is the Python-Dev list, so feel encouraged to present it in concrete implementation terms instead of ambiguous English. Or maybe an interface?

    interface a_kind_of_dictionary_that_is_a_collection_of_\
              association_objects:
        # ??? beats me ...

Or maybe as a C struct? For example, is "an association object" a (char*, PyObject*) pair?

Does this kind of dictionary have keys? If so, of what type? What type are the values? Best I can make sense of the above, the values are "association objects", each of which contains a name and a value, and a key is maybe a duplicate of the name in the association object to which it maps. "A name" may or may not be a string -- I can't tell. Or maybe by "dictionary" you didn't intend Python's current meaning for that word at all. I assume "a value" is a PyObject*. The whole thrust *appears* to be to get names to map to a PyObject** instead of PyObject*, but if that's the ticket I don't know what association objects have to do with it.
Where does the idea that 'spam' is a *module* here come from? It doesn't make sense to me, and I'm so lost I'll spare everyone my further confusions <wink>. suspecting-the-last-actually-doesn't-make-any-sense<wink>-ly y'rs - tim
Tim Peters wrote:
[Jim Fulton]
[association objects]
My guess is: An association object adds another level of indirection to namespaces and makes global variables be more like true variables, i.e. changing them in one place changes them everywhere.
I don't believe that the actual implementation matters too much, and it is still open to be chosen. Here is my approach:

Let an association object be a pair of a key and a value. The restrictions for keys may be the same as for dict keys. We can now either use dicts as they are, inserting asso-objects as values and sharing the key field, or invent new dictionaries which have no key/value pairs at all, but just references to asso-objects.

In either case, we have the advantage that further references by global use from a function or by imports will always add to the asso-object, not to its value. This keeps the value changeable, like a list with one element, a kind of boxed object. Since the asso-objects stay alive as long as they are referenced, they are never moved, and it is ok to refer to their address.

For a function, this means that it can resolve a global at compile time. If the asso-object exists already, it has a fixed memory address and can be placed into the code object. If it does not exist, it can be created in the global dictionary or special asso-dictionary, whatever we'll use. The value will be NULL in this case, and this is perfect. If we do it right, a value will have been inserted before the function is called, or we will raise a name error.

The idea is simply to generate fixed slots for global names which never move. By mentioning the name, we create such a slot. The slot is alive as long as it is seen, i.e. refcount > 0. There must be a general way to look these things up, either by the per-module dictionary, or by a specialized one. In the end I'd tend to do the latter, since those uninitialized key/value asso-objects would make it ambiguous what dict.keys() should return.

For consistency, I would hide all asso-objects in a special asso-collection per module. They could be placed into the module's dict when their value is assigned for the first time. Alternatively, they are not created at compile time but at runtime, when a value is assigned. I'm not sure yet.
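The "boxed object" behaviour described above, where functions and importers hold the asso-object rather than its value, can be sketched as follows. Everything here is hypothetical: the names Assoc, AssocModule, make_reader and the UNSET sentinel are invented for the sketch, and a Python closure stands in for a code object that captured the slot's address at compile time.

```python
UNSET = object()   # stands in for a NULL value


class Assoc:
    """Hypothetical boxed key/value pair that never moves once created."""

    def __init__(self, key):
        self.key = key
        self.value = UNSET


class AssocModule:
    """Hypothetical per-module asso-collection: name -> Assoc."""

    def __init__(self):
        self.cells = {}

    def cell(self, name):
        """Return the slot for name; mentioning a name creates its slot."""
        if name not in self.cells:
            self.cells[name] = Assoc(name)
        return self.cells[name]


def make_reader(cell):
    """Model a function that resolved a global to its slot ahead of time."""
    def read():
        if cell.value is UNSET:
            raise NameError(cell.key)
        return cell.value
    return read


m = AssocModule()
read_verbose = make_reader(m.cell("verbose"))   # slot exists, value does not

main = AssocModule()
main.cells["verbose"] = m.cell("verbose")       # an import shares the slot

m.cell("verbose").value = 1
m.cell("verbose").value = 0                     # rebinding in m ...
print(read_verbose(), main.cell("verbose").value)   # ... is seen everywhere
```

Because both the reader and the importing module hold the same Assoc, a rebinding in one place is visible in all of them, which is exactly the semantic change debated earlier in the thread.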
Now, moving on from globals to all name spaces: If they are all handled by the asso-approach, can we use it to speed up attribute access for classes and instances? I guess we can! But I need more thought.
    def swap_words(text, one, two):
        # Split on `one`, turn every `two` into `one` inside each piece,
        # then rejoin the pieces with `two`, which swaps the two words.
        pieces = text.split(one)
        for i in range(len(pieces)):
            pieces[i] = pieces[i].replace(two, one)
        return two.join(pieces)

    sentence = swap_words(sentence, "'foo'", "'spam'")

ciao - chris
Christian Tismer wrote:
Ugh. Sorry to make you guess....
Right.
Right. Replace dict entries with association object pointers.
In either case, we have the advantage that further references by global use from a function or by imports will always add
to the refcount of
Yup.
exactly. You are a great guesser! :)
Yes, it needs more thought.
Ooh ooh, you've invented a 'Jim translator bot'!

Jim
Tim Peters wrote:
I just responded to Guido in a bit more detail. Hopefully, this will be of sufficient clarity. If not, then I'll be happy to work up a Python demonstration.
Does this kind of dictionary have keys?
Yes.
If so, of what type?
Whatever you want. Just like a dictionary.
What type are the values?
ditto.
Sorry. See my reply to Guido and let me know if I'm still being too vague.
participants (12)
- A.M. Kuchling
- Barry A. Warsaw
- Christian Tismer
- Greg Stein
- Guido van Rossum
- Guido van Rossum
- Jeremy Hylton
- Jim Fulton
- Paul Prescod
- skaller
- Tim Peters
- Vladimir Marangozov