"GvR" == Guido van Rossum <guido@python.org> writes:
GvR> - Do you really think that JimF would do away with ExtensionClasses if __findattr__ was introduced? I kinda doubt it. See [*footnote]. It seems that *using* __findattr__ is expensive (even if *not* using is cheap :-).
That's not even the real reason why JimF wouldn't stop using ExtensionClass. He's already got too much code invested in EC. However EC can be a big pill to swallow for some applications because it's a C extension (and because it has some surprising non-Pythonic side effects). In those situations, a pure Python approach, even though slower, is useful.
Agreed. But I'm still hoping to find the silver bullet that lets Jim (and everybody else) do what ExtensionClass does without needing another extension.
GvR> - Why is deletion not supported? What if you want to enforce a policy on deletions too?
It could be, without much work.
Then it should be -- except I prefer to do only getattr anyway, see below.
GvR> - It's ugly to use the same call for get and set. The examples indicate that it's not such a great idea: every example has *two* tests whether it's get or set. To share a policy, the proper thing to do is to write a method that either get or set can use.
I don't have strong feelings either way.
What does Jython do? I thought it only did set (hence the name :-). I think there's no *need* for findattr to catch the setattr operation, because __setattr__ *already* gets invoked on each set, not just on ones where the attr doesn't yet exist.
GvR> - I think it would be sufficient to *only* use __findattr__ for getattr -- __setattr__ and __delattr__ already have full control. The "one routine to implement the policy" argument doesn't really hold, I think.
What about the ability to use "normal" x.name attribute access syntax inside the hook? Let me guess your answer. :)
Aha! You got me there. Clearly the REAL reason for wanting __findattr__ is the no-recursive-calls rule -- which is also the most uncooked feature... Traditional getattr hooks don't need this as much because they don't get called when the attribute already exists; traditional setattr hooks deal with it by switching on the attribute name. The no-recursive-calls rule certainly SEEMS an attractive way around this. But I'm not sure that it really is... I need to get my head around this more. (The only reason I'm still posting this reply is to test the new mailing lists setup via mail.python.org.)
GvR> - The PEP says that the "in-findattr" flag is set on the instance. We've already determined that this is not thread-safe. This is not just a bug in the implementation -- it's a bug in the specification. I also find it ugly. But if we decide to do this, it can go in the thread-state -- if we ever add coroutines, we have to decide on what stuff to move from the thread state to the coroutine state anyway.
Right. That's where we've ended up in subsequent messages on this thread.
GvR> - It's also easy to conceive situations where recursive __findattr__ calls on the same instance in the same thread/coroutine are perfectly desirable -- e.g. when __findattr__ ends up calling a method that uses a lot of internal machinery of the class. You don't want all the machinery to have to be aware of the fact that it may be called with __findattr__ on the stack and without it.
Hmm, okay, I don't really understand your example. I suppose I'm envisioning __findattr__ as a way to provide an interface to clients of the class. Maybe it's a bean interface, maybe it's an acquisition interface or an access control interface. The internal machinery has to know something about how that interface is implemented, so whether __findattr__ is recursive or not doesn't seem to enter into it.
But the class is also a client of itself, and not all cases where it is a client of itself are inside a findattr call. Take your bean example. Suppose your bean class also has a spam() method. The findattr code needs to account for this, e.g.:

    def __findattr__(self, name, *args):
        if name == "spam" and not args:
            return self.spam
        ...original body here...

Or you have to add a _get_spam() method:

    def _get_spam(self):
        return self.spam

Either solution gets tedious if there are a lot of methods; instead, findattr could check if the attr is defined on the class, and then return that:

    def __findattr__(self, name, *args):
        if not args and name[0] != '_' and hasattr(self.__class__, name):
            return getattr(self, name)
        ...original body here...

Anyway, let's go back to the spam method. Suppose it references self.foo. The findattr machinery will access it. Fine. But now consider another attribute (bar) with _set_bar() and _get_bar() methods that do a little more. Maybe bar is really calculated from the value of self.foo. Then _get_bar cannot use self.foo (because it's inside findattr so findattr won't resolve it, and self.foo doesn't actually exist on the instance) so it has to use self.__myfoo. Fine -- after all this is inside a _get_* handler, which knows it's being called from findattr. But what if, instead of needing self.foo, _get_bar wants to call self.spam() in order? Then self.spam() is being called from inside findattr, so when it accesses self.foo, findattr isn't used -- and it fails with an AttributeError!

Sorry for the long detour, but *that's* the problem I was referring to. I think the scenario is quite realistic.
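The failure described here can be reproduced without the proposed patch. What follows is only a rough emulation in modern Python 3, assuming the no-recursive-calls rule behaves like a per-instance flag: `__getattribute__` stands in for `__findattr__`, and the names `_in_hook` and `_myfoo` are invented for the sketch. It is not the PEP's implementation.

```python
class Bean:
    def __init__(self, x):
        # Store real state directly, bypassing the hook.
        object.__setattr__(self, "_myfoo", x)
        object.__setattr__(self, "_in_hook", False)

    def __getattribute__(self, name):
        plain = object.__getattribute__
        # Private names, names defined on the class (methods), and any lookup
        # made while the hook is already running all use normal lookup.
        if (name.startswith("_") or plain(self, "_in_hook")
                or hasattr(type(self), name)):
            return plain(self, name)
        object.__setattr__(self, "_in_hook", True)
        try:
            return plain(self, "_get_" + name)()
        finally:
            object.__setattr__(self, "_in_hook", False)

    def spam(self):
        return self.foo * 2        # relies on the hook to resolve foo

    def _get_foo(self):
        return self._myfoo

    def _get_bar(self):
        return self.spam() + 1     # calls spam() while the hook is active


b = Bean(3)
print(b.foo)       # 3: the hook dispatches to _get_foo
print(b.spam())    # 6: spam() runs outside the hook, so self.foo resolves
try:
    b.bar
except AttributeError:
    print("bar failed: spam() could not see self.foo from inside the hook")
```

The same spam() method works or fails depending on whether the hook is on the stack, which is the constraint on internal machinery discussed above.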
And also, allowing __findattr__ to be recursive will just impose different constraints on the internal machinery methods, just like __setattr__ currently does. I.e. you better know that you're in __setattr__ and not do self.name type things, or you'll recurse forever.
Actually, this is usually solved by having __setattr__ check for specific names only, and for others do self.__dict__[name] = value; that way, recursive __setattr__ calls are okay. Similar for __getattr__ (which has to raise AttributeError for unrecognized names).
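A minimal sketch of that convention, in modern Python 3 and with a made-up `Account` class: only the name `dollars` is special, and everything else is written straight into `self.__dict__`, so the recursive `__setattr__` call terminates after one level.

```python
class Account:
    def __init__(self, cents):
        self.cents = cents                 # ordinary name, stored in __dict__

    def __setattr__(self, name, value):
        if name == "dollars":
            # Re-enters __setattr__ with name == "cents", which takes the
            # else branch, so the recursion stops immediately.
            self.cents = value * 100
        else:
            self.__dict__[name] = value

    def __getattr__(self, name):
        # Called only when normal lookup fails.
        if name == "dollars":
            return self.cents // 100
        raise AttributeError(name)         # required for unrecognized names


a = Account(250)
print(a.dollars)    # 2
a.dollars = 7
print(a.cents)      # 700
```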
GvR> So perhaps it may be better to only treat the body of __findattr__ itself special, as Moshe suggested.
Maybe I'm being dense, but I'm not sure exactly what this means, or how you would do this.
Read Moshe's messages (and Martin's replies) again. I don't care that much for it so I won't explain it again.
GvR> What does Jython do here?
It's not exactly equivalent, because Jython's __findattr__ can't call back into Python.
I'd say that Jython's __findattr__ is an entirely different beast than what we have here. Its main purpose in life appears to be to be a getattr equivalent that returns NULL instead of raising an exception when the attribute isn't found -- which is reasonable because from within Java, testing for null is much cheaper than checking for an exception, and you often need to look whether a given attribute exists and do some default action if not. (In fact, I'd say that CPython could also use a findattr of this kind...)

This is really too bad. Based on the name similarity and things I thought you'd said in private before, I thought that they would be similar. Then the experience with Jython would be a good argument for adding a findattr hook to CPython. But now that they are totally different beasts it doesn't help at all.
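At the Python level, the closest everyday analogue of a lookup that returns null rather than raising is the three-argument form of the built-in getattr(). The sketch below shows the "look it up, else do a default action" pattern; the `Plugin` class and `start()` function are hypothetical, and it says nothing about how Jython implements its own __findattr__.

```python
class Plugin:
    def on_start(self):
        return "plugin started"


def start(obj):
    # A missing attribute yields None here instead of an AttributeError.
    handler = getattr(obj, "on_start", None)
    if handler is None:
        return "default start"     # the default action
    return handler()


print(start(Plugin()))    # plugin started
print(start(object()))    # default start
```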
GvR> - The code examples require a *lot* of effort to understand. These are complicated issues! (I rewrote the Bean example using __getattr__ and __setattr__ and found no need for __findattr__; the __getattr__ version is simpler and easier to understand. I'm still studying the other __findattr__ examples.)
Is it simpler because you separated out the set and get behavior? If __findattr__ only did getting, I think it would be a lot simpler too (but I'd still be interested in seeing your __getattr__ only example).
Here's my getattr example. It's more lines of code, but cleaner IMHO:

    class Bean:
        def __init__(self, x):
            self.__myfoo = x

        def __isprivate(self, name):
            return name.startswith('_')

        def __getattr__(self, name):
            if self.__isprivate(name):
                raise AttributeError, name
            return getattr(self, "_get_" + name)()

        def __setattr__(self, name, value):
            if self.__isprivate(name):
                self.__dict__[name] = value
            else:
                return getattr(self, "_set_" + name)(value)

        def _set_foo(self, x):
            self.__myfoo = x

        def _get_foo(self):
            return self.__myfoo

    b = Bean(3)
    print b.foo
    b.foo = 9
    print b.foo
The acquisition examples are complicated because I wanted to support the same interface that EC's acquisition classes support. All that detail isn't necessary for example code.
I *still* have to study the examples... :-( Will do next.
GvR> - The PEP really isn't that long, except for the code examples. I recommend reading the patch first -- the patch is probably shorter than any specification of the feature can be.
Would it be more helpful to remove the examples? If so, where would you put them? It's certainly useful to have examples someplace I think.
No, my point is that the examples need more explanation. Right now the EC example is over 200 lines of brain-exploding code! :-)
GvR> There's an easy way (that few people seem to know) to cause __getattr__ to be called for virtually all attribute accesses: put *all* (user-visible) attributes in a separate dictionary. If you want to prevent access to this dictionary too (for Zope security enforcement), make it a global indexed by id() -- a destructor (__del__) can take care of deleting entries here.
Presumably that'd be a module global, right? Maybe within Zope that could be protected,
Yes.
but outside of that, that global's always going to be accessible. So are methods, even if given private names.
Aha! Another thing that I expect has been on your agenda for a long time, but which isn't explicit in the PEP (AFAICT): findattr gives *total* control over attribute access, unlike __getattr__ and __setattr__ and private name mangling, which can all be defeated. And this may be one of the things that Jim is after with ExtensionClasses in Zope. Although I believe that in DTML, he doesn't trust this: he uses source-level (or bytecode-level) transformations to turn all X.Y operations into a call into a security manager. So I'm not sure that the argument is very strong.
And I don't think that such code would be any more readable since instead of self.name you'd see stuff like
    def __getattr__(self, name):
        global instdict
        mydict = instdict[id(self)]
        obj = mydict[name]
        ...

    def __setattr__(self, name, val):
        global instdict
        mydict = instdict[id(self)]
        mydict[name] = val
        ...
and that /might/ be a problem with Jython currently, because id()'s may be reused. And relying on __del__ may have unfortunate side effects when viewed in conjunction with garbage collection.
Fair enough. I withdraw the suggestion, and propose restricted execution instead. There, you can use Bastions -- which have problems of their own, but you do get total control.
You're probably still unconvinced <wink>, but are you dead-set against it? I can try implementing __findattr__() as a pre-__getattr__ hook only. Then we can live with the current __setattr__() restrictions and see what the examples look like in that situation.
I am dead-set against introducing a feature that I don't fully understand. Let's continue this discussion.

--Guido van Rossum (home page: http://www.python.org/~guido/)