Mailman 3 PEP 231, __findattr__() - Python-Dev

PEP 231, __findattr__()

Martin v. Loewis

Dec. 3, 2000

9:56 p.m.

Who is "we" here? The Python code implementing __findattr__? How would it pass a value to __setattr__? It doesn't call __setattr__, instead it has "self.__myfoo = x"...

I agree that the current implementation is not thread-safe. To solve that, you'd need to associate with each instance not a single "infindattr" attribute, but a whole set of them - one per "thread of execution" (which would be a thread-id in most threading systems). Of course, that would need some cooperation from any thread scheme (including uthreads), which would need to provide an identification for a "calling context".

Regards,
Martin

Christian Tismer

December 2000

9:38 p.m.

"Martin v. Loewis" wrote:

...

Ouch - right! Sorry :)

...

Right, that is one possible way to do it. I also thought about some alternatives, but they all sound too complicated to justify them. Also I don't think this is only thread-related, since mess can happen even with an explicit coroutine jmp. Furthermore, how to deal with multiple attribute names? The function works wrong if __findattr__ tries to inspect another attribute.

IMO, the state of the current interpreter changes here (or should do so), and this changed state needs to be carried down with all subsequent function calls.

confused - ly chris

--
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com

barry＠digicool.com

3:13 p.m.

...

...
...
...
...
"MvL" == Martin v Loewis <martin@loewis.home.cs.tu-berlin.de> writes:

MvL> I agree that the current implementation is not
MvL> thread-safe. To solve that, you'd need to associate with each
MvL> instance not a single "infindattr" attribute, but a whole set
MvL> of them - one per "thread of execution" (which would be a
MvL> thread-id in most threading systems). Of course, that would
MvL> need some cooperation from the any thread scheme (including
MvL> uthreads), which would need to provide an identification for
MvL> a "calling context".

I'm still catching up on several hundred emails over the weekend. I had a sneaking suspicion that infindattr wasn't thread-safe, so I'm convinced this is a bug in the implementation.

One approach might be to store the info in the thread state object (isn't that how the recursive repr stop flag is stored?) That would also save having to allocate an extra int for every instance (yuck) but might impose a bit more of a performance overhead.

I'll work more on this later today.

-Barry

Martin v. Loewis

11:10 p.m.

...

Whether this works depends on how exactly the info is stored. A single flag won't be sufficient, since multiple objects may have __findattr__ in progress in a given thread. With a set of instances, it would work, though. Regards, Martin

Moshe Zadka

3:31 a.m.

...

I don't think this is a good idea -- continuations and coroutines might mess it up. Maybe the right thing is to mess with the *compilation* of __findattr__ so that it would call __setattr__ and __getattr__ with special flags that stop them from calling __findattr__? This is ugly, but I can't think of a better way.

--
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus.
Please stop the spread of signature viruses!

Christian Tismer

6:35 p.m.

Moshe Zadka wrote:

...

Yeah, this is what I tried to say by "different machine state"; compiling different behavior in the case of a special method is an interesting idea. It is limited somewhat, since the changed system state is not inherited to called functions. But if __findattr__ performs its one, single task in its body alone, we are fine.

still-thinking-of-alternatives - chris

barry＠digicool.com

9:23 p.m.

...

...
...
...
...
"CT" == Christian Tismer <tismer@tismer.com> writes:

CT> You want most probably do this: __findattr__ should not be
CT> invoked again for this instance, with this attribute name, for
CT> this "thread", until you are done.

First, I think the rule should be "__findattr__ should not be invoked again for this instance, in this thread, until you are done". I.e. once in __findattr__, you want all subsequent attribute references to bypass findattr, because presumably, your instance now has complete control for all accesses in this thread. You don't want to limit it to just the currently named attribute.

Second, if "this thread" is defined as _PyThreadState_Current, then we have a simple solution, as I mapped out earlier. We do a PyThreadState_GetDict() and store the instance in that dict on entry to __findattr__ and remove it on exit from __findattr__. If the instance can be found in the current thread's dict, we bypass __findattr__.

...

...
...
...
...
"MZ" == Moshe Zadka <moshez@zadka.site.co.il> writes:

MZ> I don't think this is a good idea -- continuations and
MZ> coroutines might mess it up.

You might be right, but I'm not sure.

If we make __findattr__ thread safe according to the definition above, and if uthread/coroutine/continuation safety can be accomplished by the __findattr__ programmer's discipline, then I think that is enough. IOW, if we can tell the __findattr__ author to not relinquish the uthread explicitly during the __findattr__ call, we're cool. Oh, and as long as we're not somehow substantially reducing the utility of __findattr__ by making that restriction.

What I worry about is re-entrancy that isn't under the programmer's control, like the Real Thread-safety problem.

-Barry

Christian Tismer

9:35 p.m.

"Barry A. Warsaw" wrote:

...

Maybe this is better. Surely easier. :) [ThreadState solution - well fine so far]

...

Hmm. What do you think about Moshe's idea to change compiling of the method? It has the nice advantage that there are no Thread-safety problems by design. The only drawback is that the contract of not-calling-myself only holds for this function.

I don't know how the thread state scales up when there are more things like these invented.

Well, for the moment, the simple solution with Stackless would just be to let the interpreter recurse in this call, the same as it happens during __init__ and anything else that isn't easily turned into tail-recursion. It just blocks :-)

ciao - chris

barry＠digicool.com

10:58 p.m.

...

...
...
...
...
"CT" == Christian Tismer <tismer@tismer.com> writes:

CT> Hmm. What do you think about Moshe's idea to change compiling
CT> of the method? It has the nice advantage that there are no
CT> Thread-safety problems by design. The only drawback is that
CT> the contract of not-calling-myself only holds for this
CT> function.

I'm not sure I understand what Moshe was proposing. Moshe: are you saying that we should change the way the compiler works, so that it somehow recognizes this special case? I'm not sure I like that approach. I think I want something more runtime-y, but I'm not sure why (maybe just because I'm more comfortable mucking about in the run-time than in the compiler).

-Barry

Martin v. Loewis

11:19 p.m.

...

I guess you are also uncomfortable with the problem that the compile-time analysis cannot "see" through levels of indirection. E.g. if findattr has

    return self.compute_attribute(real_attribute)

then compile-time analysis could figure out to call compute_attribute directly. However, that method may be implemented as

    def compute_attribute(self, name):
        return self.mapping[name]

where the access to mapping could not be detected statically.

Regards,
Martin

Guido van Rossum

11:16 p.m.

I'm unconvinced by the __findattr__ proposal as it now stands.

- Do you really think that JimF would do away with ExtensionClasses if __findattr__ was introduced? I kinda doubt it. See [*footnote]. It seems that *using* __findattr__ is expensive (even if *not* using is cheap :-).

- Why is deletion not supported? What if you want to enforce a policy on deletions too?

- It's ugly to use the same call for get and set. The examples indicate that it's not such a great idea: every example has *two* tests whether it's get or set. To share a policy, the proper thing to do is to write a method that either get or set can use.

- I think it would be sufficient to *only* use __findattr__ for getattr -- __setattr__ and __delattr__ already have full control. The "one routine to implement the policy" argument doesn't really hold, I think.

- The PEP says that the "in-findattr" flag is set on the instance. We've already determined that this is not thread-safe. This is not just a bug in the implementation -- it's a bug in the specification. I also find it ugly. But if we decide to do this, it can go in the thread-state -- if we ever add coroutines, we have to decide on what stuff to move from the thread state to the coroutine state anyway.

- It's also easy to conceive situations where recursive __findattr__ calls on the same instance in the same thread/coroutine are perfectly desirable -- e.g. when __findattr__ ends up calling a method that uses a lot of internal machinery of the class. You don't want all the machinery to have to be aware of the fact that it may be called with __findattr__ on the stack and without it. So perhaps it may be better to only treat the body of __findattr__ itself special, as Moshe suggested. What does Jython do here?

- The code examples require a *lot* of effort to understand. These are complicated issues! (I rewrote the Bean example using __getattr__ and __setattr__ and found no need for __findattr__; the __getattr__ version is simpler and easier to understand. I'm still studying the other __findattr__ examples.)

- The PEP really isn't that long, except for the code examples. I recommend reading the patch first -- the patch is probably shorter than any specification of the feature can be.

--Guido van Rossum (home page: http://www.python.org/~guido/)

[*footnote] There's an easy way (that few people seem to know) to cause __getattr__ to be called for virtually all attribute accesses: put *all* (user-visible) attributes in a separate dictionary. If you want to prevent access to this dictionary too (for Zope security enforcement), make it a global indexed by id() -- a destructor (__del__) can take care of deleting entries here.

barry＠digicool.com

2:54 a.m.

...

...
...
...
...
"GvR" == Guido van Rossum <guido@python.org> writes:

GvR> - Do you really think that JimF would do away with
GvR> ExtensionClasses if __findattr__ was introduced? I kinda
GvR> doubt it. See [*footnote]. It seems that *using*
GvR> __findattr__ is expensive (even if *not* using is cheap :-).

That's not even the real reason why JimF wouldn't stop using ExtensionClass. He's already got too much code invested in EC. However EC can be a big pill to swallow for some applications because it's a C extension (and because it has some surprising non-Pythonic side effects). In those situations, a pure Python approach, even though slower, is useful.

GvR> - Why is deletion not supported? What if you want to enforce
GvR> a policy on deletions too?

It could be, without much work.

GvR> - It's ugly to use the same call for get and set. The
GvR> examples indicate that it's not such a great idea: every
GvR> example has *two* tests whether it's get or set. To share a
GvR> policy, the proper thing to do is to write a method that
GvR> either get or set can use.

I don't have strong feelings either way.

GvR> - I think it would be sufficient to *only* use __findattr__
GvR> for getattr -- __setattr__ and __delattr__ already have full
GvR> control. The "one routine to implement the policy" argument
GvR> doesn't really hold, I think.

What about the ability to use "normal" x.name attribute access syntax inside the hook? Let me guess your answer. :)

GvR> - The PEP says that the "in-findattr" flag is set on the
GvR> instance. We've already determined that this is not
GvR> thread-safe. This is not just a bug in the implementation --
GvR> it's a bug in the specification. I also find it ugly. But
GvR> if we decide to do this, it can go in the thread-state -- if
GvR> we ever add coroutines, we have to decide on what stuff to
GvR> move from the thread state to the coroutine state anyway.

Right. That's where we've ended up in subsequent messages on this thread.

GvR> - It's also easy to conceive situations where recursive
GvR> __findattr__ calls on the same instance in the same
GvR> thread/coroutine are perfectly desirable -- e.g. when
GvR> __findattr__ ends up calling a method that uses a lot of
GvR> internal machinery of the class. You don't want all the
GvR> machinery to have to be aware of the fact that it may be
GvR> called with __findattr__ on the stack and without it.

Hmm, okay, I don't really understand your example. I suppose I'm envisioning __findattr__ as a way to provide an interface to clients of the class. Maybe it's a bean interface, maybe it's an acquisition interface or an access control interface. The internal machinery has to know something about how that interface is implemented, so whether __findattr__ is recursive or not doesn't seem to enter into it.

And also, allowing __findattr__ to be recursive will just impose different constraints on the internal machinery methods, just like __setattr__ currently does. I.e. you better know that you're in __setattr__ and not do self.name type things, or you'll recurse forever.

GvR> So perhaps it may be better to only treat the body of
GvR> __findattr__ itself special, as Moshe suggested.

Maybe I'm being dense, but I'm not sure exactly what this means, or how you would do this.

GvR> What does Jython do here?

It's not exactly equivalent, because Jython's __findattr__ can't call back into Python.

GvR> - The code examples require a *lot* of effort to understand.
GvR> These are complicated issues! (I rewrote the Bean example
GvR> using __getattr__ and __setattr__ and found no need for
GvR> __findattr__; the __getattr__ version is simpler and easier
GvR> to understand. I'm still studying the other __findattr__
GvR> examples.)

Is it simpler because you separated out the set and get behavior? If __findattr__ only did getting, I think it would be a lot simpler too (but I'd still be interested in seeing your __getattr__ only example).

The acquisition examples are complicated because I wanted to support the same interface that EC's acquisition classes support. All that detail isn't necessary for example code.

GvR> - The PEP really isn't that long, except for the code
GvR> examples. I recommend reading the patch first -- the patch
GvR> is probably shorter than any specification of the feature can
GvR> be.

Would it be more helpful to remove the examples? If so, where would you put them? It's certainly useful to have examples someplace I think.

GvR> There's an easy way (that few people seem to know) to cause
GvR> __getattr__ to be called for virtually all attribute
GvR> accesses: put *all* (user-visible) attributes in a separate
GvR> dictionary. If you want to prevent access to this dictionary
GvR> too (for Zope security enforcement), make it a global indexed
GvR> by id() -- a destructor(__del__) can take care of deleting
GvR> entries here.

Presumably that'd be a module global, right? Maybe within Zope that could be protected, but outside of that, that global's always going to be accessible. So are methods, even if given private names. And I don't think that such code would be any more readable since instead of self.name you'd see stuff like

    def __getattr__(self, name):
        global instdict
        mydict = instdict[id(self)]
        obj = mydict[name]
        ...

    def __setattr__(self, name, val):
        global instdict
        mydict = instdict[id(self)]
        mydict[name] = val
        ...

and that /might/ be a problem with Jython currently, because id()'s may be reused. And relying on __del__ may have unfortunate side effects when viewed in conjunction with garbage collection.

You're probably still unconvinced <wink>, but are you dead-set against it? I can try implementing __findattr__() as a pre-__getattr__ hook only. Then we can live with the current __setattr__() restrictions and see what the examples look like in that situation.

-Barry

Guido van Rossum

12:54 p.m.

...

...
...
...
...
...
"GvR" == Guido van Rossum <guido@python.org> writes:

GvR> - Do you really think that JimF would do away with
GvR> ExtensionClasses if __findattr__ was introduced? I kinda
GvR> doubt it. See [*footnote]. It seems that *using*
GvR> __findattr__ is expensive (even if *not* using is cheap :-).

That's not even the real reason why JimF wouldn't stop using ExtensionClass. He's already got too much code invested in EC. However EC can be a big pill to swallow for some applications because it's a C extension (and because it has some surprising non-Pythonic side effects). In those situations, a pure Python approach, even though slower, is useful.

Agreed. But I'm still hoping to find the silver bullet that lets Jim (and everybody else) do what ExtensionClass does without needing another extension.

...

GvR> - Why is deletion not supported? What if you want to enforce
GvR> a policy on deletions too?

It could be, without much work.

Then it should be -- except I prefer to do only getattr anyway, see below.

...

GvR> - It's ugly to use the same call for get and set. The
GvR> examples indicate that it's not such a great idea: every
GvR> example has *two* tests whether it's get or set. To share a
GvR> policy, the proper thing to do is to write a method that
GvR> either get or set can use.

I don't have strong feelings either way.

What does Jython do? I thought it only did set (hence the name :-). I think there's no *need* for findattr to catch the setattr operation, because __setattr__ *already* gets invoked on each set not just ones where the attr doesn't yet exist.

...

GvR> - I think it would be sufficient to *only* use __findattr__
GvR> for getattr -- __setattr__ and __delattr__ already have full
GvR> control. The "one routine to implement the policy" argument
GvR> doesn't really hold, I think.

What about the ability to use "normal" x.name attribute access syntax inside the hook? Let me guess your answer. :)

Aha! You got me there. Clearly the REAL reason for wanting __findattr__ is the no-recursive-calls rule -- which is also the most uncooked feature... Traditional getattr hooks don't need this as much because they don't get called when the attribute already exists; traditional setattr hooks deal with it by switching on the attribute name. The no-recursive-calls rule certainly SEEMS an attractive way around this. But I'm not sure that it really is... I need to get my head around this more. (The only reason I'm still posting this reply is to test the new mailing lists setup via mail.python.org.)

...

GvR> - The PEP says that the "in-findattr" flag is set on the
GvR> instance. We've already determined that this is not
GvR> thread-safe. This is not just a bug in the implementation --
GvR> it's a bug in the specification. I also find it ugly. But
GvR> if we decide to do this, it can go in the thread-state -- if
GvR> we ever add coroutines, we have to decide on what stuff to
GvR> move from the thread state to the coroutine state anyway.

Right. That's where we've ended up in subsequent messages on this thread.

GvR> - It's also easy to conceive situations where recursive
GvR> __findattr__ calls on the same instance in the same
GvR> thread/coroutine are perfectly desirable -- e.g. when
GvR> __findattr__ ends up calling a method that uses a lot of
GvR> internal machinery of the class. You don't want all the
GvR> machinery to have to be aware of the fact that it may be
GvR> called with __findattr__ on the stack and without it.

Hmm, okay, I don't really understand your example. I suppose I'm envisioning __findattr__ as a way to provide an interface to clients of the class. Maybe it's a bean interface, maybe it's an acquisition interface or an access control interface. The internal machinery has to know something about how that interface is implemented, so whether __findattr__ is recursive or not doesn't seem to enter into it.

But the class is also a client of itself, and not all cases where it is a client of itself are inside a findattr call.

Take your bean example. Suppose your bean class also has a spam() method. The findattr code needs to account for this, e.g.:

    def __findattr__(self, name, *args):
        if name == "spam" and not args:
            return self.spam
        ...original body here...

Or you have to add a _get_spam() method:

    def _get_spam(self):
        return self.spam

Either solution gets tedious if there are a lot of methods; instead, findattr could check if the attr is defined on the class, and then return that:

    def __findattr__(self, name, *args):
        if not args and name[0] != '_' and hasattr(self.__class__, name):
            return getattr(self, name)
        ...original body here...

Anyway, let's go back to the spam method. Suppose it references self.foo. The findattr machinery will access it. Fine.

But now consider another attribute (bar) with _set_bar() and _get_bar() methods that do a little more. Maybe bar is really calculated from the value of self.foo. Then _get_bar cannot use self.foo (because it's inside findattr so findattr won't resolve it, and self.foo doesn't actually exist on the instance) so it has to use self.__myfoo. Fine -- after all this is inside a _get_* handler, which knows it's being called from findattr.

But what if, instead of needing self.foo, _get_bar wants to call self.spam() in order? Then self.spam() is being called from inside findattr, so when it accesses self.foo, findattr isn't used -- and it fails with an AttributeError!

Sorry for the long detour, but *that's* the problem I was referring to. I think the scenario is quite realistic.
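Guido's spam()/bar scenario can be reproduced in present-day Python. The sketch below is an emulation, not the PEP's patch: it assumes `__getattribute__` as the always-called hook, models the original per-instance "in findattr" flag as a hypothetical `_in_hook` attribute, and uses a hypothetical `_findattr` method in place of `__findattr__`.

```python
class Bean:
    def __init__(self, x):
        self._myfoo = x
        self._in_hook = False   # assumed stand-in for the "in-findattr" flag

    def __getattribute__(self, name):
        plain = object.__getattribute__
        # Private names, and any access made while the hook is already
        # running, go straight to the normal lookup.
        if name.startswith("_") or plain(self, "_in_hook"):
            return plain(self, name)
        object.__setattr__(self, "_in_hook", True)
        try:
            return plain(self, "_findattr")(name)
        finally:
            object.__setattr__(self, "_in_hook", False)

    def _findattr(self, name):
        # Methods defined on the class are returned as-is.
        if hasattr(type(self), name):
            return getattr(self, name)
        # Everything else is dispatched to a _get_<name> handler.
        return getattr(self, "_get_" + name)()

    def _get_foo(self):
        return self._myfoo

    def _get_bar(self):
        # Calls a regular method from inside the hook.
        return self.spam() * 2

    def spam(self):
        # Relies on the hook to resolve self.foo.
        return self.foo + 1
```

Called from outside, `Bean(3).spam()` works because `self.foo` goes through the hook. Reached through `bar`, the same method runs with the hook disabled, so `self.foo` is looked up directly on the instance and raises AttributeError, which is the failure described above.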

...

And also, allowing __findattr__ to be recursive will just impose different constraints on the internal machinery methods, just like __setattr__ currently does. I.e. you better know that you're in __setattr__ and not do self.name type things, or you'll recurse forever.

Actually, this is usually solved by having __setattr__ check for specific names only, and for others do self.__dict__[name] = value; that way, recursive __setattr__ calls are okay. Similar for __getattr__ (which has to raise AttributeError for unrecognized names).
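The pattern Guido describes (special-case a few names, store everything else through self.__dict__, and raise AttributeError for unknown names) might look like this in present-day Python. The class and attribute names are hypothetical and chosen only for illustration.

```python
class Temperature:
    def __setattr__(self, name, value):
        if name == "celsius":
            # Special-cased name: store a derived value instead.
            self.__dict__["kelvin"] = value + 273.15
        else:
            # Everything else bypasses __setattr__ via __dict__,
            # so no recursion can occur.
            self.__dict__[name] = value

    def __getattr__(self, name):
        # Only called when normal lookup fails.
        if name == "celsius":
            return self.kelvin - 273.15
        raise AttributeError(name)
```

Since `__getattr__` runs only for names that normal lookup cannot find, `self.kelvin` inside it resolves directly once the value has been stored.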

...

GvR> So perhaps it may be better to only treat the body of
GvR> __findattr__ itself special, as Moshe suggested.

Maybe I'm being dense, but I'm not sure exactly what this means, or how you would do this.

Read Moshe's messages (and Martin's replies) again. I don't care that much for it so I won't explain it again.

...

GvR> What does Jython do here?

It's not exactly equivalent, because Jython's __findattr__ can't call back into Python.

I'd say that Jython's __findattr__ is an entirely different beast than what we have here. Its main purpose in life appears to be to be a getattr equivalent that returns NULL instead of raising an exception when the attribute isn't found -- which is reasonable because from within Java, testing for null is much cheaper than checking for an exception, and you often need to look whether a given attribute exists and do some default action if not. (In fact, I'd say that CPython could also use a findattr of this kind...)

This is really too bad. Based on the name similarity and things I thought you'd said in private before, I thought that they would be similar. Then the experience with Jython would be a good argument for adding a findattr hook to CPython. But now that they are totally different beasts it doesn't help at all.

...

GvR> - The code examples require a *lot* of effort to understand.
GvR> These are complicated issues! (I rewrote the Bean example
GvR> using __getattr__ and __setattr__ and found no need for
GvR> __findattr__; the __getattr__ version is simpler and easier
GvR> to understand. I'm still studying the other __findattr__
GvR> examples.)

Is it simpler because you separated out the set and get behavior? If __findattr__ only did getting, I think it would be a lot simpler too (but I'd still be interested in seeing your __getattr__ only example).

Here's my getattr example. It's more lines of code, but cleaner IMHO:

    class Bean:
        def __init__(self, x):
            self.__myfoo = x

        def __isprivate(self, name):
            return name.startswith('_')

        def __getattr__(self, name):
            if self.__isprivate(name):
                raise AttributeError, name
            return getattr(self, "_get_" + name)()

        def __setattr__(self, name, value):
            if self.__isprivate(name):
                self.__dict__[name] = value
            else:
                return getattr(self, "_set_" + name)(value)

        def _set_foo(self, x):
            self.__myfoo = x

        def _get_foo(self):
            return self.__myfoo

    b = Bean(3)
    print b.foo
    b.foo = 9
    print b.foo
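For readers on a current interpreter, the example above uses Python 2 syntax (`raise AttributeError, name` and the `print` statement). A Python 3 rendering, offered as a straightforward port and not as part of the original message, could read:

```python
class Bean:
    def __init__(self, x):
        # Name-mangled to _Bean__myfoo, which starts with "_" and is
        # therefore stored directly in the instance dict.
        self.__myfoo = x

    def __isprivate(self, name):
        return name.startswith("_")

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if self.__isprivate(name):
            raise AttributeError(name)
        return getattr(self, "_get_" + name)()

    def __setattr__(self, name, value):
        if self.__isprivate(name):
            self.__dict__[name] = value
        else:
            return getattr(self, "_set_" + name)(value)

    def _set_foo(self, x):
        self.__myfoo = x

    def _get_foo(self):
        return self.__myfoo


b = Bean(3)
print(b.foo)
b.foo = 9
print(b.foo)
```

The only changes are the exception syntax and the `print` calls; the control flow is the same as in the message.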

...

The acquisition examples are complicated because I wanted to support the same interface that EC's acquisition classes support. All that detail isn't necessary for example code.

I *still* have to study the examples... :-( Will do next.

...

GvR> - The PEP really isn't that long, except for the code
GvR> examples. I recommend reading the patch first -- the patch
GvR> is probably shorter than any specification of the feature can
GvR> be.

Would it be more helpful to remove the examples? If so, where would you put them? It's certainly useful to have examples someplace I think.

No, my point is that the examples need more explanation. Right now the EC example is over 200 lines of brain-exploding code! :-)

...

GvR> There's an easy way (that few people seem to know) to cause
GvR> __getattr__ to be called for virtually all attribute
GvR> accesses: put *all* (user-visible) attributes in a separate
GvR> dictionary. If you want to prevent access to this dictionary
GvR> too (for Zope security enforcement), make it a global indexed
GvR> by id() -- a destructor(__del__) can take care of deleting
GvR> entries here.

Presumably that'd be a module global, right? Maybe within Zope that could be protected,

Yes.

...

but outside of that, that global's always going to be accessible. So are methods, even if given private names.

Aha! Another thing that I expect has been on your agenda for a long time, but which isn't explicit in the PEP (AFAICT): findattr gives *total* control over attribute access, unlike __getattr__ and __setattr__ and private name mangling, which can all be defeated. And this may be one of the things that Jim is after with ExtensionClasses in Zope. Although I believe that in DTML, he doesn't trust this: he uses source-level (or bytecode-level) transformations to turn all X.Y operations into a call into a security manager. So I'm not sure that the argument is very strong.

...

And I don't think that such code would be any more readable since instead of self.name you'd see stuff like

    def __getattr__(self, name):
        global instdict
        mydict = instdict[id(self)]
        obj = mydict[name]
        ...

    def __setattr__(self, name, val):
        global instdict
        mydict = instdict[id(self)]
        mydict[name] = val
        ...

and that /might/ be a problem with Jython currently, because id()'s may be reused. And relying on __del__ may have unfortunate side effects when viewed in conjunction with garbage collection.

Fair enough. I withdraw the suggestion, and propose restricted execution instead. There, you can use Bastions -- which have problems of their own, but you do get total control.

...

You're probably still unconvinced <wink>, but are you dead-set against it? I can try implementing __findattr__() as a pre-__getattr__ hook only. Then we can live with the current __setattr__() restrictions and see what the examples look like in that situation.

I am dead-set against introducing a feature that I don't fully understand. Let's continue this discussion. --Guido van Rossum (home page: http://www.python.org/~guido/)

bckfnn＠worldonline.dk

3:40 p.m.

On Tue, 05 Dec 2000 07:54:20 -0500, you wrote:

...

Correct. It is also the method to override when making a new builtin type and it will be called on such a type subclass regardless of the presence of any __getattr__ hook and __dict__ content. So I think it has some of the properties which Barry wants.

regards,
finn

barry＠digicool.com

2:20 a.m.

...

...
...
...
...
"FB" == Finn Bock <bckfnn@worldonline.dk> writes:

FB> Correct. It is also the method to override when making a new
FB> builtin type and it will be called on such a type subclass
FB> regardless of the presence of any __getattr__ hook and
FB> __dict__ content. So I think it has some of the properties
FB> which Barry wants.

We had a discussion about this PEP at our group meeting today. Rather than write it all twice, I'm going to try to update the PEP and patch tonight. I think what we came up with will solve most of the problems raised, and will be implementable in Jython (I'll try to work up a Jython patch too, if I don't fall asleep first :)

-Barry

greg＠cosc.canterbury.ac.nz

11:07 p.m.

New subject: Are you all mad? (Re: PEP 231, __findattr__())

I can't believe you're even considering a magic dynamically-scoped flag that invisibly changes the semantics of fundamental operations. To me the idea is utterly insane!

If I understand correctly, the problem is that if you do something like

    def __findattr__(self, name):
        if name == 'spam':
            return self.__dict__['spam']

then self.__dict__ is going to trigger a recursive __findattr__ call.

It seems to me that if you're going to have some sort of hook that is always called on any x.y reference, you need some way of explicitly bypassing it and getting at the underlying machinery.

I can think of a couple of ways:

1) Make the __dict__ attribute special, so that accessing it always bypasses __findattr__.

2) Provide some other way of getting direct access to the attributes of an object, e.g. new builtins called peekattr() and pokeattr().

This assumes that you always know when you write a particular access whether you want it to be a "normal" or "special" one, so that you can use the appropriate mechanism. Are there any cases where this is not true?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
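Greg's second option, an explicit bypass, has a close analogue in present-day Python, where `object.__getattribute__` and `object.__setattr__` reach the default machinery without re-entering an overridden hook. The sketch below wraps them in helpers named after his suggestion; `peekattr` and `pokeattr` are not real builtins, and the `Spam` class is hypothetical.

```python
def peekattr(obj, name):
    """Read an attribute through the default lookup, skipping any hook."""
    return object.__getattribute__(obj, name)


def pokeattr(obj, name, value):
    """Write an attribute through the default machinery, skipping any hook."""
    object.__setattr__(obj, name, value)


class Spam:
    def __init__(self):
        pokeattr(self, "spam", 1)
        pokeattr(self, "log", [])

    def __getattribute__(self, name):
        # Called on every x.y reference; every access inside the hook
        # uses the explicit bypass, so there is no recursion.
        peekattr(self, "log").append(name)
        return peekattr(self, name)
```

Every access inside the hook is visibly either "special" (`self.x`) or "normal" (`peekattr(self, "x")`), so no dynamically scoped flag is needed.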

barry＠digicool.com

2:54 a.m.

New subject: Are you all mad? (Re: PEP 231, __findattr__())

...

...
...
...
...
"greg" == <greg@cosc.canterbury.ac.nz> writes:

| 1) Make the __dict__ attribute special, so that accessing
| it always bypasses __findattr__.

You're not far from what I came up with right after our delicious lunch. We're going to invent a new protocol which passes __dict__ into the method as an argument. That way self.__dict__ doesn't need to be special cased at all because you can get at all the attributes via a local! So no recursion stop hack is necessary.

More in the updated PEP and patch.

-Barry
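The message gives no detail beyond "passes __dict__ into the method as an argument", so the following is only a guess at the general shape, emulated in present-day Python. The hook name `lookup`, its signature, and the class name are all hypothetical, and `__getattribute__` stands in for whatever dispatch the updated patch used.

```python
class DictPassing:
    def __getattribute__(self, name):
        plain = object.__getattribute__
        # Leave dunder names to the default machinery.
        if name.startswith("__") and name.endswith("__"):
            return plain(self, name)
        hook = plain(self, "lookup")
        # Hand the instance dict to the hook as an ordinary argument.
        return hook(plain(self, "__dict__"), name)

    def lookup(self, attrs, name):
        # Hypothetical hook: attrs is the instance dict, reachable as a
        # local, so nothing here goes through self.<name> and no
        # recursion guard is required.
        try:
            return attrs[name]
        except KeyError:
            raise AttributeError(name) from None
```

In this emulation only instance attributes are visible through the hook; a fuller version would also need a policy for methods and class attributes.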

Martin v. Loewis

11:13 p.m.

...

I don't think this is a good idea -- continuations and coroutines might mess it up.

If coroutines and continuations operate preemptively, then they should present themselves as an implementation of the thread API; perhaps the thread API needs to be extended to allow for such a feature.

If yielding control is in the hands of the implementation, it would be easy to rule out a context switch while findattr is in progress.

Regards,
Martin

Martin v. Loewis

11:10 p.m.

...

Moshe Zadka

3:31 a.m.

...

Christian Tismer

6:35 p.m.

Moshe Zadka wrote:

...

barry＠digicool.com

9:23 p.m.

...

...
...
...
...
"CT" == Christian Tismer <tismer@tismer.com> writes:

...

...
...
...
...
"MZ" == Moshe Zadka <moshez@zadka.site.co.il> writes:

Christian Tismer

December 2000

9:35 p.m.

"Barry A. Warsaw" wrote:

...

Maybe this is better. Surely easier. :) [ThreadState solution - well fine so far]

...

barry＠digicool.com

10:58 p.m.

...

...
...
...
...
"CT" == Christian Tismer <tismer@tismer.com> writes:

Martin v. Loewis

11:19 p.m.

...

Guido van Rossum

11:16 p.m.

barry＠digicool.com

2:54 a.m.

...

...
...
...
...
"GvR" == Guido van Rossum <guido@python.org> writes:

Guido van Rossum

12:54 p.m.

...

...
...
...
...
...
"GvR" == Guido van Rossum <guido@python.org> writes:

GvR> - Do you really think that JimF would do away with
GvR> ExtensionClasses if __findattr__ was introduced? I kinda
GvR> doubt it. See [*footnote]. It seems that *using*
GvR> __findattr__ is expensive (even if *not* using is cheap :-).

That's not even the real reason why JimF wouldn't stop using ExtensionClass. He's already got too much code invested in EC. However EC can be a big pill to swallow for some applications because it's a C extension (and because it has some surprising non-Pythonic side effects). In those situations, a pure Python approach, even though slower, is useful.

Agreed. But I'm still hoping to find the silver bullet that lets Jim (and everybody else) do what ExtensionClass does without needing another extension.

...

GvR> - Why is deletion not supported? What if you want to enforce
GvR> a policy on deletions too?

It could be, without much work.

Then it should be -- except I prefer to do only getattr anyway, see below.

...

GvR> - It's ugly to use the same call for get and set. The
GvR> examples indicate that it's not such a great idea: every
GvR> example has *two* tests whether it's get or set. To share a
GvR> policy, the proper thing to do is to write a method that
GvR> either get or set can use.

I don't have strong feelings either way.

...

GvR> - I think it would be sufficient to *only* use __findattr__
GvR> for getattr -- __setattr__ and __delattr__ already have full
GvR> control. The "one routine to implement the policy" argument
GvR> doesn't really hold, I think.

What about the ability to use "normal" x.name attribute access syntax inside the hook? Let me guess your answer. :)

...

GvR> - The PEP says that the "in-findattr" flag is set on the
GvR> instance. We've already determined that this is not
GvR> thread-safe. This is not just a bug in the implementation --
GvR> it's a bug in the specification. I also find it ugly. But
GvR> if we decide to do this, it can go in the thread-state -- if
GvR> we ever add coroutines, we have to decide on what stuff to
GvR> move from the thread state to the coroutine state anyway.

Right. That's where we've ended up in subsequent messages on this thread.

GvR> - It's also easy to conceive situations where recursive
GvR> __findattr__ calls on the same instance in the same
GvR> thread/coroutine are perfectly desirable -- e.g. when
GvR> __findattr__ ends up calling a method that uses a lot of
GvR> internal machinery of the class. You don't want all the
GvR> machinery to have to be aware of the fact that it may be
GvR> called with __findattr__ on the stack and without it.

Hmm, okay, I don't really understand your example. I suppose I'm envisioning __findattr__ as a way to provide an interface to clients of the class. Maybe it's a bean interface, maybe it's an acquisition interface or an access control interface. The internal machinery has to know something about how that interface is implemented, so whether __findattr__ is recursive or not doesn't seem to enter into it.

...

And also, allowing __findattr__ to be recursive will just impose different constraints on the internal machinery methods, just like __setattr__ currently does. I.e. you better know that you're in __setattr__ and not do self.name type things, or you'll recurse forever.
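The existing `__setattr__` constraint mentioned here can be shown in a few lines. This is a small illustration with made-up class names (`Recursing`, `Careful`) and a made-up bookkeeping attribute `last_set`.

```python
class Recursing:
    def __setattr__(self, name, value):
        # A self.name-style assignment inside __setattr__ calls
        # __setattr__ again, without end.
        self.last_set = name
        self.__dict__[name] = value


class Careful:
    def __setattr__(self, name, value):
        # Writing through __dict__ does not go through __setattr__.
        self.__dict__['last_set'] = name
        self.__dict__[name] = value


try:
    Recursing().x = 1
except RecursionError:
    recursed = True
else:
    recursed = False

c = Careful()
c.x = 1
```

The second class works only because its author knows the code runs inside `__setattr__` and avoids `self.name = ...` there, which is the kind of constraint a recursive `__findattr__` would add for reads.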

...

GvR> So perhaps it may be better to only treat the body of
GvR> __findattr__ itself special, as Moshe suggested.

Maybe I'm being dense, but I'm not sure exactly what this means, or how you would do this.

Read Moshe's messages (and Martin's replies) again. I don't care that much for it so I won't explain it again.

...

GvR> What does Jython do here?

It's not exactly equivalent, because Jython's __findattr__ can't call back into Python.

...

GvR> - The code examples require a *lot* of effort to understand.
GvR> These are complicated issues! (I rewrote the Bean example
GvR> using __getattr__ and __setattr__ and found no need for
GvR> __findattr__; the __getattr__ version is simpler and easier
GvR> to understand. I'm still studying the other __findattr__
GvR> examples.)

Is it simpler because you separated out the set and get behavior? If __findattr__ only did getting, I think it would be a lot simpler too (but I'd still be interested in seeing your __getattr__-only example).

...

The acquisition examples are complicated because I wanted to support the same interface that EC's acquisition classes support. All that detail isn't necessary for example code.

I *still* have to study the examples... :-( Will do next.

...

GvR> - The PEP really isn't that long, except for the code
GvR> examples. I recommend reading the patch first -- the patch
GvR> is probably shorter than any specification of the feature can
GvR> be.

Would it be more helpful to remove the examples? If so, where would you put them? It's certainly useful to have examples someplace I think.

No, my point is that the examples need more explanation. Right now the EC example is over 200 lines of brain-exploding code! :-)

...

GvR> There's an easy way (that few people seem to know) to cause
GvR> __getattr__ to be called for virtually all attribute
GvR> accesses: put *all* (user-visible) attributes in a separate
GvR> dictionary. If you want to prevent access to this dictionary
GvR> too (for Zope security enforcement), make it a global indexed
GvR> by id() -- a destructor (__del__) can take care of deleting
GvR> entries here.

Presumably that'd be a module global, right? Maybe within Zope that could be protected,

Yes.

...

but outside of that, that global's always going to be accessible. So are methods, even if given private names.

...

And I don't think that such code would be any more readable since instead of self.name you'd see stuff like

    def __getattr__(self, name):
        global instdict
        mydict = instdict[id(self)]
        obj = mydict[name]
        ...

    def __setattr__(self, name, val):
        global instdict
        mydict = instdict[id(self)]
        mydict[name] = val
        ...

and that /might/ be a problem with Jython currently, because id()'s may be reused. And relying on __del__ may have unfortunate side effects when viewed in conjunction with garbage collection.

Fair enough. I withdraw the suggestion, and propose restricted execution instead. There, you can use Bastions -- which have problems of their own, but you do get total control.

...

You're probably still unconvinced <wink>, but are you dead-set against it? I can try implementing __findattr__() as a pre-__getattr__ hook only. Then we can live with the current __setattr__() restrictions and see what the examples look like in that situation.

I am dead-set against introducing a feature that I don't fully understand. Let's continue this discussion.

--Guido van Rossum (home page: http://www.python.org/~guido/)