[Python-Dev] PEP 231, findattr()

Tue, 05 Dec 2000 07:54:20 -0500

> >>>>> "GvR" == Guido van Rossum <guido@python.org> writes:
> 
>     GvR> - Do you really think that JimF would do away with
>     GvR> ExtensionClasses if __findattr__ was introduced?  I kinda
>     GvR> doubt it.  See [*footnote].  It seems that *using*
>     GvR> __findattr__ is expensive (even if *not* using is cheap :-).
> 
> That's not even the real reason why JimF wouldn't stop using
> ExtensionClass.  He's already got too much code invested in EC.
> However EC can be a big pill to swallow for some applications because
> it's a C extension (and because it has some surprising non-Pythonic
> side effects).  In those situations, a pure Python approach, even
> though slower, is useful.

Agreed.  But I'm still hoping to find the silver bullet that lets Jim
(and everybody else) do what ExtensionClass does without needing
another extension.

>     GvR> - Why is deletion not supported?  What if you want to enforce
>     GvR> a policy on deletions too?
> 
> It could be, without much work.

Then it should be -- except I prefer to do only getattr anyway, see
below.

>     GvR> - It's ugly to use the same call for get and set.  The
>     GvR> examples indicate that it's not such a great idea: every
>     GvR> example has *two* tests whether it's get or set.  To share a
>     GvR> policy, the proper thing to do is to write a method that
>     GvR> either get or set can use.
> 
> I don't have strong feelings either way.

What does Jython do?  I thought it only did get (hence the name :-).
I think there's no *need* for findattr to catch the setattr operation,
because __setattr__ *already* gets invoked on each set, not just ones
where the attr doesn't yet exist.

>     GvR> - I think it would be sufficient to *only* use __findattr__
>     GvR> for getattr -- __setattr__ and __delattr__ already have full
>     GvR> control.  The "one routine to implement the policy" argument
>     GvR> doesn't really hold, I think.
> 
> What about the ability to use "normal" x.name attribute access syntax
> inside the hook?  Let me guess your answer. :)

Aha!  You got me there.  Clearly the REAL reason for wanting
__findattr__ is the no-recursive-calls rule -- which is also the most
uncooked feature...  Traditional getattr hooks don't need this as much
because they don't get called when the attribute already exists;
traditional setattr hooks deal with it by switching on the attribute
name.  The no-recursive-calls rule certainly SEEMS an attractive way
around this.  But I'm not sure that it really is...

I need to get my head around this more.  (The only reason I'm still
posting this reply is to test the new mailing lists setup via
mail.python.org.)

>     GvR> - The PEP says that the "in-findattr" flag is set on the
>     GvR> instance.  We've already determined that this is not
>     GvR> thread-safe.  This is not just a bug in the implementation --
>     GvR> it's a bug in the specification.  I also find it ugly.  But
>     GvR> if we decide to do this, it can go in the thread-state -- if
>     GvR> we ever add coroutines, we have to decide on what stuff to
>     GvR> move from the thread state to the coroutine state anyway.
> 
> Right.  That's where we've ended up in subsequent messages on this thread.
> 
>     GvR> - It's also easy to conceive situations where recursive
>     GvR> __findattr__ calls on the same instance in the same
>     GvR> thread/coroutine are perfectly desirable -- e.g. when
>     GvR> __findattr__ ends up calling a method that uses a lot of
>     GvR> internal machinery of the class.  You don't want all the
>     GvR> machinery to have to be aware of the fact that it may be
>     GvR> called with __findattr__ on the stack and without it.
> 
> Hmm, okay, I don't really understand your example.  I suppose I'm
> envisioning __findattr__ as a way to provide an interface to clients
> of the class.  Maybe it's a bean interface, maybe it's an acquisition
> interface or an access control interface.  The internal machinery has
> to know something about how that interface is implemented, so whether
> __findattr__ is recursive or not doesn't seem to enter into it.

But the class is also a client of itself, and not all cases where it
is a client of itself are inside a findattr call.  Take your bean
example.  Suppose your bean class also has a spam() method.  The
findattr code needs to account for this, e.g.:

    def __findattr__(self, name, *args):
        if name == "spam" and not args:
            return self.spam
        ...original body here...

Or you have to add a _get_spam() method:

    def _get_spam(self):
        return self.spam

Either solution gets tedious if there are a lot of methods; instead,
findattr could check if the attr is defined on the class, and then
return that:

    def __findattr__(self, name, *args):
        if not args and name[0] != '_' and hasattr(self.__class__, name):
            return getattr(self, name)
        ...original body here...

Anyway, let's go back to the spam method.  Suppose it references
self.foo.  The findattr machinery will access it.  Fine.  But now
consider another attribute (bar) with _set_bar() and _get_bar()
methods that do a little more.  Maybe bar is really calculated from
the value of self.foo.  Then _get_bar cannot use self.foo (because
it's inside findattr so findattr won't resolve it, and self.foo
doesn't actually exist on the instance) so it has to use self.__myfoo.
Fine -- after all this is inside a _get_* handler, which knows it's
being called from findattr.  But what if, instead of needing self.foo,
_get_bar wants to call self.spam()?  Then self.spam() is
being called from inside findattr, so when it accesses self.foo,
findattr isn't used -- and it fails with an AttributeError!

Sorry for the long detour, but *that's* the problem I was referring
to.  I think the scenario is quite realistic.
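
The failure can be reproduced in a rough sketch.  Assumptions, since
__findattr__ itself is only a proposal: the hook is approximated with
__getattribute__ plus a per-instance "in hook" flag (not thread-safe,
as already noted), the backing attribute is called _myfoo rather than
__myfoo to sidestep name mangling, and methods defined on the class
are returned directly, as in the shortcut above.

```python
class Bean:
    """Approximates the no-recursive-calls rule (assumed semantics)."""

    def __init__(self, x):
        object.__setattr__(self, "_in_hook", False)
        object.__setattr__(self, "_myfoo", x)

    def __getattribute__(self, name):
        get = object.__getattribute__
        # Private names, and any access made while the hook is already
        # running, use plain lookup and skip the hook.
        if name.startswith("_") or get(self, "_in_hook"):
            return get(self, name)
        # Names defined on the class (e.g. methods) are returned as-is.
        if hasattr(type(self), name):
            return get(self, name)
        object.__setattr__(self, "_in_hook", True)
        try:
            return get(self, "_get_" + name)()
        finally:
            object.__setattr__(self, "_in_hook", False)

    def spam(self):
        return self.foo * 2        # relies on the hook to resolve foo

    def _get_foo(self):
        return self._myfoo

    def _get_bar(self):
        return self.spam() + 1     # spam() now runs inside the hook


b = Bean(3)
print(b.spam())                    # direct call: the hook resolves foo
try:
    b.bar                          # spam() inside the hook: foo is not resolved
except AttributeError as exc:
    print("AttributeError:", exc)
```

The same spam() method works when called directly and fails when
reached through _get_bar, which is exactly the dual-mode burden on
the internal machinery.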

> And also, allowing __findattr__ to be recursive will just impose
> different constraints on the internal machinery methods, just like
> __setattr__ currently does.  I.e. you better know that you're in
> __setattr__ and not do self.name type things, or you'll recurse
> forever. 

Actually, this is usually solved by having __setattr__ check for
specific names only, and for others do self.__dict__[name] = value;
that way, recursive __setattr__ calls are okay.  Similar for
__getattr__ (which has to raise AttributeError for unrecognized
names).
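
A small sketch of that idiom, with a hypothetical Celsius class where
only the name "fahrenheit" gets special treatment:

```python
class Celsius:
    """Hypothetical example: only 'fahrenheit' is handled specially."""

    def __init__(self, degrees):
        self.degrees = degrees     # goes through __setattr__, stored as-is

    def __setattr__(self, name, value):
        if name == "fahrenheit":
            # This assignment re-enters __setattr__, which is harmless:
            # 'degrees' takes the default branch below.
            self.degrees = (value - 32) * 5.0 / 9.0
        else:
            self.__dict__[name] = value

    def __getattr__(self, name):
        if name == "fahrenheit":
            return self.degrees * 9.0 / 5.0 + 32
        # Unrecognized names must raise AttributeError.
        raise AttributeError(name)


t = Celsius(100)
t.fahrenheit = 32
print(t.degrees, t.fahrenheit)
```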

>     GvR> So perhaps it may be better to only treat the body of
>     GvR> __findattr__ itself special, as Moshe suggested.
> 
> Maybe I'm being dense, but I'm not sure exactly what this means, or
> how you would do this.

Read Moshe's messages (and Martin's replies) again.  I don't care that
much for it so I won't explain it again.

>     GvR> What does Jython do here?
> 
> It's not exactly equivalent, because Jython's __findattr__ can't call
> back into Python.

I'd say that Jython's __findattr__ is an entirely different beast than
what we have here.  Its main purpose in life appears to be to be a
getattr equivalent that returns NULL instead of raising an exception
when the attribute isn't found -- which is reasonable because from
within Java, testing for null is much cheaper than checking for an
exception, and you often need to look whether a given attribute exists
and do some default action if not.  (In fact, I'd say that CPython could
also use a findattr of this kind...)

This is really too bad.  Based on the name similarity and things I
thought you'd said in private before, I thought that they would be
similar.  Then the experience with Jython would be a good argument for
adding a findattr hook to CPython.  But now that they are totally
different beasts it doesn't help at all.

>     GvR> - The code examples require a *lot* of effort to understand.
>     GvR> These are complicated issues!  (I rewrote the Bean example
>     GvR> using __getattr__ and __setattr__ and found no need for
>     GvR> __findattr__; the __getattr__ version is simpler and easier
>     GvR> to understand.  I'm still studying the other __findattr__
>     GvR> examples.)
> 
> Is it simpler because you separated out the set and get behavior?  If
> __findattr__ only did getting, I think it would be a lot simpler too
> (but I'd still be interested in seeing your __getattr__ only
> example).

Here's my getattr example.  It's more lines of code, but cleaner IMHO:

    class Bean:
        def __init__(self, x):
            self.__myfoo = x

        def __isprivate(self, name):
            return name.startswith('_')

        def __getattr__(self, name):
            if self.__isprivate(name):
                raise AttributeError(name)
            return getattr(self, "_get_" + name)()

        def __setattr__(self, name, value):
            if self.__isprivate(name):
                self.__dict__[name] = value
            else:
                return getattr(self, "_set_" + name)(value)

        def _set_foo(self, x):
            self.__myfoo = x

        def _get_foo(self):
            return self.__myfoo

    b = Bean(3)
    print(b.foo)
    b.foo = 9
    print(b.foo)

> The acquisition examples are complicated because I wanted
> to support the same interface that EC's acquisition classes support.
> All that detail isn't necessary for example code.

I *still* have to study the examples... :-(  Will do next.

>     GvR> - The PEP really isn't that long, except for the code
>     GvR> examples.  I recommend reading the patch first -- the patch
>     GvR> is probably shorter than any specification of the feature can
>     GvR> be.
> 
> Would it be more helpful to remove the examples?  If so, where would
> you put them?  It's certainly useful to have examples someplace I
> think.

No, my point is that the examples need more explanation.  Right now
the EC example is over 200 lines of brain-exploding code! :-)

>     GvR>   There's an easy way (that few people seem to know) to cause
>     GvR> __getattr__ to be called for virtually all attribute
>     GvR> accesses: put *all* (user-visible) attributes in a separate
>     GvR> dictionary.  If you want to prevent access to this dictionary
>     GvR> too (for Zope security enforcement), make it a global indexed
>     GvR> by id() -- a destructor (__del__) can take care of deleting
>     GvR> entries here.
> 
> Presumably that'd be a module global, right?  Maybe within Zope that
> could be protected,

Yes.

> but outside of that, that global's always going to
> be accessible.  So are methods, even if given private names.

Aha!  Another thing that I expect has been on your agenda for a long
time, but which isn't explicit in the PEP (AFAICT): findattr gives
*total* control over attribute access, unlike __getattr__ and
__setattr__ and private name mangling, which can all be defeated.
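
For illustration, a sketch of how the existing mechanisms can be
bypassed.  The Guarded class is hypothetical, and these are only two
of the possible routes:

```python
class Guarded:
    """Hypothetical class that tries to lock down its attributes."""

    def __init__(self):
        self.__secret = 42         # stored as _Guarded__secret

    def __getattr__(self, name):
        raise AttributeError("access denied: " + name)

    def __setattr__(self, name, value):
        if not name.startswith("_Guarded__"):
            raise AttributeError("read-only: " + name)
        self.__dict__[name] = value


g = Guarded()
leaked = g._Guarded__secret        # mangled name spelled out by hand
g.__dict__["anything"] = 1         # __setattr__ is never consulted
print(leaked, g.anything)
```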

And this may be one of the things that Jim is after with
ExtensionClasses in Zope.  Although I believe that in DTML, he doesn't
trust this: he uses source-level (or bytecode-level) transformations
to turn all X.Y operations into a call into a security manager.

So I'm not sure that the argument is very strong.

> And I
> don't think that such code would be any more readable since instead of
> self.name you'd see stuff like
> 
>     def __getattr__(self, name):
>         global instdict
>         mydict = instdict[id(self)]
>         obj = mydict[name]
>         ...
> 
>     def __setattr__(self, name, val):
>         global instdict
>         mydict = instdict[id(self)]
>         mydict[name] = val
>         ...
> 
> and that /might/ be a problem with Jython currently, because id()'s
> may be reused.  And relying on __del__ may have unfortunate side
> effects when viewed in conjunction with garbage collection.

Fair enough.  I withdraw the suggestion, and propose restricted
execution instead.  There, you can use Bastions -- which have problems
of their own, but you do get total control.

> You're probably still unconvinced <wink>, but are you dead-set against
> it?  I can try implementing __findattr__() as a pre-__getattr__ hook
> only.  Then we can live with the current __setattr__() restrictions
> and see what the examples look like in that situation.

I am dead-set against introducing a feature that I don't fully
understand.  Let's continue this discussion.

--Guido van Rossum (home page: http://www.python.org/~guido/)
