[Python-ideas] Access to function objects

Mon Aug 8 15:07:46 CEST 2011

On Sun, Aug 7, 2011 at 7:56 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Sun, Aug 7, 2011 at 11:07 PM, Guido van Rossum <guido at python.org> wrote:
>> On Sun, Aug 7, 2011 at 8:46 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>> With a PEP 3135 closure style solution, the cell reference would be
>>> filled in at function definition time, so that part shouldn't be an
>>> issue.
>>
>> Yes, I was thinking of something like that (though honestly I'd
>> forgotten some of the details :-).
>
> I'd forgotten many of the details as well, but was tracking down some
> super() strangeness recently (to answer a question Michael Foord
> asked, IIRC) and had to look it up.
>
>> IMO there is no doubt that if __function__ were to exist it should
>> reference the innermost function, i.e. the thing that was created by
>> the 'def' statement before any decorators were applied.
>
> Yeah, I'd mostly realised that by the time I finished writing my last
> message, but figured I'd record the train of thought that got me
> there.
>
>>> Reference by name lazily accesses the outermost one, but doesn't care
>>> how the decorators are applied (i.e. as part of the def statement or
>>> via post decoration).
>>
>> What do you mean here by lazily?
>
> Just the fact that the reference isn't resolved until the function
> executes rather than being resolved when it gets defined.
>
>>> A __class__ style cell reference to the result
>>> of the 'def' statement would behave differently in the post decoration
>>> case.
>>
>> Oh you were thinking of making it reference the result after
>> decoration? Maybe I know too much about the implementation, but I
>> would find that highly confusing. Do you even have a use case for
>> that? If so, I think it should be a separate name, e.g.
>> __decorated_function__.
>
> The only reason I was thinking that way is that currently, if you do
> something like [1]:
>
> @lru_cache()
> def fib(n):
>    if n < 2:
>        return n
>    return fib(n-1) + fib(n-2)
>
> then, at call time, 'fib' will resolve to the caching wrapper rather
> than to the undecorated function. Using a reference to the undecorated
> function instead (as would have to happen for a sane implementation of
> __func__) would be actively harmful since the recursive calls would
> bypass the cache unless the lru_cache decorator took steps to change
> the way the reference evolved:
>
> @lru_cache()
> def fib(n):
>    if n < 2:
>        return n
>    return __func__(n-1) + __func__(n-2) # Not the same, unless lru_cache adjusts the reference

How would the reference be adjusted?

> This semantic mismatch has actually shifted my opinion from +0 to -1
> on the idea. Relying on normal name lookup can be occasionally
> inconvenient, but it is at least clear what we're referring to. The
> existence of wrapper functions means that "this function" isn't as
> clear and unambiguous a phrase as it first seems.

To me it just means that __func__ will remain esoteric, which is just
fine with me. I wouldn't be surprised if there were use cases where it
was *desirable* to have a way (from the inside) to access the
undecorated function (somewhat similar to the thing with modules
below).

Also I really don't want the semantics of decorators to depart from
the original "define the function, then apply this to it" thing. And I
don't want to have to think about the possibility of __func__ being
overridden by the wrapping decorator either (or by anything else).
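
To make the quoted fib concern concrete, here is a minimal sketch.
Since __func__ does not exist, the closure variable 'fib' inside the
made-up helper 'build' stands in for it: it is never rebound, so it
keeps pointing at the undecorated function, while 'wrapper' is the
result of applying lru_cache afterwards. The names 'build',
'recurse_through_wrapper' and 'calls' are assumptions for the
illustration only.

```python
from functools import lru_cache


def build(recurse_through_wrapper):
    calls = []  # one entry per execution of the undecorated body

    def fib(n):
        calls.append(n)
        if n < 2:
            return n
        # 'wrapper' mimics today's lookup by name (finds the decorated
        # function); 'fib' mimics a hypothetical __func__ (undecorated).
        target = wrapper if recurse_through_wrapper else fib
        return target(n - 1) + target(n - 2)

    # "define the function, then apply this to it"
    wrapper = lru_cache(maxsize=None)(fib)
    return wrapper, calls


cached, cached_calls = build(recurse_through_wrapper=True)
direct, direct_calls = build(recurse_through_wrapper=False)

cached_result = cached(15)
direct_result = direct(15)

print(cached_result, len(cached_calls))
print(direct_result, len(direct_calls))
```

Both variants return the same value. When the recursion goes through
the wrapper, the body runs once per distinct argument; with the
stand-in for __func__, only the outermost call passes through the
cache.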

> (I think the reason we get away with it in the PEP 3135 case is that
> 'class wrappers' typically aren't handled via class decorators but via
> metaclasses, which do a better job of playing nicely with the implicit
> closure created to handle super() and __class__)

We didn't have class decorators then, did we? Anyway I'm not sure what
the semantics are, but I hope they will be such that __class__
references the undecorated, original class object used when the method
was being defined. (If the class statement is executed repeatedly the
__class__ should always refer to the "real" class actually involved in
the method call.)
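
A quick way to check this on a current CPython 3 is sketched below.
The decorator 'replace_with_subclass' and the class names are made up
for the example; the decorator deliberately returns a different class
object than the one the class statement created.

```python
def replace_with_subclass(cls):
    """Hypothetical class decorator that returns a new class, not cls."""
    class Replacement(cls):
        pass
    return Replacement


@replace_with_subclass
class Spam:
    def defining_class(self):
        # __class__ is the implicit closure cell (PEP 3135 style)
        return __class__


obj = Spam()
original = obj.defining_class()

print(Spam.__name__)        # the name 'Spam' is bound to the decorator's result
print(original.__name__)    # the cell holds the class built by the statement
```

Here the name 'Spam' ends up bound to the replacement, while the
__class__ cell still refers to the undecorated class that the method
was defined in.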

>>> While referencing the innermost function would likely be wrong in any
>>> case involving function attributes, having the function in a valid
>>> state during decoration will likely mandate filling in the cell
>>> reference before invoking any decorators. Perhaps the best solution
>>> would be to syntactically reference the innermost function, but
>>> provide a clean way in functools to shift the cell reference to a
>>> different function (with functools.wraps doing that automatically).
>>
>> Hm, making it dynamic sounds wrong. I think it makes more sense to
>> just share the attribute dict (which is easily done through assignment
>> to the wrapping function's __dict__).
>
> Huh, I hadn't even thought of that as a potential alternative to the
> update() based approach currently used in functools.wraps (I had to
> jump into the interactive interpreter to confirm that functions really
> do let you swap out their instance dict).

Me too. :-)

But I did remember that we might have made it that way, possibly for
this very use case.
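
For the record, a minimal sketch of the shared-dict approach, as
opposed to the update() based copy. The decorator name
'share_attributes' and the attribute names are invented for the
example; this is not what functools.wraps does today.

```python
def share_attributes(func):
    """Hypothetical decorator: the wrapper shares func's attribute dict."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    # Same dict object, not a copy: later changes show up on both sides.
    wrapper.__dict__ = func.__dict__
    return wrapper


def greet(name):
    return "hello " + name


greet.tag = "original"
wrapped = share_attributes(greet)

wrapped.extra = 1        # set via the wrapper after wrapping
greet.tag = "changed"    # set via the wrapped function after wrapping

print(wrapped.tag, greet.extra)
```

With an update() based copy the wrapper would only see a snapshot
taken at decoration time; with the shared dict, attributes set later
on either function are visible through the other.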

> It's interesting that, once again, the status quo deals with this
> according to ordinary name resolution rules: any wrapping of the
> function will be ignored, *unless* we store the wrapper back into the
> original location so the name resolution in the function body will see
> it.

This makes sense because it builds complex functionality out of
simpler building blocks. Combining two things together doesn't add any
extra magic -- it's the building blocks themselves that add the magic.

> Since the idea of implicitly sharing state between currently
> independent wrapper functions scares me, this strikes me as another
> reason to switch to '-1'.

I'm still wavering between -0 and +0; I see some merit but I think the
high hopes of some folks for __func__ are unwarranted. Using the same
cell-based mechanism as used for __class__ may or may not be the right
implementation but I don't think that additional hacks based on
mutating that cell should be considered. So it would really be a wash
how it was done (at call time or at func def time). Are you aware of
anything that mutates the __class__ cell? It would seem pretty tricky
to do.

FWIW I don't think I want __func__ to be available at all times, like
someone (the OP?) mentioned. That seems an unnecessary slowdown of
every call / increase of every frame.

>>> This does seem like an area ripe for subtle decoration related bugs
>>> though, especially by contrast with lazy name based lookup.
>>
>> TBH, personally I am in most cases unhappy with the aggressive copying
>> of docstring and other metadata from the wrapped function to the
>> wrapper function, and wish the idiom had never been invented.
>
> IIRC, I was the one who actually committed the stdlib blessing of the
> idiom in the form of 'functools.wraps'. It was definitely a hack to
> deal with the increasing prevalence of wrapper functions as decorators
> became more popular - naive introspection was giving too many wrong
> answers and tweaking the recommended wrapping process so that
> 'f.__doc__' would work again seemed like a better option than defining
> a complex introspection protocol to handle wrapped functions.

I guess you rely more on interactive features like help() whereas I
rely more on browsing the source code. :-)

> I still think it was a reasonable way forward (and better than leaving
> things as they were), but it's definitely an approach with quite a few
> flaws.

You are forgiven. :-)

>>> While this may sound a little hypocritical coming from the author of
>>> PEPs 366 and 395, I'm wary of adding new implicit module globals for
>>> problems with relatively simple and robust alternatives. In this case,
>>> it's fairly easy to get access to the current module using the idiom
>>> Guido quoted:
>>>
>>>    import sys
>>>    _this = sys.modules[__name__]
>>>
>>> (or using dict-style access on globals())
>>
>> Yeah, well, in most cases I find having to reference sys.modules a
>> distraction and an unwarranted jump into the implementation. It may
>> not even work: there are some recipes that replace
>> sys.modules[__name__] with some wrapper object. If __this_module__
>> existed it would of course refer to the "real" module object involved.
>
> Some invocations of runpy.run_module also cause the 'sys.modules'
> based idioms to fail, so there may be a case to be made for this one.
> I suspect some folks would use it to avoid global declarations as well
> (i.e. by just writing '__module__.x = y').

+1. But what to call it? __module__ is a string in other places.
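
A small sketch of both points, using a throwaway module built with
types.ModuleType. The module name "demo_mod", the 'Proxy' class and
the helper functions are all made up for the illustration.

```python
import sys
import types

# Source of a made-up module that uses the idioms discussed above.
source = """
import sys

def via_sys_modules():
    return sys.modules[__name__]

def via_globals():
    return globals()
"""

real = types.ModuleType("demo_mod")
exec(source, real.__dict__)
sys.modules["demo_mod"] = real
before = real.via_sys_modules()          # the real module object


class Proxy:
    """Stand-in for a recipe that replaces the sys.modules entry."""
    def __init__(self, module):
        self._module = module

    def __getattr__(self, name):
        return getattr(self._module, name)


sys.modules["demo_mod"] = Proxy(real)
after = real.via_sys_modules()           # now the proxy, not the module

# globals() still reaches the real module's namespace.
same_namespace = real.via_globals() is real.__dict__

# On a function, __module__ is already taken, and it is a plain string.
module_attr = real.via_sys_modules.__module__

del sys.modules["demo_mod"]              # leave sys.modules as we found it
```

After the replacement, the sys.modules idiom hands back the proxy,
whereas dict-style access through globals() is unaffected.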

> It might cause the cyclic GC some grief, though, so the implementation
> consequences would need to be investigated if someone wanted to pursue
> it.

Modules are already involved in much cyclical GC grief, and most have
an infinite lifetime anyway (sys.modules keeps them alive). I doubt it
will get any worse.

> Cheers,
> Nick.
>
> [1] http://docs.python.org/dev/library/functools.html#functools.lru_cache

-- 
--Guido van Rossum (python.org/~guido)