[Python-ideas] Access to function objects

Nick Coghlan ncoghlan at gmail.com
Mon Aug 8 01:56:32 CEST 2011


On Sun, Aug 7, 2011 at 11:07 PM, Guido van Rossum <guido at python.org> wrote:
> On Sun, Aug 7, 2011 at 8:46 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> With a PEP 3135 closure style solution, the cell reference would be
>> filled in at function definition time, so that part shouldn't be an
>> issue.
>
> Yes, I was thinking of something like that (though honestly I'd
> forgotten some of the details :-).

I'd forgotten many of the details as well, but was tracking down some
super() strangeness recently (to answer a question Michael Foord
asked, IIRC) and had to look it up.

> IMO there is no doubt that if __function__ were to exist it should
> reference the innermost function, i.e. the thing that was created by
> the 'def' statement before any decorators were applied.

Yeah, I'd mostly realised that by the time I finished writing my last
message, but figured I'd record the train of thought that got me
there.

>> Reference by name lazily accesses the outermost one, but doesn't care
>> how the decorators are applied (i.e. as part of the def statement or
>> via post decoration).
>
> What do you mean here by lazily?

Just the fact that the reference isn't resolved until the function
executes rather than being resolved when it gets defined.
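
A toy illustration (the names here are purely illustrative):

def original():
    return "original"

def call_by_name():
    # 'original' is looked up when call_by_name runs, not when it is
    # defined, so rebinding the module-level name changes what this calls
    return original()

print(call_by_name())    # original

def replacement():
    return "replacement"

original = replacement   # rebind the name after the fact
print(call_by_name())    # replacement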

>> A __class__ style cell reference to the result
>> of the 'def' statement would behave differently in the post decoration
>> case.
>
> Oh you were thinking of making it reference the result after
> decoration? Maybe I know too much about the implementation, but I
> would find that highly confusing. Do you even have a use case for
> that? If so, I think it should be a separate name, e.g.
> __decorated_function__.

The only reason I was thinking that way is that currently, if you do
something like [1]:

from functools import lru_cache

@lru_cache()
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)

then, at call time, 'fib' will resolve to the caching wrapper rather
than to the undecorated function. Using a reference to the undecorated
function instead (as would have to happen for a sane implementation of
__func__) would be actively harmful since the recursive calls would
bypass the cache unless the lru_cache decorator took steps to change
the way the reference evolved:

@lru_cache()
def fib(n):
    if n < 2:
        return n
    # Not the same, unless lru_cache adjusts the reference
    return __func__(n-1) + __func__(n-2)

This semantic mismatch has actually shifted my opinion from +0 to -1
on the idea. Relying on normal name lookup can be occasionally
inconvenient, but it is at least clear what we're referring to. The
existence of wrapper functions means that "this function" isn't as
clear and unambiguous a phrase as it first seems.

(I think the reason we get away with it in the PEP 3135 case is that
'class wrappers' typically aren't handled via class decorators but via
metaclasses, which do a better job of playing nicely with the implicit
closure created to handle super() and __class__.)
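
To make that concrete with a deliberately contrived class decorator (just
a sketch, not something from the thread): the __class__ cell is filled in
when the class statement creates the original class, before any decorator
runs, so a decorator that returns a replacement class is invisible to it:

def wrap(cls):
    # Class decorator that replaces the decorated class with a subclass
    class Wrapped(cls):
        pass
    return Wrapped

@wrap
class C:
    def which(self):
        # The implicit __class__ cell was bound when the original class
        # was created, before @wrap ran, so it still names that class
        return __class__

print(C.__name__)            # Wrapped
print(C().which().__name__)  # C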

>> While referencing the innermost function would likely be wrong in any
>> case involving function attributes, having the function in a valid
>> state during decoration will likely mandate filling in the cell
>> reference before invoking any decorators. Perhaps the best solution
>> would be to syntactically reference the innermost function, but
>> provide a clean way in functools to shift the cell reference to a
>> different function (with functools.wraps doing that automatically).
>
> Hm, making it dynamic sounds wrong. I think it makes more sense to
> just share the attribute dict (which is easily done through assignment
> to the wrapping function's __dict__).

Huh, I hadn't even thought of that as a potential alternative to the
update() based approach currently used in functools.wraps (I had to
jump into the interactive interpreter to confirm that functions really
do let you swap out their instance dict).
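
Roughly this, I assume (an untested sketch of the difference between
update()-style copying and outright dict sharing):

import functools

def wrapped():
    "The function being wrapped"

def copied_wrapper():
    pass
functools.update_wrapper(copied_wrapper, wrapped)  # snapshot-style copy

def sharing_wrapper():
    pass
sharing_wrapper.__dict__ = wrapped.__dict__        # share one attribute dict

wrapped.marker = 42   # attribute added to the inner function afterwards
print(hasattr(sharing_wrapper, "marker"))  # True - the shared dict sees it
print(hasattr(copied_wrapper, "marker"))   # False - the snapshot doesn't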

It's interesting that, once again, the status quo deals with this
according to ordinary name resolution rules: any wrapping of the
function will be ignored, *unless* we store the wrapper back into the
original location so the name resolution in the function body will see
it.

Since the idea of implicitly sharing state between currently
independent wrapper functions scares me, this strikes me as another
reason to switch to '-1'.

>> This does seem like an area ripe for subtle decoration related bugs
>> though, especially by contrast with lazy name based lookup.
>
> TBH, personally I am in most cases unhappy with the aggressive copying
> of docstring and other metadata from the wrapped function to the
> wrapper function, and wish the idiom had never been invented.

IIRC, I was the one who actually committed the stdlib blessing of the
idiom in the form of 'functools.wraps'. It was definitely a hack to
deal with the increasing prevalence of wrapper functions as decorators
became more popular - naive introspection was giving too many wrong
answers and tweaking the recommended wrapping process so that
'f.__doc__' would work again seemed like a better option than defining
a complex introspection protocol to handle wrapped functions.
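
The kind of breakage involved looks roughly like this (illustrative
sketch, with made-up decorator names):

import functools

def plain_deco(f):
    def inner(*args, **kwds):
        return f(*args, **kwds)
    return inner

def wrapping_deco(f):
    @functools.wraps(f)            # copies __name__, __doc__, etc. from f
    def inner(*args, **kwds):
        return f(*args, **kwds)
    return inner

@plain_deco
def spam():
    "Does something spammy"

@wrapping_deco
def ham():
    "Does something hammy"

print(spam.__name__, spam.__doc__)  # inner None
print(ham.__name__, ham.__doc__)    # ham Does something hammy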

I still think it was a reasonable way forward (and better than leaving
things as they were), but it's definitely an approach with quite a few
flaws.

>> While this may sound a little hypocritical coming from the author of
>> PEPs 366 and 395, I'm wary of adding new implicit module globals for
>> problems with relatively simple and robust alternatives. In this case,
>> it's fairly easy to get access to the current module using the idiom
>> Guido quoted:
>>
>>    import sys
>>    _this = sys.modules[__name__]
>>
>> (or using dict-style access on globals())
>
> Yeah, well, in most cases I find having to reference sys.modules a
> distraction and an unwarranted jump into the implementation. It may
> not even work: there are some recipes that replace
> sys.modules[__name__] with some wrapper object. If __this_module__
> existed it would of course refer to the "real" module object involved.

Some invocations of runpy.run_module also cause the 'sys.modules'
based idioms to fail, so there may be a case to be made for this one.
I suspect some folks would use it to avoid global declarations as well
(i.e. by just writing '__module__.x = y').
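
For illustration, the current spelling of that trick via the sys.modules
idiom (since a module reference like '__module__' is purely hypothetical):

import sys

counter = 0

def bump():
    # The usual alternative is "global counter" followed by "counter += 1";
    # with a handle on the module object, attribute assignment works instead
    _this = sys.modules[__name__]
    _this.counter = _this.counter + 1

bump()
print(counter)  # 1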

It might cause the cyclic GC some grief, though, so the implementation
consequences would need to be investigated if someone wanted to pursue
it.

Cheers,
Nick.

[1] http://docs.python.org/dev/library/functools.html#functools.lru_cache

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


