[Python-ideas] Access to function objects

Tue Aug 9 01:00:17 CEST 2011

On Mon, Aug 8, 2011 at 11:07 PM, Guido van Rossum <guido at python.org> wrote:
> On Sun, Aug 7, 2011 at 7:56 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> then, at call time, 'fib' will resolve to the caching wrapper rather
>> than to the undecorated function. Using a reference to the undecorated
>> function instead (as would have to happen for a sane implementation of
>> __func__) would be actively harmful since the recursive calls would
>> bypass the cache unless the lru_cache decorator took steps to change
>> the way the reference evolved:
>>
>> @lru_cache()
>> def fib(n):
>>    if n < 2:
>>        return n
>>    return __func__(n-1) + __func__(n-2) # Not the same, unless lru_cache adjusts the reference
>
> How would the reference be adjusted?

I was thinking of Michael's blog post about modifying cell contents
[1], but I had forgotten that the write operation required mucking
about with ctypes (since the cell_contents attribute of the cell is
read only at the Python level). That's actually a good thing, since it
means a cell based __func__ reference would consistently refer to the
innermost function, ignoring any wrapper functions.

[1] http://www.voidspace.org.uk/python/weblog/arch_d7_2011_05_28.shtml#e1214

>> This semantic mismatch has actually shifted my opinion from +0 to -1
>> on the idea. Relying on normal name lookup can be occasionally
>> inconvenient, but it is at least clear what we're referring to. The
>> existence of wrapper functions means that "this function" isn't as
>> clear and unambiguous a phrase as it first seems.
>
> To me it just means that __func__ will remain esoteric, which is just
> fine with me. I wouldn't be surprised if there were use cases where it
> was *desirable* to have a way (from the inside) to access the
> undecorated function (somewhat similar to the thing with modules
> below).

Yeah, now that I remember that you have to use the C API in order to
monkey with the contents of a cell reference, I'm significantly
happier with the idea that __func__ could be given solid 'always
refers to the innermost unwrapped function definition' semantics.
Referring to the function by name would remain the way to access the
potentially wrapped version that is stored in the containing
namespace.
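
A rough sketch of that distinction, using today's closures to stand in for
the proposed cell. Nothing here is the proposed feature itself:
hypothetical_tracer, define_greet and innermost are names made up for the
illustration, and innermost only approximates what a cell-based __func__
might hold.

```python
import functools


def hypothetical_tracer(func):
    """Illustrative pass-through wrapper; functools.wraps sets __wrapped__."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def define_greet():
    @hypothetical_tracer
    def greet():
        # 'innermost' stands in for a cell-based __func__ (an assumption);
        # 'greet' is the ordinary name lookup in the containing scope.
        return innermost, greet

    # Point the stand-in cell at the undecorated function object.
    innermost = greet.__wrapped__
    return greet


greet = define_greet()
via_cell, via_name = greet()

print(via_name is greet)              # the name resolves to the wrapper
print(via_cell is greet.__wrapped__)  # the cell holds the undecorated function
```

The two references name different objects as soon as any decorator returns
something other than the function it was given.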

That's enough to get me back to -0. The reason I remain slightly
negative is that I'd like to see some concrete use cases where
ignoring wrapper functions is the right thing to do - every case that
comes to mind for me is like the lru_cache() Fibonacci example, where
bypassing the wrapper functions is precisely the *wrong* thing to do.
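
To make the Fibonacci concern concrete, here is a small sketch comparing
recursion through the decorated name with recursion through a direct
reference to the undecorated function. The second variant only emulates
what an "innermost function" reference would do; call_counts,
fib_by_name and _fib_innermost are illustrative names, not part of any
proposal.

```python
from functools import lru_cache

call_counts = {"by_name": 0, "innermost": 0}


@lru_cache(maxsize=None)
def fib_by_name(n):
    call_counts["by_name"] += 1
    if n < 2:
        return n
    # Name lookup finds the caching wrapper, so each n is computed once.
    return fib_by_name(n - 1) + fib_by_name(n - 2)


def _fib_innermost(n):
    call_counts["innermost"] += 1
    if n < 2:
        return n
    # Recursing on the undecorated function never consults the cache.
    return _fib_innermost(n - 1) + _fib_innermost(n - 2)


# Only the outermost call goes through the cache in this variant.
fib_innermost = lru_cache(maxsize=None)(_fib_innermost)

result_by_name = fib_by_name(15)
result_innermost = fib_innermost(15)

print(result_by_name == result_innermost)
print(call_counts["by_name"], call_counts["innermost"])
```

Both variants return the same value, but the second one runs the function
body exponentially more often because only the top-level call is cached.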

>> (I think the reason we get away with it in the PEP 3135 case is that
>> 'class wrappers' typically aren't handled via class decorators but via
>> metaclasses, which do a better job of playing nicely with the implicit
>> closure created to handle super() and __class__)
>
> We didn't have class decorators then, did we? Anyway I'm not sure what
> the semantics are, but I hope they will be such that __class__
> references the undecorated, original class object used when the method
> was being defined. (If the class statement is executed repeatedly the
> __class__ should always refer to the "real" class actually involved in
> the method call.)

Yeah, __class__ always refers to the original class object, as created
by calling the metaclass. The idiom seems to be that people don't use
class decorators to wrap classes anyway, as metaclasses are a better
tool for that kind of thing - decorators are more used for things like
registration or attribute modifications.
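
For anyone who wants to check the __class__ behaviour, a minimal sketch
along these lines should show it. The decorator is deliberately the kind
of wrapping that people tend to avoid, and hypothetical_wrapping_decorator,
Greeter and defining_class are names invented for the example.

```python
def hypothetical_wrapping_decorator(cls):
    """Illustrative decorator that replaces the class with a subclass."""
    class Wrapper(cls):
        pass
    Wrapper.__name__ = cls.__name__
    return Wrapper


@hypothetical_wrapping_decorator
class Greeter:
    def defining_class(self):
        # The implicit __class__ cell was filled in when the metaclass
        # created the original class, before the decorator ran.
        return __class__


# After decoration the name 'Greeter' is bound to the Wrapper subclass;
# the original class is its first base in the MRO.
original = Greeter.__mro__[1]
instance = Greeter()

print(instance.defining_class() is original)  # the undecorated class
print(instance.defining_class() is Greeter)   # not the rebound name
```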

>> Since the idea of implicitly sharing state between currently
>> independent wrapper functions scares me, this strikes me as another
>> reason to switch to '-1'.
>
> I'm still wavering between -0 and +0; I see some merit but I think the
> high hopes of some folks for __func__ are unwarranted. Using the same
> cell-based mechanism as used for __class__ may or may not be the right
> implementation but I don't think that additional hacks based on
> mutating that cell should be considered. So it would really be a wash
> how it was done (at call time or at func def time). Are you aware of
> anything that mutates the __class__ cell? It would seem pretty tricky
> to do.

No, I was misremembering how Michael's cell content modification trick
worked, and that was throwing off my opinion of how easy it was to
mess with the cell contents. Once people start using ctypes to access
the C API all bets are off anyway.

> FWIW I don't think I want __func__ to be available at all times, like
> someone (the OP?) mentioned. That seems an unnecessary slowdown of
> every call / increase of every frame.

Yeah, I think at least that part of the PEP 3135 approach should be
copied, even if the implementation ended up being different.

So the short version of my current opinion would be:

__func__: -0
- this discussion has pretty much sorted out what the semantics of
such a reference would be
- we know at least one way to implement it that works (cell based,
modelled on PEP 3135's __class__ reference)
- lacking concrete use cases where it is demonstrably superior to
reference by name lookup in the containing scope (given that many use
cases are forced into the use of name lookup in order to refer to a
wrapped version of the function)

__this_module__: +0
- the sys.modules[__name__] approach is obscure and distracting when
reading code
- there are cases where that approach is unreliable (e.g. involving
runpy.run_module with sys module alteration disabled)
- obvious naming (i.e. __module__) is problematic, since class and
function __module__ attributes are strings
- will need to be careful to avoid creating uncollectable garbage due
to the cyclic reference between the module and its global namespace
(but shouldn't be any worse than the cycle created by any module that
imports the sys module)
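
The runpy point in the list above can be sketched roughly as follows,
assuming a throwaway module written to a temporary directory. The module
name hypothetical_demo_mod and the variable found_in_sys_modules are
made up for the illustration.

```python
import os
import runpy
import sys
import tempfile
import textwrap

# Source for a throwaway module that records whether the
# sys.modules[__name__] idiom would find an entry while it runs.
source = textwrap.dedent(
    """
    import sys
    found_in_sys_modules = __name__ in sys.modules
    """
)

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "hypothetical_demo_mod.py"), "w") as f:
        f.write(source)
    sys.path.insert(0, tmp)
    try:
        # Default: the code runs in a fresh namespace, with no
        # sys.modules entry registered under its __name__.
        without_alter = runpy.run_module("hypothetical_demo_mod",
                                         alter_sys=False)
        # alter_sys=True registers a temporary module for the duration.
        with_alter = runpy.run_module("hypothetical_demo_mod",
                                      alter_sys=True)
    finally:
        sys.path.remove(tmp)

print(without_alter["found_in_sys_modules"])
print(with_alter["found_in_sys_modules"])
```

In the first run sys.modules[__name__] would raise KeyError, since nothing
is registered under that name. Had the module already been imported, the
lookup would quietly return that other module object instead.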

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia