On 3 January 2016 at 04:56, Andrew Barnert via Python-ideas wrote:
> On Jan 2, 2016, at 10:14, u8y7541 The Awesome Person wrote:
>> In most decorator tutorials, it's taught using functions inside functions. Isn't this inefficient because every time the decorator is called, you're redefining the function, which takes up much more memory?
No.
First, most decorators are only called once. For example:
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)
The lru_cache function gets called once, and creates and returns a decorator function. That decorator function is passed fib, and creates and returns a wrapper function. That wrapper function is what gets stored in the globals as fib. It may then get called a zillion times, but it doesn't create or call any new function.
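That layering can be made explicit with a hand-rolled sketch of the same structure. (The names my_cache, decorating_function and wrapper are illustrative, not the actual functools internals, and maxsize is accepted but ignored here.)

```python
from functools import wraps

def my_cache(maxsize=None):            # decorator factory: runs once per decoration site
    def decorating_function(func):     # transient decorator: runs once per function
        cache = {}                     # maxsize is ignored in this sketch
        @wraps(func)
        def wrapper(*args):            # wrapper: runs on every call to the wrapped function
            if args not in cache:
                cache[args] = func(*args)
            return cache[args]
        return wrapper
    return decorating_function

@my_cache(maxsize=None)
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)

print(fib(30))  # 832040
```

Only the innermost wrapper survives past the function definition; the factory and the transient decorator each run exactly once.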
So, why should you care whether lru_cache is implemented with a function or a class? You're talking about a difference of a few dozen bytes, once in the entire lifetime of your program.
We need to make a slight terminology clarification here, as the answer to Surya's question changes depending on whether we're talking about implementing wrapper functions inside decorators (like the "_lru_cache_wrapper" that lru_cache wraps around the passed-in callable), or about implementing decorators inside decorator factories (like the transient "decorating_function" that lru_cache uses to apply the wrapper to the function being defined). Most of the time that distinction isn't important, so folks use the more informal approach of using "decorator" to refer to both decorators and decorator factories, but this is a situation where the difference matters.

Every decorator factory does roughly the same thing: when called, it produces a new instance of a callable type which accepts a single function as its sole argument. From the perspective of the *user* of the decorator factory, it doesn't matter whether internally that's handled using a def statement, a lambda expression, functools.partial, instantiating a class that defines a custom __call__ method, or some other technique. It's also rare for decorator factories to be invoked in code that's a performance bottleneck, so it's generally more important to optimise for readability and maintainability when writing them than it is to optimise for speed.

The wrapper functions themselves, though, exist in a one-to-one correspondence with the functions they're applied to: when you apply functools.lru_cache to a function, the transient decorator produced by the decorator factory only lasts as long as the execution of the function definition, but the wrapper function lasts for as long as the wrapped function does, and gets invoked every time that function is called (and if a function is performance critical enough for the results to be worth caching, then it's likely performance critical enough to be thinking about micro-optimisations).
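To illustrate that flexibility, here's a sketch of a caching wrapper implemented as a class with a custom __call__ method instead of a nested def (CachingWrapper and my_cache are made-up names for illustration, not part of functools):

```python
class CachingWrapper:
    """One instance exists per wrapped function, just like a nested def."""
    def __init__(self, func):
        self.func = func
        self.cache = {}

    def __call__(self, *args):   # invoked on every call to the wrapped function
        if args not in self.cache:
            self.cache[args] = self.func(*args)
        return self.cache[args]

def my_cache(maxsize=None):      # decorator factory: the class itself plays
    return CachingWrapper        # the role of the transient decorator

@my_cache()
def fib(n):
    return n if n < 2 else fib(n-1) + fib(n-2)

print(fib(30))  # 832040
```

From the caller's point of view, fib behaves the same either way; the difference is purely in how the wrapper object is constructed and invoked, which is exactly what the questions below measure.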
As such, from a micro-optimisation perspective, it's reasonable to want to know the answers to:

* Which is faster, defining a new function object, or instantiating an existing class?
* Which is faster, calling a function object that accepts a single parameter, or calling a class with a custom __call__ method?
* Which uses more memory, defining a new function object, or instantiating an existing class?

The answers to these questions can technically vary by implementation, but in practice, CPython's likely to be representative of their *relative* performance for any given implementation, so we can use it to check whether or not our intuitions about relative speed and memory consumption are correct.

For the first question, then, here are the numbers I get locally for CPython 3.4:

$ python3 -m timeit "def f(): pass"
10000000 loops, best of 3: 0.0744 usec per loop
$ python3 -m timeit -s "class C: pass" "c = C()"
10000000 loops, best of 3: 0.113 usec per loop

The trick here is to realise that *at runtime*, a def statement is really just instantiating a new instance of types.FunctionType - most of the heavy lifting has already been done at compile time. The reason it manages to be faster than typical class instantiation is that we get to use customised bytecode operating on constant values, rather than having to look the class up by name and make a standard function call:
>>> dis.dis("def f(): pass")
  1           0 LOAD_CONST               0 (<code object f at ...>)
              3 LOAD_CONST               1 ('f')
              6 MAKE_FUNCTION            0
              9 STORE_NAME               0 (f)
             12 LOAD_CONST               2 (None)
             15 RETURN_VALUE
>>> dis.dis("c = C()")
  1           0 LOAD_NAME                0 (C)
              3 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
              6 STORE_NAME               1 (c)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE
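The shell measurements above can also be reproduced from inside the interpreter with the timeit module. Absolute numbers will vary by machine and Python version, so no particular figures are assumed here:

```python
import timeit

# Time a million executions of each statement; results are total seconds.
def_time = timeit.timeit("def f(): pass", number=1000000)
cls_time = timeit.timeit("c = C()", setup="class C: pass", number=1000000)

print("def statement:       {:.4f}s".format(def_time))
print("class instantiation: {:.4f}s".format(cls_time))
```

Passing the class definition via setup mirrors the -s option used on the command line, so only the statement being measured runs inside the timing loop.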
For the second question:

$ python3 -m timeit -s "def f(arg): pass" "f(None)"
10000000 loops, best of 3: 0.111 usec per loop
$ python3 -m timeit -s "class C:" -s " def __call__(self, arg): pass" -s "c = C()" "c(None)"
1000000 loops, best of 3: 0.232 usec per loop

Again, we see that the native function outperforms the class with a custom __call__ method. There's no difference in the bytecode this time, but rather a difference in what happens inside the CALL_FUNCTION opcode: for the second case, we first have to retrieve the bound c.__call__() method, and then call *that* as c.__call__(None), which in turn internally calls C.__call__(c, None), while for the native function case we get to skip straight to running the called function.

The speed difference can be significantly reduced (but not entirely eliminated) by caching the bound method during setup:

$ python3 -m timeit -s "class C:" -s " def __call__(self, arg): pass" -s "c_call = C().__call__" "c_call(None)"
10000000 loops, best of 3: 0.115 usec per loop

Finally, we get to the question of relative size: are function instances larger or smaller than your typical class instance? Again, we don't have to guess - we can use the interpreter to experiment and check our assumptions:

>>> import sys
>>> def f(): pass
...
>>> sys.getsizeof(f)
136
>>> class C(): pass
...
>>> sys.getsizeof(C())
56

That's a potentially noticeable difference if we're applying the wrapper often enough - the native function is 80 bytes larger than an empty standard class instance. Looking at the available data attributes on f, we can see the likely causes of the difference:

>>> set(dir(f)) - set(dir(C()))
{'__code__', '__defaults__', '__name__', '__closure__', '__get__', '__kwdefaults__', '__qualname__', '__annotations__', '__globals__', '__call__'}

There are 10 additional attributes there, although 2 of them (__get__ and __call__) relate to methods our native function defines that the empty class doesn't.
The other 8 represent additional pieces of data stored (or potentially stored) per function that we don't store for a typical class instance.

However, we also need to account for the overhead of defining a new class object, and that's a non-trivial amount of memory when we're talking about a size difference of only 80 bytes per wrapped function:

>>> sys.getsizeof(C)
976

That means if a wrapper function is only used a few times in any given run of the program, then native functions will be faster *and* use less memory (at least on CPython). If the wrapper is used more often than that, then native functions will still be the fastest option, but not the lowest memory option.

Furthermore, if we decide to cache the bound __call__ method to reduce the speed impact of using a custom __call__ method, we give up most of the memory gains:

>>> sys.getsizeof(C().__call__)
64

This all suggests that if your application is severely memory constrained (e.g. it's running on an embedded interpreter like MicroPython), then it *might* make sense to incur the extra complexity of using classes with a custom __call__ method to define wrapper functions, rather than just using a nested function. For more typical cases, though, the difference is going to disappear into the noise, so you're likely to be better off defaulting to nested function definitions, and only switching to the class-based version in cases where it's more readable and maintainable (and in those cases considering whether or not it might make sense to return the bound __call__ method from the decorator, rather than the callable itself).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia