[Python-ideas] Explicit variable capture list
M.-A. Lemburg
mal at egenix.com
Thu Jan 21 08:39:29 EST 2016
On 21.01.2016 14:19, Victor Stinner wrote:
> 2016-01-21 10:39 GMT+01:00 M.-A. Lemburg <mal at egenix.com>:
>> I ran performance tests on these optimization tricks (and
>> others) in 2014. See this talk:
>>
>> http://www.egenix.com/library/presentations/PyCon-UK-2014-When-performance-matters/
>> (slides 33ff.)
>
> Ah nice, thanks for the slides.
Forgot to mention the benchmarks I used:
https://github.com/egenix/when-performance-matters
>> The keyword trick doesn't really pay off in terms of added
>> performance vs. danger of introducing weird bugs.
>
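To illustrate the kind of weird bug meant here, a minimal sketch follows. It assumes the "keyword trick" refers to the default-argument idiom (binding a global as a keyword default at definition time). The names total_default, total_global and data are made up for the example; mylen mirrors the alias used in the microbenchmark.

```python
mylen = len  # module-level alias for the builtin


def total_default(seq, mylen=mylen):
    # "Keyword trick": mylen is captured once, when the def statement runs,
    # and is then read as a fast local inside the function.
    z = 0
    for x in seq:
        if x:
            z += mylen(x)
    return z


def total_global(seq):
    # Ordinary version: mylen is looked up in the module globals on each call.
    z = 0
    for x in seq:
        if x:
            z += mylen(x)
    return z


data = ["ab", "cde", ""]
before = (total_default(data), total_global(data))

# Rebinding the global later is seen by total_global only.
mylen = lambda obj: 1
after = (total_default(data), total_global(data))

# A stray second positional argument silently replaces the captured
# function and only fails once the loop tries to call it.
try:
    total_default(data, 0)
    override_error = None
except TypeError as exc:
    override_error = str(exc)

print(before, after, override_error)
```

The two versions agree until the global is rebound, and then they diverge without any warning. That silent change in semantics is the price of skipping the lookup.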
> I ran a quick microbenchmark to measure the cost of LOAD_GLOBAL to
> load a global: call func("abc") with
>
> mylen = len
> def func(obj): return mylen(obj)
>
> Result:
>
> 117 ns: original bytecode (LOAD_GLOBAL)
> 109 ns: LOAD_CONST
> 116 ns: LOAD_CONST with guard
>
> LOAD_CONST avoids 1 dict lookup (globals) and reduces the runtime by 8
> ns: 7% faster. But the guard has a cost of 7 ns: we only win 1
> nanosecond. Not really interesting here.
>
> LOAD_CONST means that the LOAD_GLOBAL instruction has been replaced
> with a LOAD_CONST instruction. The guard checks if the frame globals
> and globals()['mylen'] didn't change.
>
>
> I ran a second microbenchmark on func("abc") to measure the cost of
> LOAD_GLOBAL to load a builtin: call func("abc") with
>
> def func(obj): return len(obj)
>
> Result:
>
> 124 ns: original bytecode (LOAD_GLOBAL)
> 107 ns: LOAD_CONST
> 116 ns: LOAD_CONST with guard on builtins + globals
>
> LOAD_CONST avoids 2 dict lookups (globals, builtins) and reduces the
> runtime by 17 ns: 14% faster. But the guard has a cost of 9 ns: we win
> 8 nanoseconds, 6% faster.
>
> Here the guard is more complex: it checks that the frame builtins, the
> frame globals, builtins.__dict__['len'] and globals()['len'] didn't
> change.
>
>
> If you avoid guards, it's always faster, but it changes the Python semantics.
>
> The speedup on such a very small example is low. It's more interesting
> when the global or builtin variable is used in a loop: the speedup is
> multiplied by the number of loop iterations.
Sure, but for those, you'd probably simply use the in-function
localization:
    def f(seq):
        z = 0
        local_len = len
        for x in seq:
            if x:
                z += local_len(x)
        return z
This results in a LOAD_FAST inside the loop and is probably
the better way to speed things up.
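A rough way to check this on a given interpreter is to time both variants with timeit. This is only a sketch: the names f_global, f_local and data, and the sizes, are picked for illustration, and the numbers depend heavily on the Python version and machine.

```python
import timeit


def f_global(seq):
    # len is resolved via LOAD_GLOBAL on every iteration.
    z = 0
    for x in seq:
        if x:
            z += len(x)
    return z


def f_local(seq):
    # len is resolved once; the loop then reads a local variable.
    z = 0
    local_len = len
    for x in seq:
        if x:
            z += local_len(x)
    return z


data = ["abc"] * 1000

# Both variants must compute the same result.
assert f_global(data) == f_local(data)

# Take the best of a few repeats to reduce noise.
t_global = min(timeit.repeat(lambda: f_global(data), number=200, repeat=3))
t_local = min(timeit.repeat(lambda: f_local(data), number=200, repeat=3))

print("global lookup: %.4f s" % t_global)
print("local lookup:  %.4f s" % t_local)
```

Running dis.dis(f_local) shows which load instruction the loop body actually uses on the interpreter at hand.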
>> A decorator could help with this (by transforming the byte
>> code and localizing the symbols), e.g.
>>
>>     @localize(len)
>>     def f(seq):
>>         z = 0
>>         for x in seq:
>>             if x:
>>                 z += len(x)
>>         return z
>
> FYI https://pypi.python.org/pypi/codetransformer has such a decorator:
> @asconstants(len=len).
Interesting :-)
>> All that said, I don't really believe that this is a high
>> priority feature request. The performance gain is
>> not all that great and only becomes relevant when used
>> in tight loops.
>
> Yeah, in the Python stdlib, the hack is only used for loops.
Right. The only advantage I'd see in having a keyword
to "configure" the behavior is that you could easily
apply the change to a whole module/function without having
to add explicit localizations everywhere.
--
Marc-Andre Lemburg
eGenix.com