[Python-ideas] Specifying constants for functions

Victor Stinner victor.stinner at gmail.com
Tue Oct 27 22:48:41 EDT 2015


Hi,

2015-10-28 2:45 GMT+09:00 Serhiy Storchaka <storchaka at gmail.com>:
> There is known trick to optimize a function:
>
>     def foo(x, y=0, len=len, pack=struct.pack, maxsize=1<<BPF):
>         ...

Yeah, it can show a speedup on a micro-benchmark, but probably not on a
macro-benchmark. As was said in other answers, this hack is also abused
for bad reasons.
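
To make the micro-benchmark claim concrete, here is a minimal sketch that
times both variants with timeit. The function names, the "<I" format and
the data size are my own choices for illustration; the numbers depend on
the Python version and the machine, so no result is shown.

```python
import struct
import timeit


def pack_all_global(values):
    # "struct" is looked up as a global and ".pack" as an attribute
    # on every iteration of the loop.
    out = []
    for v in values:
        out.append(struct.pack("<I", v))
    return out


def pack_all_local(values, _pack=struct.pack):
    # "_pack" is evaluated once, when the function is defined, and is
    # then read as a local variable inside the loop.
    out = []
    for v in values:
        out.append(_pack("<I", v))
    return out


values = list(range(1000))

# Both variants must produce the same result before timing them.
assert pack_all_global(values) == pack_all_local(values)

for func in (pack_all_global, pack_all_local):
    elapsed = timeit.timeit(lambda: func(values), number=200)
    print(f"{func.__name__}: {elapsed:.4f} s")
```

Any difference measured this way only covers the name lookups in a tight
loop, which is why it may not be visible in a larger program.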

This hack is mainly used in the stdlib to keep symbols alive during
Python shutdown so that objects can be cleaned up properly. Just one
example from Lib/subprocess.py of Python 3.6: "def __del__(self,
_maxsize=sys.maxsize):". I guess that the sys.maxsize symbol is removed
or set to None during Python shutdown. So depending on the order in
which modules are cleared (subprocess then sys, or sys then subprocess),
the __del__() method may or may not fail without the
"_maxsize=sys.maxsize" hack.
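
A small sketch of why the default value survives: it is evaluated when
the "def" statement runs and stored on the function object, so it no
longer depends on the module global. The Handle class is hypothetical,
and setting the "sys" global to None only simulates what I guess happens
during shutdown.

```python
import sys


class Handle:
    """Hypothetical resource wrapper, used only for illustration."""

    def __init__(self):
        self.returncode = None

    def __del__(self, _maxsize=sys.maxsize):
        # "_maxsize" was evaluated when the class body ran, so the
        # function object holds its own reference to the value.
        if self.returncode is None:
            self.returncode = _maxsize


h = Handle()

saved_sys = sys
sys = None  # simulate the module global being cleared at shutdown
try:
    # Called explicitly only for the demonstration; real code must not
    # call destructors directly.
    h.__del__()
finally:
    sys = saved_sys

print(h.returncode == sys.maxsize)
```

Without the default parameter, the body would have to read
"sys.maxsize" at call time, which fails with AttributeError once the
"sys" global is None.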

I would appreciate a syntax that does not change the function signature,
even if this hack is mostly used in destructors and destructors must
*not* be called explicitly.
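
To illustrate the signature problem, the sketch below uses
inspect.signature() to show that the "constant" becomes a regular
parameter that any caller can override. The closure-based variant is a
workaround that exists today, not the syntax I am asking for, and the
names _make_foo and foo_closure are made up for this example.

```python
import inspect


def foo(x, y=0, len=len):
    return len(x) + y


# The "constant" is visible as an ordinary parameter.
print(inspect.signature(foo))

# A caller can override it, by accident or on purpose.
assert foo("abc") == 3
assert foo("abc", 0, lambda obj: 10) == 10


def _make_foo(len=len):
    # Workaround: "len" is captured as a free variable of the inner
    # function, so it does not appear in the public signature.
    def foo(x, y=0):
        return len(x) + y
    return foo


foo_closure = _make_foo()
print(inspect.signature(foo_closure))
assert foo_closure("abc") == 3
```

The closure keeps the signature clean, but it costs an extra factory
function and reads the name as a cell variable rather than a plain local.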

> This feature is rather ideologically opposite to Victor's approach.

I disagree, it's not incompatible with my FAT Python project. In some
cases, we may still see a speedup if you combine copying globals to
locals with FAT Python optimizations. My idea is rather to optimize
code without having to modify it manually.

Using FAT Python, you can implement an optimizer producing code like:
---
import builtins

def f(data):
    lengths = []
    for item in data:
        lengths.append(len(item))
    return lengths

def f_copy_globals(data, _len=len):
    lengths = []
    for item in data:
        # add "fast" to ensure that we call the "fast" function
        lengths.append("fast: %s" % _len(item))
    return lengths

i = f.specialize(f_copy_globals)
f.add_dict_guard(i, builtins.__dict__, 'len')
f.add_dict_guard(i, globals(), 'len')

# test specialized function with "fast" _len local symbol
data = ["abc", list(range(5))]
print(f(data))

# test with a mocked len() builtin function
builtins.len = lambda obj: 10
data = ["abc", list(range(5))]
print(f(data))
---

Output:
---
['fast: 3', 'fast: 5']
[10, 10]
---

This optimization doesn't strictly respect Python semantics, because the
len() builtin function can be modified between two loop iterations by a
different Python thread. In some cases, it can be worthwhile to optimize
the function even if it doesn't strictly respect Python semantics. As
was discussed in the FAT Python thread, depending on your use case,
you may or may not allow some optimizations.
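
The difference can be reproduced without threads. In this sketch a
generator replaces builtins.len between two iterations, which stands in
for the other thread; swapping_items is a helper invented for the
demonstration and the "fast" prefix is dropped to compare the values
directly.

```python
import builtins


def f(data):
    lengths = []
    for item in data:
        lengths.append(len(item))  # looked up again on each iteration
    return lengths


def f_copy_globals(data, _len=len):
    lengths = []
    for item in data:
        lengths.append(_len(item))  # bound once, at definition time
    return lengths


def swapping_items(items):
    """Yield items, replacing builtins.len before the second one."""
    original = builtins.len
    try:
        for index, item in enumerate(items):
            if index == 1:
                builtins.len = lambda obj: 10
            yield item
    finally:
        builtins.len = original  # restored when the generator finishes


items = ["abc", "defgh"]
regular = f(swapping_items(items))
specialized = f_copy_globals(swapping_items(items))

print(regular)      # [3, 10]
print(specialized)  # [3, 5]
```

The regular function sees the replaced len() on the second item, while
the version with the local copy keeps calling the original one.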

Note: This example doesn't work with my current implementation of FAT
Python, because f() and f_copy_globals() don't have the same default
values for parameters. You can test with "def f(data, _len=len):". I
have to modify FAT Python to support this example. There is also a bug
if the specialized function uses a free variable but the original
function does not. Again, I should enhance FAT Python to support this
case.

Victor

