[Python-Dev] Release of astoptimizer 0.3
Victor Stinner
victor.stinner at gmail.com
Tue Sep 11 23:47:06 CEST 2012
>> * Call builtin functions if arguments are constants. Examples:
>>
>> - len("abc") => 3
>> - ord("A") => 65
>
> Does this mean transformation print("A") => None and output at compile time?
Only a subset of builtin functions are called at compile time, and not
with any constant value. See astoptimizer/config_builtin_funcs.py.
print() has other optimizations, but it is not called at compile time.
>> * Loop: replace range() with xrange() on Python 2,
>
> range(0, 10**10, 10**9)
Oh. I didn't know this difference between range and xrange. I should
add more checks on arguments.
>> * Evaluate unary and binary operators, subscript and comparaison if all
>> arguments are constants. Examples:
>>
>> - 1 + 2 * 3 => 7
>
> Does this mean 1/0 raises exception at compile time?
Such trap are handled carefully. There are many checks to decide if an
operator can be optimized or not.
Example: "\U0010ffff"[0] is not optimized by default.
>> * Remove dead code. Examples:
>>
>> - def f(): return 1; return 2 => def f(): return 1
>> - if DEBUG: print("debug") => pass with DEBUG declared as False
>> - while 0: ... => pass
>
> As Nick said, it could change the semantics, if there were local variable
> assignment or yield in removed code.
yield is handled correctly, but local variable is a new problem.
I should find something to detect local variables. I already have code
to handle variable, but it not fully enabled because there are known
bugs.
> It would be good if the optimizer checked if the built-in names overridden
> in this module or function and disabled this dangerous optimization in such
> case.
Simple cases like "len=chr" or "from math import floor as int" are now
handled correctly in the new version 0.3.1.
> set([1, 2, 3]) => {1, 2, 3}
I don't see how to create literal set or frozen in AST.
set([...]) may be converted to set((...)) at least.
It would be nice to have a new ast.Lit() in Python 3.4: the idea was
already proposed (with a patch, 0_ast.patch) in the issue #11549.
> set([x for ...]) => {x for ...}
> dict([(k, v) for ...]) => {k: v for ...}
> dict((k, v) for ...) => {k: v for ...}
> ''.join([s for ...]) => ''.join(s for ...)
> a.extend([s for ...]) => a.extend(s for ...)
These optimizations look correct.
> (f(x) for x in a) => map(f, a)
> (x.y for x in a) => map(operator.attrgetter('y'), a)
> (x[0] for x in a) => map(operator.itemgetter(0), a)
> (2 * x for x in a) => map((2).__mul__, a)
> (x in b for x in a) => map(b.__contains__, a)
> map(lambda x: x.strip(), a) => (x.strip() for x in a)
Is it faster? :-)
> x in ['i', 'em', 'cite'] => x in {'i', 'em', 'cite'}
A list can contain non-hashable objects, whereas a set can not.
> x == 'i' or x == 'em' or x == 'cite'] => x in {'i', 'em', 'cite'}
You need to know the type of x. Depending on the type, x.__eq__ and
x.__contains__ may be completly different.
> a = []; for ...: a.append(x) => a = [x for ...]
> a = (); for ...: a.add(x) => a = {x for ...}
> a = {}; for ...: a[k] = v => a = {k: v for ...}
Yeah, the first example is already in the TODO list :-) But it's not
trivial. I prefer to start with unrolling loops :-)
> for ...: f.write(...) => __fwrite = f.write; for ...: __fwrite(...)
f.write lookup cannot be optimized.
> x = x + 1 => x += 1
> x = x + ' ' => x += ' '
I don't know if these optimizations are safe.
> x = x + [y] => x.append(y)
> x = x + [y, z] => x.extend([y, z])
It behaves differently if x is not a list, but str for example.
> 'x=%s' % repr(x) => 'x=%a' % (x,)
I don't understand this one.
> 'x=%s' % x + s => 'x=%s%s' % (x, s)
> x = x + ', [%s]' % y => x = '%s, [%s]' % (x, y)
Doesn't work if s type is not str.
> range(0, x) => range(x)
Is it faster?
> for i in range(len(a)): x = a[i] ... => for i, x in enumerate(a): ...
> i = 0; for x in a: i += 1 ... => for i, x in enumerate(a, 1): ...
> i = 1; for x in a: ... i += 1 => for i, x in enumerate(a, 1): ...
> s = 0; for ...: if ...: s += 1 => s = sum(1 for ... if ...)
These optimizations look interesting.
> while True: s = f.readline(); if not s: break; ... => for s in f: ...
Too much assumptions on f type.
> def f(x): ... len() ... => def f(x, __len = len): ... __len() ...
Ah yes, I read somewhere than locals are faster than builtins. But
this specific example is wrong, f(1, 2) must fail. It is possible to
write something like:
def create_f():
len = len
def f(x):
... len ...
return f
f = create_f(); del f
Victor
More information about the Python-Dev
mailing list