[Python-Dev] Release of astoptimizer 0.3
Serhiy Storchaka
storchaka at gmail.com
Wed Sep 12 09:36:24 CEST 2012
On 12.09.12 00:47, Victor Stinner wrote:
>> set([x for ...]) => {x for ...}
>> dict([(k, v) for ...]) => {k: v for ...}
>> dict((k, v) for ...) => {k: v for ...}
>> ''.join([s for ...]) => ''.join(s for ...)
>> a.extend([s for ...]) => a.extend(s for ...)
>
> These optimizations look correct.
Actually, a generator can be slower than a list comprehension, especially
on Python 2. I think this is an opportunity to optimize the work with
generators.
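A rough way to check this on a given interpreter is to time both forms. The sketch below uses the standard timeit module; the helper names join_list and join_gen and the input size are arbitrary choices for illustration. No timings are quoted, since they depend on the interpreter version and the data.

```python
import timeit

data = [str(i) for i in range(1000)]  # arbitrary sample input


def join_list(items):
    # Builds a temporary list first, then joins it.
    return ''.join([s for s in items])


def join_gen(items):
    # Passes a generator expression straight to join.
    return ''.join(s for s in items)


if __name__ == '__main__':
    for func in (join_list, join_gen):
        seconds = timeit.timeit(lambda: func(data), number=1000)
        print(func.__name__, seconds)
```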
>> (f(x) for x in a) => map(f, a)
>> (x.y for x in a) => map(operator.attrgetter('y'), a)
>> (x[0] for x in a) => map(operator.itemgetter(0), a)
>> (2 * x for x in a) => map((2).__mul__, a)
>> (x in b for x in a) => map(b.__contains__, a)
>> map(lambda x: x.strip(), a) => (x.strip() for x in a)
>
> Is it faster? :-)
Yes, significantly for large sequences. But this transformation is not
safe in the general case. For short sequences a regression is possible
(the cost of the "map" name lookup and the function call).
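One case where the rewrite changes behaviour, as a minimal sketch on Python 3: calling (2).__mul__ directly bypasses the fallback to the right operand's __rmul__, so a float in the sequence yields NotImplemented instead of a product. The sample list is an assumption chosen to trigger this.

```python
a = [1, 2.5]  # sample input mixing int and float

# The generator expression uses the full "*" protocol.
doubled_gen = list(2 * x for x in a)

# The map form calls int.__mul__ directly, which does not handle floats.
doubled_map = list(map((2).__mul__, a))

print(doubled_gen)
print(doubled_map)
```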
>> x in ['i', 'em', 'cite'] => x in {'i', 'em', 'cite'}
>
> A list can contain non-hashable objects, whereas a set can not.
Agreed, it is applicable only if x is proven to be a str. At least the
list can be replaced by a tuple.
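The difference shows up as soon as x is unhashable. A small sketch (the function names are made up for illustration): the list and tuple forms simply return False, while the set form raises TypeError.

```python
def in_list(x):
    return x in ['i', 'em', 'cite']


def in_tuple(x):
    return x in ('i', 'em', 'cite')


def in_set(x):
    return x in {'i', 'em', 'cite'}


unhashable = ['em']  # a list cannot be hashed

print(in_list(unhashable), in_tuple(unhashable))
try:
    in_set(unhashable)
except TypeError as exc:
    print('set lookup failed:', exc)
```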
>> x == 'i' or x == 'em' or x == 'cite' => x in {'i', 'em', 'cite'}
>
> You need to know the type of x. Depending on the type, x.__eq__ and
> x.__contains__ may be completely different.
Then => x in ('i', 'em', 'cite'), and go on to the set form only if x is
obviously of an appropriate type.
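Even the tuple form is not strictly identical to the chain of == tests for arbitrary types, because containment checks identity before equality. A minimal sketch with a NaN value (by_eq and by_in are hypothetical helper names); for plain strings the two forms agree.

```python
def by_eq(x, a, b):
    return x == a or x == b


def by_in(x, a, b):
    # "in" on a tuple tests "x is item or x == item" for each item.
    return x in (a, b)


nan = float('nan')  # nan == nan is False, but nan is nan is True

print(by_eq('em', 'i', 'em'), by_in('em', 'i', 'em'))
print(by_eq(nan, nan, 0.0), by_in(nan, nan, 0.0))
```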
>> for ...: f.write(...) => __fwrite = f.write; for ...: __fwrite(...)
>
> f.write lookup cannot be optimized.
Yes, it is a dangerous transformation and it is difficult to prove its
safety. But name lookup is one of the main things slowing Python down.
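A sketch of why hoisting the lookup is hard to prove safe: if the attribute is rebound while the loop runs, the hoisted version keeps calling the old method. The Switching class is a contrived, hypothetical example built only to show the divergence.

```python
class Switching:
    """Hypothetical object whose write() replaces itself after one call."""

    def __init__(self):
        self.log = []

    def write(self, s):
        self.log.append(('first', s))
        # Rebind the attribute on the instance for later lookups.
        self.write = self._second

    def _second(self, s):
        self.log.append(('second', s))


def write_direct(f, chunks):
    for c in chunks:
        f.write(c)          # attribute looked up on every iteration
    return f.log


def write_hoisted(f, chunks):
    fwrite = f.write        # attribute looked up once
    for c in chunks:
        fwrite(c)
    return f.log


print(write_direct(Switching(), 'abc'))
print(write_hoisted(Switching(), 'abc'))
```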
>> x = x + 1 => x += 1
>> x = x + ' ' => x += ' '
>
> I don't know if these optimizations are safe.
It is safe if x is proven to be a number or a string, for example if x is
a local variable that is initialized with a number/string and modified
only by adding a number/string. Counters and string accumulators are
commonly used.
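For a mutable type the two spellings are observably different, which is why the type has to be known. A minimal sketch with a list (rebind and inplace are hypothetical names): += mutates the object that other names still refer to, while x = x + ... creates a new one.

```python
def rebind(x):
    alias = x
    x = x + [1]     # builds a new list; alias is untouched
    return alias


def inplace(x):
    alias = x
    x += [1]        # extends the existing list; alias sees the change
    return alias


print(rebind([0]))
print(inplace([0]))
```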
>> 'x=%s' % repr(x) => 'x=%a' % (x,)
>
> I don't understand this one.
Sorry, it should be => 'x=%r' % (x,). And for more arguments: 'x[' +
repr(k) + ']=' + repr(v) + ';' => 'x[%r]=%r;' % (k, v). The same applies
to str and ascii (%s and %a).
It is not safe in general (repr can be shadowed).
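A small sketch of the shadowing problem: %r always uses the real repr machinery, whereas a call to repr(...) uses whatever the name is bound to in the current scope. The local repr below is a deliberately artificial assumption.

```python
def plain(x):
    # With the builtin repr, both spellings give the same string.
    return 'x=%s' % repr(x), 'x=%r' % (x,)


def shadowed(x):
    def repr(obj):          # hypothetical local name hiding the builtin
        return '<hidden>'
    return 'x=%s' % repr(x), 'x=%r' % (x,)


print(plain('a'))
print(shadowed('a'))
```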
>> 'x=%s' % x + s => 'x=%s%s' % (x, s)
>> x = x + ', [%s]' % y => x = '%s, [%s]' % (x, y)
>
> Doesn't work if s type is not str.
Yes, this is only partially applicable. In many cases s is a literal or a
newly formatted string.
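To make the objection concrete, a minimal sketch (original and rewritten are hypothetical names): when s is not a str, the original expression raises TypeError on the concatenation, while the merged format string silently converts it.

```python
def original(x, s):
    return 'x=%s' % x + s       # parsed as ('x=%s' % x) + s


def rewritten(x, s):
    return 'x=%s%s' % (x, s)


print(original(1, '!'), rewritten(1, '!'))
try:
    original(1, 2)
except TypeError as exc:
    print('original failed:', exc)
print(rewritten(1, 2))
```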
>> range(0, x) => range(x)
>
> Is it faster?
Slightly.
>> while True: s = f.readline(); if not s: break; ... => for s in f: ...
>
> Too many assumptions about the type of f.
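To illustrate the objection, a minimal sketch: the two loops agree for a real text stream such as io.StringIO, but an object that provides only readline() works with the while loop and fails with the for loop. ReadlineOnly is a hypothetical class written for this example.

```python
import io


def read_with_while(f):
    lines = []
    while True:
        s = f.readline()
        if not s:
            break
        lines.append(s)
    return lines


def read_with_for(f):
    return [s for s in f]


class ReadlineOnly:
    """Hypothetical file-like object that is not iterable."""

    def __init__(self, text):
        self._buf = io.StringIO(text)

    def readline(self):
        return self._buf.readline()


print(read_with_while(io.StringIO('a\nb\n')), read_with_for(io.StringIO('a\nb\n')))
print(read_with_while(ReadlineOnly('a\nb\n')))
try:
    read_with_for(ReadlineOnly('a\nb\n'))
except TypeError as exc:
    print('for loop failed:', exc)
```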
I personally would prefer a 2to3-like "modernizer" (as a separate
utility and as plugins for IDEs), which would find certain patterns and
offer to replace them with a more modern, readable (and possibly more
efficient) variant. The decision on whether the transformation applies
in a particular case remains with the human. For the automatic optimizer
there remain only simple transformations which hurt readability, and
optimizations which cannot be expressed in the source code.