generator expressions: performance anomaly?

Bengt Richter bokr at oz.net
Tue Jan 18 19:55:24 EST 2005


On Tue, 18 Jan 2005 15:29:06 +0100, "Diez B. Roggisch" <deetsNOSPAM at web.de> wrote:

>> I don't see how generating byte code for a = 9; when seeing the
>> expression a = 3 + 6, would be a problem for non-functional
>> languages.
>
>Most probably. But I don't see much code of that type that it would be worth
>optimizing for, either. The cost for re-evaluation such an expression
>doesn't really account for any performance problems you hit - in python, of
>course.
>See this:
>
>deets at kumquat:/usr/lib/python2.3$ python timeit.py -c "[4*5 for i in
>xrange(10000)]"
>100 loops, best of 3: 5.5e+03 usec per loop
>deets at kumquat:/usr/lib/python2.3$ python timeit.py -c "[20 for i in
>xrange(10000)]"
>100 loops, best of 3: 4.3e+03 usec per loop
>
>
>Now of course the longer the expressions get, the more time it costs - but
>how many long arithmetical expression of constant evaluation value do you
>have?
>
>> 
>> I agree that things like [time.time() for i in xrange(10)] shouldn't
>> be pregenerated and that the problem is more complicated as I thought.
>> 
>> But during compilation the compilor could do an anlysis of the code
>> do determine whether there are side effects or not. If the compilor
>> then would store a code in the byte code for functions that are
>> guaranteed side-effect free and only pregenerated objects generated
>> by expressions with no side-effect, some common subexpression
>> elimination could be done even in a non-functional language.
>
>This analysis would only be possible for the most primitive of examples, 
>the reason beeing that due to the dynamic features syntactically equivalent
>expressions can have totally different semantics. So its not really worth
>the effort.

IOM that doesn't mean there might not be an alternative interesting angle. E.g.,
what if we had a yield_constant (instant bf here ;-) as well as a yield.
IOW an explicit promise that the generator would always yield the same thing
for the same input args. This could potentially allow tuple(x for x in foo(1234))
to be pre-calculated, or marked for sticky internal caching or such, if repeat
use benefit could not be determined.  I guess we can write memoizing decorators now,
so how would you e.g., spell memoizing a generator expression? No end to probably
impractical ideas ;-)

IMO it's more a matter of RO[D]I (return on [development] investment) and
restraints on bloat than whether optimizations and other ideas are feasible.

I think practicality vs purity drives actual implementation, but purity issues
drive a lot of interesting, if not too practical, discussions. Of course, if someone
has the energy to become champions of their ideas, and actually implement them
for testing and evaluation, more power to them. I don't think we should beat up
on Antoon for his interest in exploring theoretical possibilities. Sometimes
a spark may fly off and provide some useful illumination. OTOH, interesting is
in the eye of the beholder ;-)

Regards,
Bengt Richter



More information about the Python-list mailing list