iteration without storing a variable

Josh Dukes josh.dukes at microvu.com
Wed Mar 25 15:05:38 EDT 2009


So the Metasploit framework was suffering from some performance issues,
which they fixed: http://www.metasploit.com/blog/

I was interested in comparing this to python. Language comparisons are
not generally very useful for a number of reasons, but some might find
this interesting. 

In this test, at least, python's += on strings is way faster than ruby's:

$ time python -c 'a = "A";
for r in xrange(100000): a += "A" '

real	0m0.109s
user	0m0.100s
sys	0m0.010s

$ time ruby -e 'a = "A"; 100000.times { a += "A" }'

real	0m3.113s
user	0m3.110s
sys	0m0.010s


Add one more zero and the difference is magnified even more:

$ time python -c 'a = "A";     
for r in xrange(1000000): a += "A" '

real	0m1.208s
user	0m0.940s
sys	0m0.270s

$ time ruby -e 'a = "A"; 1000000.times { a += "A" }'

^C

I wasn't patient enough to wait more than 30 seconds for ruby
to finish.

If you use a python list instead of a string (and join at the end) you
get even better performance:

$ time python -c 'a = ["A"];   
for r in xrange(1000000): a.append("A") 
"".join(a)'

real	0m0.889s
user	0m0.870s
sys	0m0.020s

This seems to compare closely to the improved ruby code: 

$ time ruby -e 'a = "A"; 1000000.times { a << "A" }'

real	0m0.920s
user	0m0.920s
sys	0m0.000s

Interestingly enough, this holds even though python takes slightly
longer than ruby to start up:

$ time ruby -e ''

real	0m0.008s
user	0m0.010s
sys	0m0.000s

$ time python -c ''

real	0m0.023s
user	0m0.010s
sys	0m0.020s
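
To take interpreter startup out of the picture, the two python
approaches above can also be timed inside a single process with the
standard timeit module. This is only a rough sketch: it assumes
Python 3 (so range instead of xrange), and the function names and the
iteration count N are made up for illustration. No timings are shown
because they will vary by machine and interpreter version.

```python
import timeit

N = 100000  # illustrative iteration count, not taken from the tests above


def concat_in_place(n=N):
    # Build the string with repeated +=, as in the first python test.
    a = "A"
    for _ in range(n):
        a += "A"
    return a


def append_then_join(n=N):
    # Collect the pieces in a list and join once at the end.
    parts = ["A"]
    for _ in range(n):
        parts.append("A")
    return "".join(parts)


if __name__ == "__main__":
    for func in (concat_in_place, append_then_join):
        # Take the best of three repeats to reduce noise from other processes.
        seconds = min(timeit.repeat(func, number=5, repeat=3))
        print("%-18s %.4f s for 5 calls" % (func.__name__, seconds))
```

Both functions return the same string, so only the way it is built
differs between them.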

Obviously speed isn't everything, and I choose python because it's
cleaner and more readable, but it is interesting. The main question this
raises for me is, "is it possible to more closely replicate the ruby code
and do a loop without variable assignment?"

Of course that was quickly followed by, "does the variable assignment
on each iteration really cause any kind of performance hit?" In
reference to the difference between ruby and python, it seems the
answer is no. I'm guessing that this is because the iteration number
needs to be stored anyway during a loop, so making it available has
essentially zero cost (sound right?).

$ time python -c 'for r in xrange(1000000): pass'

real	0m0.210s
user	0m0.210s
sys	0m0.000s

$ time ruby -e '1000000.times { }'

real	0m0.259s
user	0m0.250s
sys	0m0.000s
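
On the first question: a python for statement always binds its target
name, so the closest equivalent I know of to ruby's n.times is to
iterate over something that does not produce a counter at all, such as
itertools.repeat(None, n), and bind the result to a throwaway name. A
minimal sketch, assuming Python 3; the function names and N are
hypothetical and no claim is made here about which version is faster:

```python
import itertools
import timeit

N = 1000000  # same order of magnitude as the empty-loop test above


def loop_with_range(n=N):
    # Ordinary loop: r is bound to a new integer on every iteration.
    count = 0
    for r in range(n):
        count += 1
    return count


def loop_with_repeat(n=N):
    # repeat(None, n) hands back the same None object n times, so no
    # counter value is exposed; "_" is still bound, but only to None.
    count = 0
    for _ in itertools.repeat(None, n):
        count += 1
    return count


if __name__ == "__main__":
    for func in (loop_with_range, loop_with_repeat):
        seconds = min(timeit.repeat(func, number=1, repeat=3))
        print("%-18s %.4f s" % (func.__name__, seconds))
```

Both loops run the body the same number of times; the only difference
is what the loop target is bound to on each pass.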


Anyone see anything I missed? Any additional info? Anyone get different
results? 

-- 

Josh Dukes
MicroVu IT Department


