code for Computer Language Shootout

Jacob Lee jelee2 at uiuc.edu
Wed Mar 16 01:32:26 EST 2005


On Tue, 15 Mar 2005 22:45:48 -0700, Steven Bethard wrote:

> # table as default argument value so you don't have to do
> # a global lookup each time it's used
> 
> def show(seq, table=string.maketrans('ACBDGHK\nMNSRUTWVY',
>                                       'TGVHCDM\nKNSYAAWBR')
>      seq = seq.upper().translate(table)[::-1]
>      # print string in slices of length 60
>      for i in range(0, len(seq), 60):
>          print seq[i:i+60]
> 
> def main():
>      seq = []
>      # alias methods to avoid repeated lookup
>      join = ''.join
>      append = seq.append
>      for line in sys.stdin:
>          # note only one "line[0]" by using "in" test
>          if line[0] in ';>':
>              # note no need to check if seq is empty; show now prints
>              # nothing for an empty string
>              show(join(seq))
>              print line,
>              del seq[:]
>          else:
>              append(line[:-1])
> 

Wow - that ran about 10 times faster on a 10MB input file. The significant
change was the for loop inside the show function: your method avoids the
increment and if statement and of course has 60x fewer iterations total.

> reversed() won't save you any memory -- you're already loading the 
> entire string into memory anyway.
> 
> 
> Interesting tidbit:
>      del seq[:]
> tests faster than
>      seq = []
> 
> $ python -m timeit -s "lst = range(1000)" "lst = []"
> 10000000 loops, best of 3: 0.159 usec per loop
> 
> $ python -m timeit -s "lst = range(1000)" "del lst[:]"
> 10000000 loops, best of 3: 0.134 usec per loop
> 
> It's probably the right way to go in this case anyway -- no need to 
> create a new empty list each time.

Fascinating - I hadn't thought about that.

Besides redoing that loop, the remaining optimizations produce less than a
10% speed-up; in particular, adding the function aliases increases the
number of lines of code (another benchmark in the shootout), and I would
imagine that the organizers don't really want that type of special
optimization (no developer is going to write that in their programs
unless they have really time-critical code, so this is the sort of hit
that Python really should take in the contest as penalty for being so
dynamically nifty ;)). So here's a tentative contest version of the code:

import sys
import string

def show(seq, table=string.maketrans('ACBDGHK\nMNSRUTWVYacbdghkmnsrutwvy',
                                     'TGVHCDM\nKNSYAAWBRTGVHCDMKNSYAAWBR')):
     seq = seq.translate(table)[::-1]
     for i in range(0, len(seq), 60):
         print seq[i:i+60]

def main():
     seq = []
     for line in sys.stdin:
         if line[0] in ';>':
             show(''.join(seq))
             print line,
             del seq[:]
         else:
             seq.append(line[:-1])
     show(''.join(seq))

main()

-- 
Jacob Lee
jelee2 at uiuc.edu | www.nearestneighbor.net




More information about the Python-list mailing list