I could use some help making this Python code run faster using only Python code.
George Sakkis
george.sakkis at gmail.com
Fri Sep 21 02:46:05 EDT 2007
On Sep 20, 7:13 pm, "mensana... at aol.com" <mensana... at aol.com> wrote:
> On Sep 20, 5:46 pm, Paul Hankin <paul.han... at gmail.com> wrote:
>
> > On Sep 20, 10:59 pm, Python Maniac <raych... at hotmail.com> wrote:
>
> > > I am new to Python; however, I would like some feedback from those who
> > > know more about Python than I do at this time.
>
> > > def scrambleLine(line):
> > >     s = ''
> > >     for c in line:
> > >         s += chr(ord(c) | 0x80)
> > >     return s
>
> > > def descrambleLine(line):
> > >     s = ''
> > >     for c in line:
> > >         s += chr(ord(c) & 0x7f)
> > >     return s
> > > ...
>
> > Well, scrambleLine will remove line-endings, so when you're descrambling
> > you'll be processing the entire file at once. This is particularly bad
> > because of the way your functions work, adding a character at a time to s.
>
> > Probably your easiest bet is to iterate over the file using read(N)
> > for some small N rather than doing a line at a time. Something like:
>
> > process_bytes = (descrambleLine, scrambleLine)[action]
> > while 1:
> >     r = f.read(16)
> >     if not r: break
> >     ff.write(process_bytes(r))
>
> > In general, rather than building strings by starting with an empty
> > string and repeatedly adding to it, you should use ''.join(...)
>
> > For instance...
> > def descrambleLine(line):
> >     return ''.join(chr(ord(c) & 0x7f) for c in line)
>
> > def scrambleLine(line):
> >     return ''.join(chr(ord(c) | 0x80) for c in line)
>
> > It's less code, more readable and faster!
>
> I would have thought that also from what I've heard here.
>
> def scrambleLine(line):
>     s = ''
>     for c in line:
>         s += chr(ord(c) | 0x80)
>     return s
>
> def scrambleLine1(line):
>     return ''.join([chr(ord(c) | 0x80) for c in line])
>
> if __name__=='__main__':
>     from timeit import Timer
>     t = Timer("scrambleLine('abcdefghijklmnopqrstuvwxyz')",
>               "from __main__ import scrambleLine")
>     print t.timeit()
>
> ## scrambleLine
> ## 13.0013366039
> ## 12.9461998318
> ##
> ## scrambleLine1
> ## 14.4514098748
> ## 14.3594400695
>
> How come it's not? Then I noticed you don't have brackets in
> the join statement. So I tried without them and got
>
> ## 17.6010847978
> ## 17.6111472418
>
> Am I doing something wrong?

It has to do with the input string length; try multiplying it by 10 or
100. Below is a more complete benchmark; for largish strings, the imap
version is the fastest among those using the original algorithm. Of
course, using a lookup table as Diez showed is even faster. FWIW, here
are some timings (Python 2.5, WinXP; seconds per 1000 calls on a
2600-character string, best of 3 runs):

scramble:            1.818
scramble_listcomp:   1.492
scramble_gencomp:    1.535
scramble_map:        1.377
scramble_imap:       1.332
scramble_dict:       0.817
scramble_dict_map:   0.419
scramble_dict_imap:  0.410
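
To illustrate the point about input length, here is a minimal sketch,
ported to Python 3 syntax (an assumption; the thread itself uses Python
2.5). The names scramble_loop, scramble_join and compare are made up for
this illustration. It times the += loop against the ''.join version at
1x, 10x and 100x the original 26-character string. Absolute numbers
will vary by machine and interpreter, so none are quoted here.

```python
from timeit import timeit

def scramble_loop(line):
    # Original approach: grow the result one character at a time.
    s = ''
    for c in line:
        s += chr(ord(c) | 0x80)
    return s

def scramble_join(line):
    # Build all the pieces first, then join them in a single call.
    return ''.join([chr(ord(c) | 0x80) for c in line])

def compare(multipliers=(1, 10, 100), number=200):
    """Return {input_length: (loop_seconds, join_seconds)}."""
    base = 'abcdefghijklmnopqrstuvwxyz'
    results = {}
    for m in multipliers:
        line = base * m
        t_loop = timeit(lambda: scramble_loop(line), number=number)
        t_join = timeit(lambda: scramble_join(line), number=number)
        results[len(line)] = (t_loop, t_join)
    return results

if __name__ == '__main__':
    for length, (t_loop, t_join) in sorted(compare().items()):
        print('%5d chars  loop: %.4f  join: %.4f' % (length, t_loop, t_join))
```

Running it at several lengths, rather than on a single 26-character
string, makes it easier to see whether a difference comes from fixed
per-call overhead or from the per-character work.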

And the benchmark script that produced the timings above:

from itertools import imap

def scramble(line):
    s = ''
    for c in line:
        s += chr(ord(c) | 0x80)
    return s

def scramble_listcomp(line):
    return ''.join([chr(ord(c) | 0x80) for c in line])

def scramble_gencomp(line):
    return ''.join(chr(ord(c) | 0x80) for c in line)

def scramble_map(line):
    return ''.join(map(chr, map(0x80.__or__, map(ord, line))))

def scramble_imap(line):
    return ''.join(imap(chr, imap(0x80.__or__, imap(ord, line))))

# Lookup table; xrange(256) so that chr(255) is covered as well.
scramble_table = dict((chr(i), chr(i | 0x80)) for i in xrange(256))

def scramble_dict(line):
    s = ''
    for c in line:
        s += scramble_table[c]
    return s

def scramble_dict_map(line):
    return ''.join(map(scramble_table.__getitem__, line))

def scramble_dict_imap(line):
    return ''.join(imap(scramble_table.__getitem__, line))

if __name__=='__main__':
    funcs = [scramble, scramble_listcomp, scramble_gencomp,
             scramble_map, scramble_imap,
             scramble_dict, scramble_dict_map, scramble_dict_imap]
    s = 'abcdefghijklmnopqrstuvwxyz' * 100
    # Sanity check: every variant must produce the same output.
    assert len(set(f(s) for f in funcs)) == 1
    from timeit import Timer
    setup = "import __main__; line = %r" % s
    for name in (f.__name__ for f in funcs):
        timer = Timer("__main__.%s(line)" % name, setup)
        print '%s:\t%.3f' % (name, min(timer.repeat(3,1000)))

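
For completeness, a lookup table can also be applied in a single call.
The sketch below assumes Python 3 and byte strings, and uses
bytes.translate with a 256-entry table. It is not the code Diez posted,
and the names SCRAMBLE_TABLE, DESCRAMBLE_TABLE, scramble_translate and
descramble_translate are hypothetical, chosen only for this example.

```python
# Position i of each table holds the replacement for byte value i.
SCRAMBLE_TABLE = bytes(i | 0x80 for i in range(256))    # set the high bit
DESCRAMBLE_TABLE = bytes(i & 0x7F for i in range(256))  # clear the high bit

def scramble_translate(data):
    # data is a bytes object; the whole mapping runs inside one C-level call.
    return data.translate(SCRAMBLE_TABLE)

def descramble_translate(data):
    return data.translate(DESCRAMBLE_TABLE)

if __name__ == '__main__':
    original = b'abcdefghijklmnopqrstuvwxyz\n'
    scrambled = scramble_translate(original)
    # Round trip only holds for 7-bit input, as with the original functions.
    assert descramble_translate(scrambled) == original
```

As with the functions in the thread, descrambling only restores the
original for 7-bit input, since the high bit is discarded either way.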
George