for in benchmark interested
Jeremy Hylton
jeremy at cnri.reston.va.us
Thu Apr 15 15:25:37 EDT 1999
The Python version would be faster if you used sys.stdin.read instead
of sys.stdin.readlines. I'm not sure why you need to split the input
into lines before you split it into words; it seems like an
unnecessary step.
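A small sketch of why the line split is redundant: splitting on whitespace with no separator argument already treats newlines as word boundaries. The sample text and variable names are made up for illustration, and the code uses the Python 3 str.split method rather than string.split.

```python
import io

# Hypothetical sample input, standing in for sys.stdin.
text = "one two\nthree  four\n\nfive\n"

# Line by line: readlines() first, then split each line into words.
by_lines = []
for line in io.StringIO(text).readlines():
    by_lines.extend(line.split())

# All at once: split() with no arguments treats newlines as whitespace too,
# so the intermediate list of lines is never needed.
all_at_once = io.StringIO(text).read().split()
```

Both approaches produce the same list of words; the second simply skips building the list of lines.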
The version below is 25% faster on my machine than your fastest Python
version. (And I'm not even an expert Python optimizer :-).
import sys
import string

def run():
    dict={}
    dict_get = dict.get
    read = sys.stdin.read
    string_split = string.split
    while 1:
        buf = read(500000)
        if buf:
            for key in string_split(buf):
                dict[key] = dict_get(key, 0) + 1
        else:
            return dict

dict = run()
write = sys.stdout.write
for word in dict.keys():
    write("%4d\t%s\n" % (dict[word], word))
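
For anyone trying this on Python 3, here is a rough sketch of the same chunked-read idea. It is not the code that was timed above. The names count_words and print_counts are invented for illustration, and collections.Counter stands in for the plain dict. One assumption differs from the version above: a word that straddles a chunk boundary is held back and joined with the next chunk, so it is counted once instead of as two pieces.

```python
import io
import sys
from collections import Counter

def count_words(stream, chunk_size=500000):
    """Count whitespace-separated words, reading the stream in chunks."""
    counts = Counter()
    tail = ""  # possibly incomplete word carried over from the previous chunk
    while True:
        buf = stream.read(chunk_size)
        if not buf:
            break
        buf = tail + buf
        words = buf.split()
        # If the chunk does not end in whitespace, its last word may continue
        # in the next chunk, so hold it back instead of counting it now.
        if words and not buf[-1].isspace():
            tail = words.pop()
        else:
            tail = ""
        counts.update(words)
    if tail:
        # The input ended without trailing whitespace; count the last word.
        counts[tail] += 1
    return counts

def print_counts(counts, out=sys.stdout):
    """Write one 'count<TAB>word' line per word, as in the version above."""
    for word, n in counts.items():
        out.write("%4d\t%s\n" % (n, word))

# A deliberately tiny chunk size forces words to straddle chunk boundaries.
demo = count_words(io.StringIO("the cat the dog"), chunk_size=3)
```

To use it as a filter, call print_counts(count_words(sys.stdin)).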
Jeremy