improved search algorithm
grahamdick77 at gmail.com
Mon May 4 14:15:13 EDT 2009
Hi
I have an Excel file (8000 rows) that is read into Python:
from csv import reader, writer
incsv = reader(open(MY_FILE), dialect='excel')
keys = incsv.next()  # the first row holds the column names
There are mixed datatypes. The last column contains a cumulative frequency that runs in order from 0.0000 to 1.0000 over the 8000 rows.
In a loop of 100,000 iterations I want to take a new random number each time and find the first row whose cumulative frequency value is bigger than it.
Here's my current (pseudo)code:
import random

# pull the remaining 8000 rows into memory so they can be scanned repeatedly
# (the csv reader itself can only be iterated once)
rows = list(incsv)

for _ in range(100000):
    myRand = random.random()
    for line in rows:
        if float(line[-1]) > myRand:
            resline = []
            for item in line:
                # convert each cell to int, then float, else keep the string
                try:
                    i = int(item)
                except ValueError:
                    try:
                        i = float(item)
                    except ValueError:
                        i = item
                resline.append(i)
            # Here we construct a dict of pair values:
            # {'ID': 18, ...
            res = dict(zip(keys, resline))
            break
    else:
        # no row exceeded myRand; skip this draw
        continue
    # do some stuff with res
I'm scanning over every line of the csv to decide which row to select, 100,000 times over, and this is just not very efficient. How can I improve this code? The expensive part is this linear scan:
for line in rows:
    if float(line[-1]) > random.random():
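Since the cumulative-frequency column is already sorted, I'm wondering whether the bisect module is the right tool here. A rough, untested sketch of what I have in mind (variable names are my own, and the int/float conversion of the cells is left out):

import random
from bisect import bisect_right

# the sorted key column, built once from the in-memory rows
cumfreqs = [float(line[-1]) for line in rows]

for _ in range(100000):
    myRand = random.random()
    # index of the first row whose cumulative frequency exceeds myRand
    idx = bisect_right(cumfreqs, myRand)
    res = dict(zip(keys, rows[idx]))
    # do some stuff with res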
I can use numpy etc., whatever works best.
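For example, I gather numpy.searchsorted could do all 100,000 lookups in one vectorised call. Another rough, untested sketch along those lines:

import numpy as np

# same sorted key column as above, but as an array
cumfreqs = np.array([float(line[-1]) for line in rows])

# draw all 100,000 random numbers at once
draws = np.random.random(100000)

# side='right' gives the first index whose value exceeds each draw
indices = np.searchsorted(cumfreqs, draws, side='right')

for idx in indices:
    res = dict(zip(keys, rows[idx]))
    # do some stuff with res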