time consuming loops over lists
Peter Otten
__peter__ at web.de
Tue Jun 7 12:47:00 EDT 2005
querypk at gmail.com wrote:
> Can someone help me improve this block of code? It just converts the
> list of data into tokens based on the range each value falls into, but
> it takes a long time. Can someone tell me what I can change to improve
> it?
>
> def Tkz(tk,data):
>     no_of_bins = 10
>     tkns = []
>     dmax = max(data)+1
>     dmin = min(data)
>     rng = ceil(abs((dmax - dmin)/(no_of_bins*1.0)))
>     rngs = zeros(no_of_bins+1)
>     for i in xrange(no_of_bins+1):
>         rngs[i] = dmin + (rng*i)
>     for i in xrange(len(data)):
>         for j in xrange(len(rngs)-1):
>             if data[i] in xrange(rngs[j],rngs[j+1]):
>                 tkns.append( str(tk)+str(j) )
>     return tkns

Use bisect() to find each value's bin with a binary search instead of
testing every range, e.g. with a slightly modified function signature:

from __future__ import division
import bisect
from math import ceil

def tkz(tk, data, no_of_bins=10):
    dmax = max(data) + 1
    dmin = min(data)
    # width of one bin, rounded up so the bins cover the whole range
    rng = ceil((dmax - dmin)/no_of_bins)
    # upper boundary of each bin
    rngs = [dmin + rng*i for i in xrange(1, no_of_bins+1)]
    # one token per bin, built once instead of once per value
    tokens = [tk + str(i) for i in xrange(no_of_bins)]
    # binary search for the bin each value falls into
    return [tokens[bisect.bisect(rngs, v)] for v in data]

if __name__ == "__main__":
    print tkz("token_", [5, 7, 8, 9, 70, 200])

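
To see how the lookup behaves at the bin edges, here is a minimal sketch with hand-picked boundaries. The boundary values and the helper name bin_index are made up for illustration; only bisect.bisect() comes from the standard library. Each lookup is a binary search over the sorted upper boundaries, so it costs O(log no_of_bins) rather than a scan over every range.

```python
import bisect

# Upper boundaries of three bins: [.., 10), [10, 20), [20, 30)
boundaries = [10, 20, 30]

def bin_index(value):
    # bisect.bisect is bisect_right, so a value equal to a boundary
    # lands in the next bin, matching half-open ranges.
    return bisect.bisect(boundaries, value)

indices = [bin_index(v) for v in [0, 9, 10, 19, 20, 29]]
print(indices)  # [0, 0, 1, 1, 2, 2]
```
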
What are the tokens for, by the way? I'd recommend using the indices
directly if possible.
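
A rough sketch of what returning the indices directly could look like, assuming the tokens only serve as labels for the bins. The name bin_indices is hypothetical, and this variant is written for Python 3 (range instead of xrange, true division by default), so it is not a drop-in replacement for the code above.

```python
import bisect
from math import ceil

def bin_indices(data, no_of_bins=10):
    # Same binning as tkz(), but returns integer bin numbers
    # instead of building token strings.
    dmin = min(data)
    dmax = max(data) + 1
    width = ceil((dmax - dmin) / no_of_bins)
    upper = [dmin + width * i for i in range(1, no_of_bins + 1)]
    return [bisect.bisect(upper, v) for v in data]

print(bin_indices([5, 7, 8, 9, 70, 200]))  # [0, 0, 0, 0, 3, 9]
```

If a token string is still needed somewhere, it can be built on demand from the index, e.g. "token_" + str(index).
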
Peter