[Numpy-discussion] Optimization of loops
Pierre Yger
yger at unic.cnrs-gif.fr
Wed Oct 22 12:21:48 EDT 2008
Hi all,
This is my first mail to the mailing list, and I would like to know if anybody
has a good idea about when to use NumPy rather than plain Python loops.
So here is my problem: I have a large list of tuples (id, time),
id being an integer in [0, ..., N-1] and time a float value.
I want a mysort() function that explodes this list into
N lists of different sizes, each containing the times associated with one
id.
Example:
>>> spikes = [(0, 2.3), (1, 5.6), (3, 2.5), (0, 5.2), (3, 10.2), (2, 16.2)]
>>> mysort(spikes)
should return:
[[2.3, 5.2], [5.6], [16.2], [2.5, 10.2]]
Intuitively, the simplest way to do that is to append elements while going
through all the tuples of the main list (called spikes) to empty lists:
res = [[] for i in xrange(N)]
for id, time in spikes:
    res[id].append(time)
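For reference, the loop above can be wrapped into a self-contained function. This is only a minimal sketch, assuming Python 3 (so range rather than xrange); the name mysort comes from the example, while the n_ids parameter is a hypothetical addition that plays the role of N:

```python
def mysort(spikes, n_ids):
    """Group spike times by id using plain Python lists."""
    # One empty list per id; ids are assumed to lie in 0 .. n_ids - 1.
    res = [[] for _ in range(n_ids)]
    for spike_id, time in spikes:
        res[spike_id].append(time)
    return res


spikes = [(0, 2.3), (1, 5.6), (3, 2.5), (0, 5.2), (3, 10.2), (2, 16.2)]
result = mysort(spikes, 4)
print(result)
```

Each tuple is visited exactly once, so the work grows linearly with the length of the list.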
But this loop seems to be incredibly slow for large lists!
A faster way (after having performed some profiling) seems to be:
import numpy
spikes = numpy.array(spikes)  # Convert the list into a NumPy array
res = []
for id in xrange(N):
    res.append(spikes[spikes[:, 0] == id, 1])  # Use NumPy boolean indexing
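The same idea as a self-contained sketch, assuming Python 3 and an installed NumPy; mysort_masks and n_ids are hypothetical names used only for illustration. Note that every iteration compares the whole id column against one id, so the total cost grows roughly as N times the number of spikes, which may be part of why this is still slow:

```python
import numpy as np


def mysort_masks(spikes, n_ids):
    """Group spike times by id with one boolean mask per id."""
    # Shape (n_spikes, 2): column 0 holds the ids (stored as floats),
    # column 1 holds the times. Assumes a non-empty list of tuples.
    arr = np.array(spikes, dtype=float)
    ids = arr[:, 0]
    times = arr[:, 1]
    # Each comparison scans the full id column once.
    return [times[ids == i] for i in range(n_ids)]


spikes = [(0, 2.3), (1, 5.6), (3, 2.5), (0, 5.2), (3, 10.2), (2, 16.2)]
result = [group.tolist() for group in mysort_masks(spikes, 4)]
print(result)
```

The function returns a list of NumPy arrays; the conversion with tolist() is only there to make the output comparable with the example.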
Nevertheless, this is still rather slow. Does anybody have any idea about a
faster way to do this? Is there a NumPy function that could be used?
Thanks in advance,
Pierre