[Numpy-discussion] Optimization of loops

Charles R Harris charlesr.harris at gmail.com
Thu Oct 23 13:52:16 EDT 2008


On Wed, Oct 22, 2008 at 10:21 AM, Pierre Yger <yger at unic.cnrs-gif.fr> wrote:

> Hi all,
>
> This is my first mail to the mailing list, and I would like to know if
> anybody
> has a great idea about the use or not of Numpy and loops in Python.
>
> So here is my problem : I've a large list of tuple (id, time),
> id being integer between [0, ..., N] and time float values.
> I want to have a mysort() function that will be able to explode this list
> into
> N lists of differents sizes, that will contained the times associated to
> each
> id.
>
> Example:
>
> >> spikes = [(0, 2.3),(1, 5.6),(3, 2.5),(0, 5.2),(3, 10.2),(2, 16.2)]
>
> mysort(spikes)
>
> should return:
>
> [[2.3, 5.2], [5.6], [16.2], [2.5, 10.2]]
>
> Intuitively, the simplest way to do that is to append elements while going
> through all the tuples of the main list (called spikes) to empty lists:
>
> res = [[] for i in xrange(N)]
>
> for id, time in my_list:
>         res[id].append(time)
>
> But this loop seems to be incredibly slow for large lists !
>
> A faster way (after having performed some profiling) seems to do:
> spikes = numpy.array(spikes) # Convert the list into a numpy array
> res = []
> for id in xrange(N):
>         res.append(spikes[spikes[:,0] == id, 1]) # Use Numpy indexes
>
> Nevertheless, this is still rather slow. Does anybody have any idea about a
> faster way to do this ? Is there a Numpy function that could be used ?
>

If you want to stick to lists you can try something like

In [1]: import bisect

In [2]: spikes = [(0, 2.3),(1, 5.6),(3, 2.5),(0, 5.2),(3, 10.2),(2, 16.2)]

In [3]: spikes.sort()

In [4]: ind = [i for i,j in spikes]

In [5]: tim = [j for i,j in spikes]

In [6]: mylist = []

In [7]: for i in xrange(4) :
   ...:     beg = bisect.bisect_left(ind,i)
   ...:     end = bisect.bisect_right(ind,i)
   ...:     mylist.append(tim[beg:end])
   ...:

In [8]: mylist
Out[8]:
[[2.2999999999999998, 5.2000000000000002],
 [5.5999999999999996],
 [16.199999999999999],
 [2.5, 10.199999999999999]]


There is a numpy version of this also which might be faster if your lists
are really huge.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20081023/233bb527/attachment.html>


More information about the NumPy-Discussion mailing list