# [Numpy-discussion] Is there a more efficient way to do this?

Brett Olsen brett.olsen at gmail.com
Wed Aug 8 11:41:05 EDT 2012

```On Wed, Aug 8, 2012 at 9:19 AM, Laszlo Nagy <gandalf at shopzeus.com> wrote:
> Is there a more efficient way to calculate the "slices" array below?
>
> I do not want to make copies of DATA, because it can be huge. The
> argsort is fast enough. I just need to create slices for different
> dimensions. The above code works, but it does a linear time search,
> implemented in pure Python code. For every iteration, Python code is
> executed. For 1 million rows, this is very slow. Is there a way to
> produce "slices" with numpy code? I could write C code for this, but I
> would prefer to do it with mass numpy operations.
>
> Thanks,
>
>    Laszlo

#Code
import numpy as np

#rows between 100 to 1M
rows = 1000
data = np.random.random_integers(0, 100, rows)

def get_slices_slow(data):
o = np.argsort(data)

slices = []
prev_val = None
sidx = -1

for oidx, rowidx in enumerate(o):
val = data[rowidx]
if not val == prev_val:
if prev_val is None:
prev_val = val
sidx = oidx
else:
slices.append((prev_val, sidx, oidx))
sidx = oidx
prev_val = val

if (sidx >= 0) and (sidx < rows):
slices.append((val, sidx, rows))
slices = np.array(slices, dtype=np.int64)
return slices

def get_slices_fast(data):
nums = np.unique(data)
slices = np.zeros((len(nums), 3), dtype=np.int64)
slices[:,0] = nums
count = 0
for i, num in enumerate(nums):
count += (data == num).sum()
slices[i,2] = count
slices[1:,1] = slices[:-1,2]
return slices

def get_slices_faster(data):
nums = np.unique(data)
slices = np.zeros((len(nums), 3), dtype=np.int64)
slices[:,0] = nums
count = np.bincount(data)
slices[:,2] = count.cumsum()
slices[1:,1] = slices[:-1,2]
return slices

#Testing in ipython
In [2]: (get_slices_slow(data) == get_slices_fast(data)).all()
Out[2]: True

In [3]: (get_slices_slow(data) == get_slices_faster(data)).all()
Out[3]: True

In [4]: timeit get_slices_slow(data)
100 loops, best of 3: 3.51 ms per loop

In [5]: timeit get_slices_fast(data)
1000 loops, best of 3: 1.76 ms per loop

In [6]: timeit get_slices_faster(data)
10000 loops, best of 3: 116 us per loop

So using the fast bincount and array indexing methods gets you about a
factor of 30 improvement.  Even just doing the counting in a loop with
good indexing will get you a factor of 2.

~Brett

```