[Numpy-discussion] numpy.random.shuffle

Wed Nov 22 12:48:05 EST 2006

Tim Hochberg wrote:
> Robert Kern wrote:

>> One possibility is to check if the object is an ndarray (or subclass) and use
>> .copy() if so; otherwise, use the current implementation and hope that you
>> didn't pass it a Numeric or numarray array (or some other view-based object).
>>   
> I think I would invert this test and instead check if the object is a 
> Python list and *not* copy in that case. Otherwise, use copy.copy to 
> copy the object whatever it is. This looks like it would be more robust 
> in that it would work in all sensible cases, and just be a tad slower in 
> some of them.

I don't want to assume that the only two sequence types are lists and arrays.
The problem with using copy.copy() on non-arrays is that it, well, makes copies
of the elements. The objects in the shuffled sequence are not the same objects
before and after the shuffling. I consider that to be a violation of the spec.
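
To make the identity concern concrete, here is a minimal sketch. The Item
class and the two swap helpers are hypothetical names used only for
illustration; they are not part of numpy or of the actual shuffle code.

```python
import copy


class Item:
    """Hypothetical element type; any mutable Python object would do."""

    def __init__(self, name):
        self.name = name


def swap_plain(seq, i, j):
    # Ordinary tuple swap: the sequence ends up holding the same objects.
    seq[i], seq[j] = seq[j], seq[i]


def swap_copying(seq, i, j):
    # Swap that copies each element first, as copy.copy() on elements would.
    seq[i], seq[j] = copy.copy(seq[j]), copy.copy(seq[i])


items = [Item("a"), Item("b")]
originals = list(items)  # keep references so identities can be compared

swap_plain(items, 0, 1)
plain_keeps_identity = items[0] is originals[1] and items[1] is originals[0]

swap_copying(items, 0, 1)
copy_keeps_identity = any(new is old for new in items for old in originals)

print(plain_keeps_identity)  # the plain swap only reorders references
print(copy_keeps_identity)   # the copying swap replaces them with new objects
```

After the copying swap the names still match, but code that holds a
reference to one of the original objects no longer sees it in the sequence.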

Views are rare outside of numpy/Numeric/numarray, partially because Guido
considers them to be evil. I'm beginning to see why.
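
For readers who have not run into the view problem, the following sketch
shows what goes wrong when the usual tuple swap is applied to the rows of a
2-D ndarray, and how copying the rows avoids it. The array contents are
arbitrary illustration values.

```python
import numpy as np

# Naive swap: a[0] and a[1] are views into the same buffer, so the first
# assignment overwrites the data that the second assignment reads.
a = np.arange(6).reshape(3, 2)
a[0], a[1] = a[1], a[0]
print(a.tolist())  # row 0 has been lost; rows 0 and 1 are now identical

# Copying the rows first detaches them from the buffer before assignment.
b = np.arange(6).reshape(3, 2)
b[0], b[1] = b[1].copy(), b[0].copy()
print(b.tolist())  # rows 0 and 1 are exchanged as intended
```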

> Another possible refinement / complication would be to special case 1D 
> arrays so that they run fastish.
> 
> A third possibility involves rewriting this in this form:
> 
>     indices = arange(len(x))
>     _shuffle_core(indices) # This just does what current shuffle now does
>     x[:] = take(x, indices, 0)

That's problematic since the elements all turn into numpy scalar objects:

In [1]: from numpy import *

In [2]: a = range(9,-1,-1)

In [3]: idx = arange(len(a))

In [4]: a[:] = take(a, idx, 0)

In [5]: a
Out[5]: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

In [6]: type(a[0])
Out[6]: <type 'numpy.int32'>
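
The same effect can be reproduced on a current Python 3 / numpy install with
a rough sketch like the one below. The function name shuffle_via_indices is
hypothetical, and numpy's Generator.shuffle stands in for _shuffle_core,
which is an assumption. The exact scalar type (int32 or int64) depends on the
platform and numpy version, so the check uses np.integer.

```python
import numpy as np


def shuffle_via_indices(x, rng=None):
    """Hypothetical helper: shuffle x in place by permuting an index array."""
    rng = np.random.default_rng() if rng is None else rng
    indices = np.arange(len(x))
    rng.shuffle(indices)            # stand-in for _shuffle_core (assumption)
    x[:] = np.take(x, indices, 0)   # take() builds an ndarray from x


a = list(range(9, -1, -1))
shuffle_via_indices(a, np.random.default_rng(0))

print(sorted(a) == list(range(10)))   # the values are all still present
print(type(a[0]) is int)              # but they are no longer Python ints
print(isinstance(a[0], np.integer))   # they are numpy scalar objects
```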

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco