[Numpy-discussion] Request for enhancement to numpy.random.shuffle

Warren Weckesser warren.weckesser at gmail.com
Thu Oct 16 11:39:12 EDT 2014


On Sun, Oct 12, 2014 at 9:13 PM, Nathaniel Smith <njs at pobox.com> wrote:

> On Sun, Oct 12, 2014 at 5:14 PM, Sebastian <sebix at sebix.at> wrote:
> >
> > On 2014-10-12 16:54, Warren Weckesser wrote:
> >>
> >>
> >> On Sun, Oct 12, 2014 at 7:57 AM, Robert Kern <robert.kern at gmail.com>
> >> wrote:
> >>
> >>     On Sat, Oct 11, 2014 at 11:51 PM, Warren Weckesser
> >>     <warren.weckesser at gmail.com> wrote:
> >>
> >>     > A small wart in this API is the meaning of
> >>     >
> >>     >   shuffle(a, independent=False, axis=None)
> >>     >
> >>     > It could be argued that the correct behavior is to leave the
> >>     > array unchanged. (The current behavior can be interpreted as
> >>     > shuffling a 1-d sequence of monolithic blobs; the axis argument
> >>     > specifies which axis of the array corresponds to the
> >>     > sequence index.  Then `axis=None` means the argument is
> >>     > a single monolithic blob, so there is nothing to shuffle.)
> >>     > Or an error could be raised.
> >>     >
> >>     > What do you think?
> >>
> >>     It seems to me a perfectly good reason to have two methods instead of
> >>     one. I can't imagine when I wouldn't be using a literal True or False
> >>     for this, so it really should be two different methods.
> >>
> >>
> >>
> >> I agree, and my first inclination was to propose a different method
> >> (and I had the bikeshedding conversation with myself about the name:
> >> "disarrange", "scramble", "disorder", "randomize", "ashuffle", some
> >> other variation of the word "shuffle", ...), but I figured the first
> >> thing folks would say is "Why not just add options to shuffle?"  So,
> >> choose your battles and all that.
> >>
> >> What do other folks think of making a separate method
> > I'm not a fan of more methods with similar functionality in Numpy. It's
> > already hard to keep track of the existing functions and all their
> > possible applications and variants. The axis=None proposal for shuffling
> > all items is very intuitive.
> >
> > I think we don't want to take the path of matlab: a huge number of
> > powerful functions, but few people know what they can really do.
>
> I totally agree with this principle, but I think this is an exception
> to the rule, b/c unfortunately in this case the function that we *do*
> have is weird and inconsistent with how most other functions in numpy
> work. It doesn't vectorize! Cf. 'sort' or how a 'shuffle' gufunc
> (k,)->(k,) would work. Also, it's easy to implement the current
> 'shuffle' in terms of any 1d shuffle function, with no explicit loops,
> whereas Warren's disarrange requires an explicit loop. So, we really
> implemented the wrong one, oops. What this means going forward,
> though, is that our only options are either to implement both
> behaviours with two functions, or else to give up on having the more
> natural behaviour altogether. I think the former is the lesser of two
> evils.
>
> Regarding names: shuffle/permutation is a terrible naming convention
> IMHO and shouldn't be propagated further. We already have a good
> naming convention for inplace-vs-sorted: sort vs. sorted, reverse vs.
> reversed, etc.
>
> So, how about:
>
> scramble + scrambled to shuffle individual entries within each
> row/column/..., as in Warren's suggestion.
>
> shuffle + shuffled to do what shuffle, permutation do now (mnemonic:
> these break a 2d array into a bunch of 1d "cards", and then shuffle
> those cards).
>
> permutation remains indefinitely, with the docstring: "Deprecated alias
> for 'shuffled'."
>
>

That sounds good to me.  (I might go with 'randomize' instead of
'scramble', but that's a second-order decision for the API.)
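
To make the two behaviours concrete, here is a minimal sketch. It is
only an illustration under stated assumptions: `scrambled` is the name
floated above, not an existing NumPy function, and its `axis` and `rng`
parameters are hypothetical. The first half shows what `shuffle` and
`permutation` do today (reorder whole rows); the second half shuffles
the entries within each 1-d slice independently, which needs an
explicit loop when built on the current `shuffle`.

```python
import numpy as np

rng = np.random.RandomState(0)
a = np.arange(12).reshape(3, 4)

# Current behaviour: the array is treated as a sequence of rows
# ("cards"), and only the order of the rows changes.
b = a.copy()
rng.shuffle(b)

# The same thing expressed with a 1-d permutation of row indices,
# with no explicit loop.
c = a[rng.permutation(a.shape[0])]


def scrambled(x, axis, rng):
    """Hypothetical helper: return a copy of x in which every 1-d slice
    along `axis` has been shuffled independently of the others."""
    out = np.array(x, copy=True)
    # Move the target axis to the end; swapaxes returns a view of `out`.
    view = np.swapaxes(out, axis, -1)
    # Explicit loop over all 1-d slices, shuffling each one in place.
    for idx in np.ndindex(view.shape[:-1]):
        rng.shuffle(view[idx])
    return out


s = scrambled(a, axis=1, rng=rng)
```

In `b` and `c` each row still holds the same four values as some row of
`a`; in `s` each row holds the same values as the corresponding row of
`a`, but in an independently chosen order.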

Warren


> -n
>
> --
> Nathaniel J. Smith
> Postdoctoral researcher - Informatics - University of Edinburgh
> http://vorpus.org