[Numpy-discussion] Request for enhancement to numpy.random.shuffle

josef.pktd at gmail.com
Thu Oct 16 21:35:35 EDT 2014


On Thu, Oct 16, 2014 at 3:39 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On Thu, Oct 16, 2014 at 6:30 PM, Warren Weckesser
> <warren.weckesser at gmail.com> wrote:
>>
>>
>> On Thu, Oct 16, 2014 at 12:40 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>>
>>> On Thu, Oct 16, 2014 at 4:39 PM, Warren Weckesser
>>> <warren.weckesser at gmail.com> wrote:
>>> >
>>> > On Sun, Oct 12, 2014 at 9:13 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>> >>
>>> >> Regarding names: shuffle/permutation is a terrible naming convention
>>> >> IMHO and shouldn't be propagated further. We already have a good
>>> >> naming convention for in-place vs. copying: sort vs. sorted, reverse vs.
>>> >> reversed, etc.
>>> >>
>>> >> So, how about:
>>> >>
>>> >> scramble + scrambled to shuffle individual entries within each
>>> >> row/column/..., as in Warren's suggestion.
>>> >>
>>> >> shuffle + shuffled to do what shuffle and permutation do now (mnemonic:
>>> >> these break a 2d array into a bunch of 1d "cards", and then shuffle
>>> >> those cards).
>>> >>
>>> >> permuted remains indefinitely, with the docstring: "Deprecated alias
>>> >> for 'shuffled'."
>>> >
>>> > That sounds good to me.  (I might go with 'randomize' instead of
>>> > 'scramble',
>>> > but that's a second-order decision for the API.)
>>>
>>> I hesitate to use names like "randomize" because they're less
>>> informative than they seem -- if asked what this operation does
>>> to an array, then it would be natural to say "it randomizes the
>>> array". But if told that the random module has a function called
>>> randomize, then that's not very informative -- everything in random
>>> randomizes something somehow.
>>
>> I had some similar concerns (hence my original "disarrange"), but
>> "randomize" seemed more likely to be found when searching or browsing the
>> docs, and while it might be a bit too generic-sounding, it does feel like a
>> natural verb for the process.   On the other hand, "permute" and "permuted"
>> are even more natural and unambiguous.  Any objections to those?  (The
>> existing function is "permutation".)
> [...]
>> By the way, "permutation" has a feature not yet mentioned here: if the
>> argument is an integer 'n', it generates a permutation of arange(n).  In
>> this case, it acts like matlab's "randperm" function.  Unless we replicate
>> that in the new function, we shouldn't deprecate "permutation".
>
> I guess we could do something like:
>
> permutation(n):
>
> Return a random permutation on n items. Equivalent to permuted(arange(n)).
>
> Note: for backwards compatibility, a call like permutation(an_array)
> currently returns the same as shuffled(an_array). (This is *not*
> equivalent to permuted(an_array).) This functionality is deprecated.
>
> OTOH "np.random.permute" as a name does have a downside: someday we'll
> probably add a function called "np.permute" (for applying a given
> permutation in place -- the O(n) algorithm for this is useful and
> tricky), and having two functions with the same name and very
> different semantics would be pretty confusing.
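
For reference, here is a minimal sketch of the behaviours discussed in
the quoted text. It uses only `shuffle` and `permutation`, which exist
in numpy.random today. The helper `scrambled_rows` is a hypothetical
name for illustration only (it is not a NumPy function), and the loop
is just one simple way to get the per-row behaviour from Warren's
suggestion.

```python
import numpy as np

rng = np.random.RandomState(0)
a = np.arange(12).reshape(3, 4)

# permutation(array) returns a copy whose rows are reordered;
# the contents of each row are left intact.
p = rng.permutation(a)

# shuffle does the same reordering of rows, but in place.
b = a.copy()
rng.shuffle(b)

# permutation(n) with an integer n returns a permuted arange(n)
# (the "randperm"-like feature mentioned above).
idx = rng.permutation(5)


def scrambled_rows(x, rng):
    """Hypothetical helper: shuffle the entries within each row independently."""
    out = x.copy()
    for row in out:       # each row is a 1-d view into `out`
        rng.shuffle(row)  # so shuffling the view modifies `out` in place
    return out


s = scrambled_rows(a, rng)
```

In `p` and `b` every original row survives as a unit; in `s` every row
keeps its own set of values but in a random order.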

I like `permute`. That's the one term I'm looking for first.

If np.permute does some kind of deterministic permutation or pivoting,
then I wouldn't find it confusing if np.random.permute does "random"
permutation.
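
To make the deterministic case concrete, here is a rough sketch of
applying a given permutation in place by following its cycles.
`permute_inplace` is a hypothetical name (no `np.permute` exists), the
sketch handles 1-d arrays only, and it uses an O(n) boolean mask to
track visited positions, so it is not necessarily the algorithm
Nathaniel has in mind.

```python
import numpy as np


def permute_inplace(a, perm):
    """Hypothetical sketch: rearrange 1-d `a` in place so it equals the old a[perm]."""
    n = len(a)
    visited = np.zeros(n, dtype=bool)
    for start in range(n):
        if visited[start]:
            continue
        saved = a[start]  # scalar copy of the element that opens this cycle
        i = start
        while True:
            visited[i] = True
            j = perm[i]
            if j == start:
                a[i] = saved  # close the cycle with the saved element
                break
            a[i] = a[j]       # pull the next element of the cycle into place
            i = j


a = np.array([10, 20, 30, 40])
perm = np.array([2, 0, 3, 1])
expected = a[perm]  # fancy indexing returns a new array
permute_inplace(a, perm)
```

Each position is visited once, so the running time is linear in the
length of the array.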

(I definitely don't like scrambled, sounds like eggs or cable TV that
needs to be unscrambled.)

Josef


>
> -n
>
> --
> Nathaniel J. Smith
> Postdoctoral researcher - Informatics - University of Edinburgh
> http://vorpus.org


