[Numpy-discussion] Request for enhancement to numpy.random.shuffle
josef.pktd at gmail.com
Thu Oct 16 21:35:35 EDT 2014
On Thu, Oct 16, 2014 at 3:39 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On Thu, Oct 16, 2014 at 6:30 PM, Warren Weckesser
> <warren.weckesser at gmail.com> wrote:
>> On Thu, Oct 16, 2014 at 12:40 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>> On Thu, Oct 16, 2014 at 4:39 PM, Warren Weckesser
>>> <warren.weckesser at gmail.com> wrote:
>>> > On Sun, Oct 12, 2014 at 9:13 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>> >> Regarding names: shuffle/permutation is a terrible naming convention
>>> >> IMHO and shouldn't be propagated further. We already have a good
>>> >> naming convention for in-place vs. copy: sort vs. sorted, reverse vs.
>>> >> reversed, etc.
>>> >> So, how about:
>>> >> scramble + scrambled shuffle individual entries within each
>>> >> row/column/..., as in Warren's suggestion.
>>> >> shuffle + shuffled to do what shuffle, permutation do now (mnemonic:
>>> >> these break a 2d array into a bunch of 1d "cards", and then shuffle
>>> >> those cards).
>>> >> permutation remains indefinitely, with the docstring: "Deprecated alias
>>> >> for 'shuffled'."
>>> > That sounds good to me. (I might go with 'randomize' instead of
>>> > 'scramble',
>>> > but that's a second-order decision for the API.)
>>> I hesitate to use names like "randomize" because they're less
>>> informative than they seem -- if asked what this operation does
>>> to an array, then it would be natural to say "it randomizes the
>>> array". But if told that the random module has a function called
>>> randomize, then that's not very informative -- everything in random
>>> randomizes something somehow.
>> I had some similar concerns (hence my original "disarrange"), but
>> "randomize" seemed more likely to be found when searching or browsing the
>> docs, and while it might be a bit too generic-sounding, it does feel like a
>> natural verb for the process. On the other hand, "permute" and "permuted"
>> are even more natural and unambiguous. Any objections to those? (The
>> existing function is "permutation".)
>> By the way, "permutation" has a feature not yet mentioned here: if the
>> argument is an integer 'n', it generates a permutation of arange(n). In
>> this case, it acts like matlab's "randperm" function. Unless we replicate
>> that in the new function, we shouldn't deprecate "permutation".
> I guess we could do something like:
> Return a random permutation on n items. Equivalent to permuted(arange(n)).
> Note: for backwards compatibility, a call like permutation(an_array)
> currently returns the same as shuffled(an_array). (This is *not*
> equivalent to permuted(an_array).) This functionality is deprecated.
> OTOH "np.random.permute" as a name does have a downside: someday we'll
> probably add a function called "np.permute" (for applying a given
> permutation in place -- the O(n) algorithm for this is useful and
> tricky), and having two functions with the same name and very
> different semantics would be pretty confusing.
I like `permute`. That's the one term I'm looking for first.
If np.permute does some kind of deterministic permutation or pivoting,
then I wouldn't find it confusing if np.random.permute does a "random"
permutation.
(I definitely don't like scrambled, sounds like eggs or cable TV that
needs to be unscrambled.)
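
For anyone following the naming debate, here is a minimal sketch of the three behaviours discussed in this thread. The first two use the existing np.random.permutation. The third is a hypothetical helper (permuted_within_rows is not a NumPy function) that stands in for the proposed "scramble"/"permuted" operation, assuming it means an independent permutation of the entries in each row.

```python
import numpy as np

rng = np.random.RandomState(0)  # seeded only so the sketch is repeatable

a = np.arange(12).reshape(3, 4)

# Existing behaviour: permutation(an_array) returns a copy whose rows
# (slices along the first axis) are reordered; each row stays intact.
shuffled_rows = rng.permutation(a)

# Existing behaviour: permutation(n) with an integer n returns a random
# ordering of arange(n), similar to Matlab's randperm.
perm_of_5 = rng.permutation(5)


def permuted_within_rows(arr, rng):
    """Hypothetical helper: independently permute the entries of each row."""
    # Sorting independent random keys per row gives one random
    # ordering of the column indices for every row.
    keys = rng.random_sample(arr.shape)
    order = np.argsort(keys, axis=1)
    return np.take_along_axis(arr, order, axis=1)


scrambled = permuted_within_rows(a, rng)

print(shuffled_rows)
print(perm_of_5)
print(scrambled)
```

The argsort-of-random-keys trick is just one simple way to get a separate permutation per row; it is an assumption of this sketch, not something proposed in the thread.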
> Nathaniel J. Smith
> Postdoctoral researcher - Informatics - University of Edinburgh