[Numpy-discussion] Request for enhancement to numpy.random.shuffle

josef.pktd at gmail.com josef.pktd at gmail.com
Fri Oct 17 09:04:51 EDT 2014


On Thu, Oct 16, 2014 at 10:50 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On Fri, Oct 17, 2014 at 2:35 AM,  <josef.pktd at gmail.com> wrote:
>> On Thu, Oct 16, 2014 at 3:39 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>> On Thu, Oct 16, 2014 at 6:30 PM, Warren Weckesser
>>> <warren.weckesser at gmail.com> wrote:
>>>>
>>>>
>>>> On Thu, Oct 16, 2014 at 12:40 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>>>>
>>>>> On Thu, Oct 16, 2014 at 4:39 PM, Warren Weckesser
>>>>> <warren.weckesser at gmail.com> wrote:
>>>>> >
>>>>> > On Sun, Oct 12, 2014 at 9:13 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>>>> >>
>>>>> >> Regarding names: shuffle/permutation is a terrible naming convention
>>>>> >> IMHO and shouldn't be propagated further. We already have a good
>>>>> >> naming convention for in-place vs. copy: sort vs. sorted, reverse vs.
>>>>> >> reversed, etc.
>>>>> >>
>>>>> >> So, how about:
>>>>> >>
>>>>> >> scramble + scrambled to shuffle individual entries within each
>>>>> >> row/column/..., as in Warren's suggestion.
>>>>> >>
>>>>> >> shuffle + shuffled to do what shuffle and permutation do now (mnemonic:
>>>>> >> these break a 2d array into a bunch of 1d "cards", and then shuffle
>>>>> >> those cards).
>>>>> >>
>>>>> >> permuted remains indefinitely, with the docstring: "Deprecated alias
>>>>> >> for 'shuffled'."
>>>>> >
>>>>> > That sounds good to me.  (I might go with 'randomize' instead of
>>>>> > 'scramble',
>>>>> > but that's a second-order decision for the API.)
>>>>>
>>>>> I hesitate to use names like "randomize" because they're less
>>>>> informative than they seem -- if asked what this operation does
>>>>> to an array, then it would be natural to say "it randomizes the
>>>>> array". But if told that the random module has a function called
>>>>> randomize, then that's not very informative -- everything in random
>>>>> randomizes something somehow.
>>>>
>>>> I had some similar concerns (hence my original "disarrange"), but
>>>> "randomize" seemed more likely to be found when searching or browsing the
>>>> docs, and while it might be a bit too generic-sounding, it does feel like a
>>>> natural verb for the process.   On the other hand, "permute" and "permuted"
>>>> are even more natural and unambiguous.  Any objections to those?  (The
>>>> existing function is "permutation".)
>>> [...]
>>>> By the way, "permutation" has a feature not yet mentioned here: if the
>>>> argument is an integer 'n', it generates a permutation of arange(n).  In
>>>> this case, it acts like matlab's "randperm" function.  Unless we replicate
>>>> that in the new function, we shouldn't deprecate "permutation".
>>>
>>> I guess we could do something like:
>>>
>>> permutation(n):
>>>
>>> Return a random permutation on n items. Equivalent to permuted(arange(n)).
>>>
>>> Note: for backwards compatibility, a call like permutation(an_array)
>>> currently returns the same as shuffled(an_array). (This is *not*
>>> equivalent to permuted(an_array).) This functionality is deprecated.
>>>
>>> OTOH "np.random.permute" as a name does have a downside: someday we'll
>>> probably add a function called "np.permute" (for applying a given
>>> permutation in place -- the O(n) algorithm for this is useful and
>>> tricky), and having two functions with the same name and very
>>> different semantics would be pretty confusing.
>>
>> I like `permute`. That's the one term I'm looking for first.
>>
>> If np.permute does some kind of deterministic permutation or pivoting,
>> then I wouldn't find it confusing if np.random.permute does "random"
>> permutation.
>
> Yeah, but:
>
> from ... import permute
> # 500 lines later
> def foo(...):
>     permute(...)  # what the heck is this
>
> It definitely *can* be confusing; basically everything else in
> np.random has a name that suggests randomness even without seeing the
> full path.

I usually (almost always) avoid importing names from np.random into the
module namespace and write the full path instead:

np.random.xxx

Otherwise you end up with:

from numpy.random import power
power(...)

>>> power(5, 3)
array([ 0.93771162,  0.96180884,  0.80191961])

???

The same goes for f, beta, gamma, ..., and even bytes:

>>> bytes(10)
'\xa3\xf0%\x88\x11\xda\x0e\x81\x0c\x8e'
>>> bytes(5)
'\xb0B\x8e\xa1\x80'
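
To spell out the ambiguity side by side, here is a minimal sketch. It
assumes Python 3 and a NumPy that still exposes the legacy
np.random.RandomState interface; the variable names are only illustrative
and not part of any proposal in this thread.

```python
import builtins

import numpy as np

rng = np.random.RandomState(0)  # seeded only to make the sketch repeatable

# Same name, unrelated meaning: elementwise power vs. the power distribution.
elementwise = np.power(5, 3)   # 5 ** 3
draws = rng.power(5, 3)        # 3 samples in [0, 1] with shape parameter a=5

# Same story for bytes: the builtin builds zero bytes,
# the random version draws random ones.
zero_bytes = builtins.bytes(10)  # ten zero bytes
random_bytes = rng.bytes(10)     # ten random bytes

print(elementwise, draws.shape, len(zero_bytes), len(random_bytes))
```

With the qualified spelling (np.random.power, np.random.bytes) the reader
never has to guess which of the two is meant.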


>
> It's not a huge deal, though.
>
>> (I definitely don't like scrambled, sounds like eggs or cable TV that
>> needs to be unscrambled.)
>
> I vote that in this kind of bikeshed we try to restrict ourselves to
> arguments that we can at least pretend are motivated by some
> technical/UX concern ;-). (I guess unscrambling eggs would be
> technically impressive tho ;-))

Ignoring the eggs, it still sounds like a cheap encryption and is a
word I would never look for when looking for something to implement a
permutation test.
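
For concreteness, this is roughly the use case I have in mind: a minimal
sketch of a two-sample permutation test for a difference in means. The
function name permutation_test and its parameters are made up for this
example; the only NumPy call it relies on for the randomness is the
existing RandomState.permutation, which returns a shuffled copy.

```python
import numpy as np


def permutation_test(x, y, n_resamples=2000, seed=0):
    """Two-sided permutation test for a difference in means (illustrative)."""
    rng = np.random.RandomState(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pooled = np.concatenate([x, y])
    observed = abs(x.mean() - y.mean())

    count = 0
    for _ in range(n_resamples):
        # Randomly reassign the pooled observations to the two groups.
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:len(x)].mean() - shuffled[len(x):].mean())
        if diff >= observed:
            count += 1

    # The add-one correction keeps the estimated p-value above zero.
    return (count + 1.0) / (n_resamples + 1.0)


p_same = permutation_test([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
p_diff = permutation_test([1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0])
print(p_same, p_diff)
```

The word I would search the docs for when writing this is "permutation"
or "permute", which is the point about discoverability.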

Josef


>
> --
> Nathaniel J. Smith
> Postdoctoral researcher - Informatics - University of Edinburgh
> http://vorpus.org


