Adding a "shuffle=True" kwarg to numpy.random.Generator.choice
In PR https://github.com/numpy/numpy/pull/13812, Thrasibule rewrote the algorithm to add a faster alternative branch for some cases. The faster algorithm does not necessarily shuffle the results, so for instance gen.choice(2000, 2000, replace=False) may simply return arange(2000). In the old code the result is always shuffled. We propose adding a new kwarg "shuffle" that defaults to True. Users looking for maximum performance may choose to use shuffle=False.

Since this is a behavioral change (although only in the new Generator class; the new code will not be used in RandomState), we are proposing it to the mailing list.

Any thoughts?

Matti
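For readers who want to try the proposal, here is a minimal sketch of how the kwarg would be used. It assumes a NumPy version in which Generator.choice accepts shuffle, as proposed above. The seed and the variable names are illustrative only. The message says the unshuffled result "may" equal arange(2000), so the sketch only checks what should hold either way: both results are permutations of the same values.

```python
import numpy as np

# Assumption: this NumPy build has Generator.choice with the "shuffle" kwarg.
gen = np.random.default_rng(12345)  # arbitrary seed, for repeatability only

n = 2000

# Default (shuffle=True): the order of the sample is randomized.
shuffled = gen.choice(n, n, replace=False)

# shuffle=False: the order is whatever the faster branch produces,
# which may or may not look sorted.
unshuffled = gen.choice(n, n, replace=False, shuffle=False)

# Either way, sampling all n items without replacement yields each value once.
same_values = np.array_equal(np.sort(shuffled), np.sort(unshuffled))
print(same_values)
```

If the order of the sample does not matter downstream (for example, when the result is only used for membership tests), shuffle=False is the option to reach for.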
This sounds like a welcome backwards-compatible option for more performance. I imagine there are plenty of applications (e.g., sets) where shuffled order doesn't matter. +1 from me.

On Tue, Jul 9, 2019 at 5:32 PM Matti Picus <matti.picus@gmail.com> wrote:
In PR https://github.com/numpy/numpy/pull/13812, Thrasibule rewrote the algorithm used with a faster alternative branch for some cases. The faster algorithm does not necessarily shuffle the results, so for instance gen.choice(2000, 2000, replace=False) may simply return arange(2000). In the old code the result is always shuffled. We propose adding a new kwarg "shuffle" that defaults to True. Users looking for maximum performance may choose to use shuffle=False.
Since this is a behavioral change (although only in the new Generator class; the new code will not be used in RandomState), we are proposing it to the mailing list.
Any thoughts?
Matti
Participants (2): Matti Picus, Stephan Hoyer