[Python-ideas] Shuffled
Tim Peters
tim.peters at gmail.com
Thu Sep 8 13:05:56 EDT 2016
[Arek Bulski <arek.bulski at gmail.com>]
> I don't randomize test order. People should stop assuming that they know
> better.
They should, but, to be fair, you've left them little choice because
you've been so telegraphic about why you think this is a compelling
idea. Others are trying to fill in the blanks, and can only guess.
> I need to randomize some arguments for one particular test and I
> cannot call shuffle between tests. It's a continuous list of declarative
> tests. Needs to be shuffled().
I'm convinced you want to do that. Fine. The question is whether
this is a _common enough_ need to justify all the ongoing costs of
building it into a standard library. But in the only actual code you
posted a link to, you don't even do it yourself. That just _enhances_
the perception that it's rarely needed.
A question: do you know why `product()` isn't built in? Like
product([2, 3, 4]) == 24? `sum()` is built in. So are `any()` and
`all()` (other kinds of specialized reduction operations). Why does
Python pick on multiplication as "unworthy" of the same kind of
shortcut?
The objections to product() were much the same as the ones being
raised against shuffled(): sure, it would be useful at times in some
contexts, but it's _rarely_ needed. It's also easy to write your own
in the seemingly rare cases it is useful.
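To make "easy to write your own" concrete, here is a minimal sketch of such a `product()`. The name, the `start` parameter, and the choice of `functools.reduce` with `operator.mul` are illustrative assumptions, not anything that was proposed for the standard library:

```python
from functools import reduce
import operator


def product(xs, start=1):
    """Multiply the items of xs together.

    `start` is a hypothetical parameter mirroring sum()'s; it is what
    an empty iterable returns.
    """
    return reduce(operator.mul, xs, start)


print(product([2, 3, 4]))  # 24
```

An empty iterable yields `start`, the multiplicative identity by default, which parallels `sum([]) == 0`.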
> https://github.com/construct/construct/blob/master/tests/test_all.py
> See? No way to put imperative code between tests.
But there's also no example of anything in that code that requires
shuffling. "But shuffled() isn't in the std lib!" isn't relevant: if
you really _needed_ shuffling in test_all.py, you already know how to
write such a function, and you would have done so already. "Use cases"
are about actual needs, not speculating about "maybe nice to have some
day".
>> sample(container, len(container))
> That won't work because I would have to type the expression that is used as
> argument twice in a test. I need shuffled. Enough said.
    shuffled = lambda container: random.sample(container, len(container))
Now you have _a_ way to supply a `shuffled()` function (although I'd
strongly recommend writing it in an obvious way building on
random.shuffle()). But to justify becoming part of the standard
distribution, it's not enough that _you_ "need" it: lots of people
have to find it attractive. That's the same bar `sum()`, `any()`,
`all()`, `sorted()` and `reversed()` (for examples) had to leap over.
They were all popular ideas at the times they were proposed, and they
indeed went on to become widely used in all kinds of code.
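As a rough illustration of how the one-liner answers the "type the expression twice" objection, the sketch below uses it inside a declarative list. The `CASES` table and its label are hypothetical stand-ins, not taken from test_all.py, and the argument is assumed to be a sequence such as a list:

```python
import random

# The sample()-based one-liner: returns a new list and leaves the
# argument untouched.
shuffled = lambda container: random.sample(container, len(container))

# Hypothetical declarative table. The expression being shuffled is
# written once, with no imperative code between entries.
CASES = [
    ("shuffled digits", shuffled(list(range(10)))),
]

label, data = CASES[0]
print(label, sorted(data) == list(range(10)))  # shuffled digits True
```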
I understand that you're proposing to make `shuffled()` a `Random`
method instead of a builtin function, but that doesn't lower the bar
much. I expect most people would say there's already "too much"
rarely used stuff in Random.
> I don't have a use case for half of what the std library offers. Or for type
> annotations. Asynchronous comprehensions, what is that? Do you see me
> rejecting those?
I do not. Do you reject `product()`? You should ;-)
But it's not about just you, and there's nothing personal about it.
The examples you listed have many strong advocates, and "many" and
"strong" both played roles in their acceptance. In contrast, the
constituency for "shuffled()" appears small.
>> (sample having default of entire list size)
> That would work but would not be pretty. shuffled() is self explanatory and
> has a nice ring to it. Randomized list is not a sample by definition.
sample() in this context uses a different spelling of the same
algorithm shuffle() uses, but wasn't _intended_ for shuffling. Using
sample() for this is "a trick", and I wouldn't use it either. But I
would use:
    import random

    def shuffled(xs):
        xs = list(xs)       # copy, so the caller's data is left alone
        random.shuffle(xs)  # shuffle the copy in place
        return xs
I don't care that there's an implicit dependence on a shared Random
instance, and neither will 99+% of other users. The few who bother
with creating their own Random instances can write their own
`shuffled()` to use them instead. Assuming the intersection of two
small sets isn't empty ;-)
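For the few who do create their own Random instances, one possible variant is sketched below. The optional `rng` parameter is a hypothetical addition for illustration only; nothing like it was specified in this thread:

```python
import random


def shuffled(xs, rng=None):
    """Return a new shuffled list built from the iterable xs.

    `rng` is a hypothetical optional random.Random instance; when it
    is omitted, the module's shared generator is used.
    """
    xs = list(xs)  # copy, so the caller's data is left alone
    if rng is None:
        random.shuffle(xs)
    else:
        rng.shuffle(xs)
    return xs


# Two generators seeded alike produce the same ordering.
a = shuffled(range(10), random.Random(2016))
b = shuffled(range(10), random.Random(2016))
print(a == b)  # True
```

Passing a seeded instance gives a reproducible ordering without touching the shared generator's state.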