On Sat, May 2, 2020 at 3:50 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Apr 30, 2020 at 07:58:16AM -0700, Christopher Barker wrote:
Imagine someone that uses zip() in code that works for a while, and then discovers a bug triggered by unequal length inputs.
If it’s a flag, they look at the zip docstring, and find the flag, and their problem is solved.
Their problem is not solved. All they have is an exception. Now what are they going to do with it?
I *think* Christopher was saying they have a logical bug which existed silently and led to some confusing debugging, and they'd like to be notified of the unequal lengths in the future. So they want to find the strict feature (whatever the API may be) which they've either guessed might exist or vaguely remember seeing before. In that case the zip docstring is likely the first place they'd look. If what he meant was that the flag raised an exception, then to answer your question "what are they going to do with it?", they should either fix the bug that led to malformed inputs or remove the flag if they realise unequal lengths aren't such a problem in this case.
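To make the "existed silently" part concrete, here is a small illustration using only the built-in zip(). The data and the names `names` and `scores` are made up for the example.

```python
# Hypothetical data: the producer accidentally dropped one score.
names = ["alice", "bob", "carol"]
scores = [90, 85]

# zip() stops at the shortest input, so "carol" vanishes without any error.
table = dict(zip(names, scores))
print(table)

# The problem only shows up later, far away from the real cause.
print("carol" in table)
```

Nothing here raises, which is exactly why the debugging gets confusing.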
This is why I am still unconvinced that this functionality is anywhere near as useful as the proponents seem to think. Brandt has found one good example of a parsing bug in the ast library, but if he has shown how this zip_strict function will solve the bug, I haven't seen it.
[The bug](https://bugs.python.org/issue40355) is titled "The ast module fails to reject certain malformed nodes". The function would cause the nodes to be rejected with an exception.
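As a rough sketch of what "rejected with an exception" could look like, here is a pure-Python strict zip built on itertools.zip_longest. This is an assumption for illustration only: the name `zip_strict` is borrowed from this thread, the choice of ValueError is mine, and it is not Brandt's implementation nor the actual ast fix.

```python
from itertools import zip_longest


def zip_strict(*iterables):
    """Like zip(), but raise ValueError if the inputs have unequal lengths.

    Hypothetical helper for illustration only.
    """
    sentinel = object()  # unique marker that cannot appear in user data
    for items in zip_longest(*iterables, fillvalue=sentinel):
        # A sentinel in the tuple means at least one input ran out early.
        if any(item is sentinel for item in items):
            raise ValueError("zip_strict() arguments have unequal lengths")
        yield items


# Hypothetical malformed node: three keys but only two values.
keys = ["a", "b", "c"]
values = [1, 2]

try:
    pairs = list(zip_strict(keys, values))
except ValueError as exc:
    print("rejected:", exc)
```

Note that the check is lazy: the exception is only raised once iteration reaches the point where the lengths diverge.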
In any case, even giving Brandt the benefit of the doubt that this will solve the ast bug, it's hard for me to generalise from that. If I'm expecting equal length inputs, and don't get them, what am I supposed to do with the exception as the consumer of the inputs?
As the consumer of the inputs, I can pass the buck to the producer, make it their responsibility, and merely promise to truncate the inputs if they're not the same length. Otherwise, what do I do once I've caught the exception?
I would say that in pretty much all cases you wouldn't catch the exception. It's the producer's responsibility to produce correct inputs, and if they don't, tell them that they failed in their responsibility. The underlying core principle is that programs should fail loudly when users make mistakes to help them find those mistakes. I'm strongly reminded of when I was advocating for a warning/exception when iterating directly over a string and some people here didn't understand what the point was. Do some people not agree with this core principle?
The most common use for this I have seen in the discussion is:
"I have generated two inputs which I expect are equal, and I'd like to be notified if they aren't"
If there's a different use case, I'm not aware of it. Can someone share?
which to me is an assertion about program correctness. So this ought to be an assert that gets disabled under -O, not a raise that the caller might catch.
That's a pretty decent idea. But are there any other examples in the standard library of functions behaving differently under -O? I think if you want that kind of balance between performance and robustness, your best option is zip(x, y, strict=__debug__). Nice and explicit.
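Here is a minimal sketch of how that call could behave, assuming a keyword-only `strict` flag. Since the API is still under discussion, the sketch uses a hypothetical stand-in called `zip_checked` rather than the built-in, and the ValueError is my assumption.

```python
from itertools import zip_longest


def _strict_pairs(iterables):
    # Generator that raises as soon as one input runs out before the others.
    sentinel = object()
    for items in zip_longest(*iterables, fillvalue=sentinel):
        if any(item is sentinel for item in items):
            raise ValueError("arguments have unequal lengths")
        yield items


def zip_checked(*iterables, strict=False):
    """Hypothetical stand-in for a zip() that accepts a strict flag."""
    if strict:
        return _strict_pairs(iterables)
    return zip(*iterables)  # ordinary truncating behaviour


x = [1, 2, 3]
y = ["a", "b", "c"]

# __debug__ is True in a normal run and False under "python -O",
# so the length check is only paid for when not optimising.
pairs = list(zip_checked(x, y, strict=__debug__))
print(pairs)
```

The caller chooses the trade-off explicitly at the call site, instead of the function changing its behaviour behind their back.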
So this suggests *two* new functions:
- zip_equal for Brandt's parsing bug use-case, guaranteed to raise
- zip_assert_equal for the more common use case of checking program correctness, and disabled under -O
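For concreteness, the two functions might look roughly like this. The names come from the list above, but the bodies and the exception types (ValueError and AssertionError) are only my guess at what is meant.

```python
from itertools import zip_longest


def zip_equal(*iterables):
    """Always raise ValueError on unequal lengths (sketch)."""
    sentinel = object()
    for items in zip_longest(*iterables, fillvalue=sentinel):
        if any(item is sentinel for item in items):
            raise ValueError("zip_equal() arguments have unequal lengths")
        yield items


def zip_assert_equal(*iterables):
    """Check lengths like an assert: skipped entirely under -O (sketch)."""
    if not __debug__:
        # Running with -O: fall back to plain truncating zip().
        yield from zip(*iterables)
        return
    try:
        yield from zip_equal(*iterables)
    except ValueError as exc:
        # Report it the way a failed assert statement would.
        raise AssertionError(str(exc)) from None


try:
    list(zip_assert_equal([1, 2, 3], [4, 5]))
except AssertionError as exc:
    print("assertion failed:", exc)
```

In a normal run the last call fails; under -O it would quietly truncate instead.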
Again, I think Brandt's case is still just about checking program correctness.
If it’s in itertools, they have to think to look there.
And this is a problem, why? Should *everything* be a builtin?
Heaven forbid that somebody has to read the docs and learn about modules, let's have one giant global namespace with everything in it! Because that's good for the beginners! (Not.)
The problem is not that they have to look there, it's that they have to *think to look there*. itertools might not occur to them. They might not even know it exists. Note that adding a flag is essentially adding to the (empty) namespace that is zip's named arguments. Adding a new function is adding to a much larger namespace, probably itertools.