That was a good email, Alex. Besides the relevant examples, you've put into words things that I wanted to say without realizing it. Good job :)

On Sat, May 2, 2020 at 4:00 PM Alex Hall <alex.mojaki@gmail.com> wrote:
On Sat, May 2, 2020 at 1:19 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, May 02, 2020 at 09:54:46AM +0200, Alex Hall wrote:

> I would say that in pretty much all cases you wouldn't catch the exception.
> It's the producer's responsibility to produce correct inputs, and if they
> don't, tell them that they failed in their responsibility.
>
> The underlying core principle is that programs should fail loudly when
> users make mistakes to help them find those mistakes.

Maybe. It depends on whether it is a meaningful mistake, and the cost of
the loud failure versus the usefulness of silent truncation.

I'm not sure what the point of this long spiel about floats and str.upper was. No one thinks that zip should always be strict. The feature would be optional and let people choose conveniently between loud failure and silent truncation.
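
To make the trade-off concrete, here is a small sketch of the silent truncation being discussed. The `names` and `scores` lists are made up for illustration; the only real API used is the builtin zip.

```python
# Hypothetical data: one score is missing, so the inputs differ in length.
names = ["ann", "bob", "cy"]
scores = [90, 85]

# Plain zip stops at the shortest input and raises nothing,
# so the entry for "cy" is dropped without any warning.
paired = dict(zip(names, scores))
print(paired)  # {'ann': 90, 'bob': 85}
```

Whether dropping "cy" is a bug or the intended behaviour depends on the caller, which is why the proposal makes the check optional.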

So bringing it back to zip... I don't think I ever denied that, in
principle at least, somebody might need to raise on mismatched lengths.
(If I did give that impression, I apologise.) I did say I never needed
it myself, and my own zip_strict function in my personal toolbox remains
unused after many years. But somebody needs it? Sure, I'll accept that.

But I question whether *enough* people need it *often enough* to make it
a builtin, or to put a flag on plain zip.

Well, let's add some data about people needing it.


Here are previous threads asking for it:


(In that one you yourself say "Indeed. The need is real, and the question has come up many times on Python-List as well.")



Here are similar requests for Rust:


(which mentions that Erlang's zip is strict)

Rolling your own on top of zip_longest is easy. It's half a dozen lines.
It could be a recipe in itertools, or a function.
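
For readers who have not seen it, a roll-your-own version along these lines might look like the sketch below. The name zip_strict comes from the thread, but the body is only one plausible way to write it (an assumption, not anyone's actual toolbox code), and the error message is invented for illustration.

```python
from itertools import zip_longest


def zip_strict(*iterables):
    """Like zip(), but raise ValueError if the inputs differ in length."""
    sentinel = object()  # unique marker that cannot appear in user data
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        # A sentinel in the tuple means at least one input ran out early.
        if any(item is sentinel for item in combo):
            raise ValueError("zip_strict() arguments have different lengths")
        yield combo
```

Because this is a generator, the error only surfaces once iteration reaches the mismatch, so the pairs before that point have already been produced.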

It has taken years for it to be added to more-itertools, suggesting that
the real world need for this is small.

"Not every two line function needs to be a builtin" -- this is six
lines, not two, which is in the proposal's favour, but the principle
still applies. Before this becomes a builtin, there are a number of
hurdles to pass:

- Is there a need for it? Granted.
- Is it complicated to get right? No.

I would say yes. Look at the SO question for example. The asker wrote a long, slow, complicated solution and had to ask if it was good enough. Martijn (who is a prolific answerer) gave two solutions. The top comment says that the second solution is very nice. Months later someone pointed out that the second solution is actually buggy, so it was edited out. The remaining solution still has an issue which is mentioned in a comment but is not addressed. So we know that many people (including me, btw) have copy-pasted this buggy code and it's now sitting in their codebases. Here are some examples from GitHub:

 
- Is performance critical enough that it has to be written in C?
  Probably not.

No, probably not, but I don't see why this is a hurdle. This can be implemented in any way by different implementations of Python, but for CPython, I don't see how else this would play out. Performance isn't really the reason this should be in the language.
 
- Is there agreement on the functionality? Somewhat.
- Could that need be met by your own personal toolbox?
- or a recipe in itertools?
- or by a third-party library?
- or a function in itertools?

We've heard from people who say that they would like a strict version
of zip which raises on unequal inputs. How many of them like this enough
to add a six line function to their code?

I think a major factor here is laziness. I'm pretty sure that sometimes I've wanted this kind of strict check, just for better peace of mind, but the thought of one of the solutions above feels like too much effort. I don't want to add a third party dependency just for this. I don't want to read someone else's solution (e.g. on SO) which doesn't have tests and try to evaluate if it's correct. I certainly don't want to reimplement it myself. I brush it off thinking "it'll probably be fine", which is bad behaviour.

The problem is that no one really *needs* this check. You *can* do without it. The same doesn't apply well to other functions in itertools or more-itertools. If you need the functionality of itertools.permutations(), you can't just dismiss that problem.

But sometimes you may seriously regret not having checked the lengths. That's usually the point when someone (e.g. Ram) comes to python-ideas, or more-itertools, or Stack Overflow, wishing they had been more disciplined in the past.

Adding a function to itertools will mostly solve that problem, but not entirely. Adding `strict=True` is so easy that people will be encouraged to use it often and keep their code safe. That to me is the biggest argument for this feature and for this specific API.
 
> The problem is not that they have to look there, it's that they have to
> *think to look there*. itertools might not occur to them. They might not
> even know it exists.

Yes? Is it our responsibility to put everything in builtins because
people might not think to look in math, or functools, or os, or sys?

Putting math.sin or whatever in builtins makes builtins bigger. Adding a flag to zip does not.

I think I've missed what harm you think it will do to add a flag to zip. Can you point me to your objection?
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/C5E6GVAMKWTMYKBDBDQ4D6UEPUGVSANQ/