On Sat, May 02, 2020 at 02:58:57PM +0200, Alex Hall wrote:
> Adding a function to itertools will mostly solve that problem, but not entirely. Adding `strict=True` is so easy that people will be encouraged to use it often and keep their code safe. That to me is the biggest argument for this feature and for this specific API.

The last thing I want is to encourage people to unnecessarily enforce a rule "data streams must be equal just to be safe" when they don't actually need to be equal. What you are calling the biggest argument for this feature is, for me, a strong argument against it.

If I know that consumers of my data truncate on the shortest input, then as the producer of data I don't have to care about making them equal. I can say:

    process_user_ids(usernames, generate_user_ids())

and pass an infinite stream of user IDs and know that the function will just truncate on the shortest stream. Yay! Life is good.

But now this zip flag comes along, and the author of process_user_ids decides to protect me from myself and "fail loudly", and I will curse them onto the hundredth generation for making my life more difficult.

If I'm the producer and consumer of the data, then I can pick and choose between versions, and that's all well and good.

If I'm the producer of the data, and I want it to be equal in length, then I control the data and can make it equal. I don't need the consumer's help, and I don't need zip to have a flag.

But if I'm only the consumer of the data, I have no business failing "just to be safe". That's an anti-feature that makes life more difficult, not less, for the producer of the data, akin to excessive runtime type checking (I'm sometimes guilty of that myself) or in other languages flagging every class in sight as final "just to be safe".

It is possible to be *too* defensive, and if making the strict version of zip a builtin encourages consumers of the data to "be safe", then that is a mark against it in my strong opinion.

-- 
Steven
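
The truncation scenario in the message above can be sketched as follows. This is only an illustration: `process_user_ids` and `generate_user_ids` are named in the message but never defined there, so the function bodies, the starting ID of 1000, and the sample usernames are all assumptions.

```python
from itertools import count


def generate_user_ids(start=1000):
    """Hypothetical producer: an endless stream of user IDs."""
    return count(start)


def process_user_ids(usernames, user_ids):
    """Hypothetical consumer: pair each name with an ID.

    Plain zip() stops as soon as the shortest input is exhausted,
    so an infinite ID stream is safe to pass in.
    """
    return list(zip(usernames, user_ids))


usernames = ["alice", "bob", "carol"]
pairs = process_user_ids(usernames, generate_user_ids())
print(pairs)  # [('alice', 1000), ('bob', 1001), ('carol', 1002)]
```

If the consumer instead used a length-checking zip (as `zip(..., strict=True)` does in Python 3.10 and later), the same call would raise `ValueError`, because the ID stream never runs out. That is the producer-side cost the message is objecting to.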