On Apr 26, 2020, at 14:36, Daniel Moisset <dfmoisset@gmail.com> wrote:
This idea is something I could have used many times. I agree with many people here that the
strict=True API is at least "unusual" in Python. I was thinking of two different API approaches that could be used for this and that I think no one has mentioned:
- we could add a callable filler_factory keyword argument to zip_longest. That would allow passing a function that raises an exception if I want "strict" behaviour, and also has some other uses (for example, if I want to use [] as a filler value, but not the *same* empty list for all fillers)
This could be useful, and doesn’t seem too bad.
I still think an itertools.zip_equal would be more discoverable and more easily understandable than something like itertools.zip_longest(filler_factory=lambda: throw(ValueError)), especially since you have to write that thrower function yourself. But if there really are other common uses like zip_longest(filler_factory=list), that might make up for it.
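To make the comparison concrete, here is a rough sketch of how the factory idea could be emulated today on top of the existing zip_longest. The names zip_longest_factory and throw_unequal are made up for illustration, and filler_factory is the keyword from the proposal, not a real zip_longest parameter:

```python
from itertools import zip_longest

# Private sentinel marking positions where an input has run out.
_MISSING = object()


def zip_longest_factory(*iterables, filler_factory):
    """Hypothetical helper: like zip_longest, but calls filler_factory()
    once for every missing value instead of reusing one fillvalue."""
    for row in zip_longest(*iterables, fillvalue=_MISSING):
        yield tuple(
            filler_factory() if item is _MISSING else item
            for item in row
        )


def throw_unequal():
    """The 'thrower' you would have to write yourself for strict behaviour."""
    raise ValueError("iterables have different lengths")


# Each gap gets its own fresh list, not one shared object.
rows = list(zip_longest_factory([1, 2, 3], [], filler_factory=list))
```

Passing filler_factory=throw_unequal gives the strict behaviour, but only as a side effect of the first attempt to fill a gap, which is part of why I find it less readable than a dedicated zip_equal.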
- we could add methods to the zip() type that provide different behaviours. That way you could use zip(seq1, seq2).shortest(), zip(seq1, seq2).equal(), zip(seq1, seq2).longer(filler="foo"); zip(...).shortest() would be equivalent to zip(...). Other names might work better with this API, I can think of zip(...).drop_tails(), zip(...).consume_all() and zip(...).fill(). This also allows adding other possible behaviours (I wouldn't say it's common, but at least once I've wanted to zip lists of different length, but get shorter tuples on the tails instead of fillers).
This second one is a cool idea—but your argument for it seems to be an argument against it.
If we stick with separate functions in itertools, and then we add a new one for your zip_skip (or whatever you’d call it) in 3.10, the backport is trivial. Either more-itertools adds zip_skip, or someone writes an itertools310 library with the new functions in 3.10, and then people just do this:
    try:
        from itertools import zip_skip
    except ImportError:
        from more_itertools import zip_skip
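And the function being backported would itself be small. A minimal sketch of the "shorter tuples on the tails" behaviour, assuming that is what zip_skip would mean (the name is a placeholder from this thread, not something itertools or more-itertools is known to provide):

```python
from itertools import zip_longest

# Private sentinel so that None (or any other value) can appear in the inputs.
_MISSING = object()


def zip_skip(*iterables):
    """Hypothetical: zip to the longest input, dropping exhausted inputs
    from each tuple instead of padding with a filler."""
    for row in zip_longest(*iterables, fillvalue=_MISSING):
        yield tuple(item for item in row if item is not _MISSING)
```

With this sketch, zip_skip([1, 2, 3], "ab") yields (1, 'a'), (2, 'b') and then the one-element tuple (3,).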
But if we add methods on zip objects, and then we add a new skip() method in 3.10, how does the backport work? It can’t monkeypatch the zip type (unless we both make the type public and specifically design it to be monkeypatchable, which C builtins usually aren’t). So more-itertools or zip310 or whatever has to provide a full implementation of the zip type, with all of its methods, and probably twice (in Python for other implementations plus a C accelerator for CPython). Sure, maybe it could delegate to a real zip object for the methods that are already there, but that’s still not trivial (and adds a performance cost).
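The monkeypatching problem is easy to check. A quick sketch, where skip is a stand-in for whatever the hypothetical new method would be; as far as I know CPython rejects the assignment with a TypeError, though the exact message varies between versions:

```python
def skip(self):
    """Stand-in for a hypothetical zip.skip() method added in a later release."""
    return self


try:
    # Attempt to attach the method to the builtin zip type.
    zip.skip = skip
except TypeError as exc:
    # Builtin types implemented in C do not accept new attributes.
    patch_error = exc
else:
    patch_error = None
```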
Also, what exactly do these methods return? Do they set some flag and return self? If so, that goes against the usual Python rule that mutator methods return None rather than self. Plus, it opens the question of what zip(xs, ys).equal().shortest() should do. I think you’d want that to be an AttributeError, but the only sensible way to get that is if equal() actually returns a new object of a new zip_equal type rather than self. So, that solves both problems, but it means you have to implement four different builtin types. (Also, while the C implementation of those types, and constructing them from the zip type’s methods, seems trivial, I think the pure Python version would have to be pretty clunky.)
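Here is roughly what I mean by the separate-types design, as a pure Python sketch. All the class names are invented for illustration, and it cuts corners: Zip here is a re-iterable wrapper rather than a one-shot iterator like the real zip, which sidesteps the question of what equal() should do after iteration has already started.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking an exhausted input


class _ZipMode:
    """Shared iterator plumbing; deliberately has no mode-selection methods."""

    def __init__(self, rows):
        self._rows = rows

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._rows)


class ZipShortest(_ZipMode):
    pass


class ZipLongest(_ZipMode):
    pass


class ZipEqual(_ZipMode):
    def __next__(self):
        row = next(self._rows)
        # A sentinel in the row means some input ended before the others.
        if any(item is _MISSING for item in row):
            raise ValueError("iterables have different lengths")
        return row


class Zip:
    """Hypothetical zip replacement whose methods return new iterator types."""

    def __init__(self, *iterables):
        self._iterables = iterables

    def __iter__(self):
        return self.shortest()

    def shortest(self):
        return ZipShortest(zip(*self._iterables))

    def equal(self):
        return ZipEqual(zip_longest(*self._iterables, fillvalue=_MISSING))

    def longer(self, filler=None):
        return ZipLongest(zip_longest(*self._iterables, fillvalue=filler))
```

Because ZipEqual has no shortest() method, Zip(xs, ys).equal().shortest() fails with an AttributeError, which is the behaviour I argued for above. Even this cut-down version needs four classes to get there.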