On Sat, May 2, 2020 at 2:58 PM Alex Hall <alex.mojaki@gmail.com> wrote:
On Sat, May 2, 2020 at 1:19 PM Steven D'Aprano <steve@pearwood.info> wrote:
Rolling your own on top of zip_longest is easy. It's half a dozen lines.
It could be a recipe in itertools, or a function.

It has taken years for it to be added to more-itertools, suggesting that
the real world need for this is small.

"Not every two line function needs to be a builtin" -- this is six
lines, not two, which is in the proposal's favour, but the principle
still applies. Before this becomes a builtin, there are a number of
hurdles to pass:

- Is there a need for it? Granted.
- Is it complicated to get right? No.
- Is performance critical enough that it has to be written in C?
  Probably not.

No, probably not.

I take it back: performance is a problem worth considering. Here is the more-itertools implementation:

https://github.com/more-itertools/more-itertools/blob/master/more_itertools/more.py#L1420

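```
from itertools import zip_longest

# Minimal stand-ins so the snippet runs on its own; more-itertools
# defines its own versions of both names elsewhere in more.py.
_marker = object()


class UnequalIterablesError(ValueError):
    pass


def zip_equal(*iterables):
    """``zip`` the input *iterables* together, but throw an
    ``UnequalIterablesError`` if any of the *iterables* terminate before
    the others.
    """
    for combo in zip_longest(*iterables, fillvalue=_marker):
        # This inner loop runs in Python for every value of every tuple.
        for val in combo:
            if val is _marker:
                raise UnequalIterablesError(
                    "Iterables have different lengths."
                )
        yield combo
```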

I hadn't looked carefully at this implementation and assumed that the only performance cost was in the error case. That's obviously not true: there's an `if` statement executed in Python for every item in every iterable, on top of the work zip_longest already does. The overhead is O(len(iterables) * len(iterables[0])). Given that zip is used a lot and most uses of zip should probably be strict, this is a significant problem. Therefore:

- Rolling your own on top of zip_longest in six lines is not a solution.
- Using more-itertools is not a solution.
- It's complicated to get right.
- Performance is critical enough to do it in C.
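
To put a rough number on the per-item overhead, here is a minimal timing sketch. It is an assumption-laden illustration, not a benchmark: `consume` is a hypothetical helper, `_marker` and `UnequalIterablesError` are stand-ins for the more-itertools names, and the input sizes are arbitrary. The absolute times will vary by machine and Python version; only the ratio between the two lines of output is of interest.

```python
import timeit
from itertools import zip_longest

# Stand-ins for the names used by the more-itertools implementation.
_marker = object()


class UnequalIterablesError(ValueError):
    """Stand-in for the more-itertools exception of the same name."""


def zip_equal(*iterables):
    """The zip_longest-based implementation quoted above."""
    for combo in zip_longest(*iterables, fillvalue=_marker):
        for val in combo:  # one Python-level check per value
            if val is _marker:
                raise UnequalIterablesError("Iterables have different lengths.")
        yield combo


def consume(zipper, columns):
    """Hypothetical helper: exhaust zipper(*columns), return the tuple count."""
    return sum(1 for _ in zipper(*columns))


if __name__ == "__main__":
    # Three equal-length inputs, so no error is raised and only the
    # success-path overhead is measured.
    columns = [range(100_000)] * 3
    for zipper in (zip, zip_equal):
        seconds = timeit.timeit(lambda: consume(zipper, columns), number=5)
        print(f"{zipper.__name__:>10}: {seconds:.3f}s")
```

Both calls produce the same tuples for equal-length inputs, so any difference in the timings comes from the extra generator frame and the per-value `is` check.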