On Mon, May 4, 2020 at 2:33 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, May 02, 2020 at 07:43:44PM +0200, Alex Hall wrote:
> On Sat, May 2, 2020 at 6:09 PM Steven D'Aprano <steve@pearwood.info> wrote:
>
> > On Sat, May 02, 2020 at 04:58:43PM +0200, Alex Hall wrote:
> >
> > > I didn't think carefully about this implementation and thought that there
> > > was only a performance cost in the error case. That's obviously not true -
> > > there's an `if` statement executed in Python for every item in every
> > > iterable.
> >
> > Sorry, this does not demonstrate that the performance cost is
> > significant.
> >
> > This adds one "if" per loop, terminating on (one more than) the shortest
> > input. So O(N) on the length of the input. That's usually considered
> > reasonable, provided the per item cost is low.
> >
> > The test in the "if" is technically O(N) on the number of input
> > iterators, but since that's usually two, and rarely more than a handful,
> > it's close enough to a fixed cost.
> >
> > On my old and slow PC `sentinel in combo` is quite fast:
> >
>
> `sentinel in combo` is problematic if some values have overridden `__eq__`.
> I referred to this problem in a previous email to you, saying that people
> had copied this buggy implementation from SO and that it still hadn't been
> fixed after being pointed out. The fact that you missed this helps to prove
> my point. Getting this right is hard.

I didn't miss it, I ignored it as YAGNI.

Seriously, if some object defines a weird `__eq__` then half the
standard library, including builtins, stops working "correctly". See for
example the behaviour of float NANs in lists.
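
To illustrate the NaN behaviour mentioned here, a small sketch using only builtins (the variable names are just for illustration):

```python
nan = float("nan")

# NaN compares unequal to everything, including itself.
self_equal = nan == nan

# Containment checks identity before equality, so the very same object is found...
same_object_found = nan in [nan]

# ...but a different NaN object is not, because == returns False.
other_object_found = float("nan") in [nan]

print(self_equal, same_object_found, other_object_found)  # False True False
```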

My care factor for this is negligible, until such time that it is proven
to be an issue for real objects in real code. Until then, YAGNI.

Here is an example:

```
import numpy as np

from itertools import zip_longest


def zip_equal(*iterables):
    sentinel = object()
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        if sentinel in combo:
            raise ValueError('Iterables have different lengths')
        yield combo


arr = np.arange(8).reshape((2, 2, 2))
print(arr)
print(list(zip(*arr)))
print(list(zip_equal(*arr)))
```

The output:

```
[[[0 1]
  [2 3]]

 [[4 5]
  [6 7]]]
[(array([0, 1]), array([4, 5])), (array([2, 3]), array([6, 7]))]
Traceback (most recent call last):
  File "/home/alex/.config/JetBrains/PyCharm2020.1/scratches/scratch_666.py", line 15, in <module>
    print(list(zip_equal(*arr)))
  File "/home/alex/.config/JetBrains/PyCharm2020.1/scratches/scratch_666.py", line 8, in zip_equal
    if sentinel in combo:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
```

I know for a fact that this would confuse people badly, because I've seen multiple people who know what this error message generally means misidentify where exactly it was coming from in a similar case: https://stackoverflow.com/questions/60780328/python-valueerror-the-truth-value-of-an-array-with-more-than-one-element-is-amb/60780361
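
A minimal sketch of one way to avoid the problem, assuming it is acceptable to compare against the sentinel by identity rather than with `in`. The name `zip_equal_identity` and the `Ambiguous` class are hypothetical. `Ambiguous` is only a stand-in for an array-like value and raises directly from `__eq__`, whereas a real NumPy array raises when the comparison result is converted to a bool:

```python
from itertools import zip_longest


class Ambiguous:
    """Hypothetical stand-in for a value whose == cannot be used in a boolean test."""

    def __eq__(self, other):
        raise ValueError("ambiguous truth value")


def zip_equal_identity(*iterables):
    sentinel = object()
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        # Compare by identity only, so no element's __eq__ is ever called.
        if any(item is sentinel for item in combo):
            raise ValueError('Iterables have different lengths')
        yield combo


pairs = list(zip_equal_identity([Ambiguous(), Ambiguous()], [1, 2]))
print(len(pairs))  # 2
```

The trade-off is a generator expression evaluated in Python for every tuple, which is likely slower than `sentinel in combo`, so this does nothing to settle the performance question discussed earlier in the thread.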