On Sat, May 02, 2020 at 04:58:43PM +0200, Alex Hall wrote:
> I didn't think carefully about this implementation and thought that
> there was only a performance cost in the error case. That's obviously
> not true - there's an `if` statement executed in Python for every item
> in every iterable.
Sorry, this does not demonstrate that the performance cost is significant. This adds one "if" per loop, terminating on (one more than) the shortest input. So it is O(N) in the length of the input, which is usually considered reasonable provided the per-item cost is low. The test in the "if" is technically linear in the number of input iterators, but since that's usually two, and rarely more than a handful, it's close enough to a fixed cost.

On my old and slow PC, `sentinel in combo` is quite fast:

py> from timeit import Timer
py> t = Timer('sentinel in combo', setup='sentinel=object(); combo=tuple(range(10))')
py> t.repeat()  # default is 1000000 loops
[1.6585235428065062, 1.6372932828962803, 1.6347543047741055,
 1.6457603527233005, 1.6405461430549622]

So that's about 1.6 microseconds extra per loop on my PC. (For the sake of comparison, unpacking the tuple into separate variables costs about 0.6µs on my machine; so does calling len().) I would expect most people running this on a newer PC to get one tenth of that, or even 1/100, but let's assume a machine even slower and older than mine, and call it 3µs to be safe.

What are you doing inside the loop with the zipped-up items that 3µs is a serious performance bottleneck for your application?
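(For concreteness, a sentinel-based pure-Python wrapper along the lines being discussed might look like the sketch below; the name `zip_strict` and the exact error message are illustrative assumptions, not the actual implementation. The `_SENTINEL in combo` membership test is the same expression timed above.)

    import itertools

    _SENTINEL = object()  # private, so no caller's data can collide with it

    def zip_strict(*iterables):
        # zip_longest pads exhausted iterators with the sentinel, so a
        # sentinel can only show up in a tuple if the inputs have
        # unequal lengths.
        for combo in itertools.zip_longest(*iterables, fillvalue=_SENTINEL):
            if _SENTINEL in combo:  # the per-item test discussed above
                raise ValueError("zip() arguments have unequal lengths")
            yield combo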
> The overhead is O(len(iterables) * len(iterables[0])). Given that zip
> is used a lot and most uses of zip should probably be strict,
That's not a given. I would say that most uses of zip should not be strict.
> this is a significant problem.
Without actual measurements, this is a classic example of premature micro-optimization. Let's see some real benchmarks proving that a Python version is too slow in real-life code first.

-- 
Steven
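P.S. For anyone who wants to measure this in context rather than per item, a benchmark along these lines would be a starting point (`zip_strict` here means the hypothetical sentinel-based wrapper sketched above, defined in the same script):

    from timeit import Timer

    setup = "from __main__ import zip_strict; data = list(range(1000))"
    builtin = Timer("for pair in zip(data, data): pass", setup=setup)
    wrapped = Timer("for pair in zip_strict(data, data): pass", setup=setup)
    # best of five runs of 1000 iterations each, i.e. a million pairs per run
    print(min(builtin.repeat(repeat=5, number=1000)))
    print(min(wrapped.repeat(repeat=5, number=1000)))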