
On 05/14/2020 11:13 AM, Brandt Bucher wrote:
I claimed to have found "dozens of other call sites in Python's standard library and tooling where it would be appropriate to enable this new feature". You asked for references, and I provided two dozen cases of zipping what must be equal length iterables.
I said they were "appropriate", not "needed" or even "recommended". These are call sites where unequal-length iterables, if encountered, would be an error that I would hope wouldn't pass silently.
Very good point.
Besides, I don't think it's beyond the realm of imagination for a future refactoring of several of the "Mismatch cannot happen." cases to introduce a bug of this kind.
Which seems beside the point. As you say, if the lengths are mismatched then a bug has appeared, and if the check is nearly free there's no reason not to do it.
Ethan Furman wrote:
Did you vet them ...
Of course. I spent hours vetting them, to the point of researching the GNU tar extended sparse header and Apple property list formats (and trying to figure out what the hell was happening in `os._fwalk`) just to make sure my understanding was correct.
Glad I'm not the only one that didn't immediately get that os._fwalk code.
Ethan Furman wrote:
Not the call itself, but the running of zip.
I wouldn't call my implementation "clever", but it differs from both of those options. We only need to check if we're strict when an error occurs in one of our iterators, which is a situation the C code for `zip` already needs to explicitly handle with a branch. So this condition is only hit on the "last" `__next__` call, not on every single iteration.
Ah, so this is why the strict version may consume an extra element -- it has to check if any remaining iterators have elements, while the non-strict version can just quit as soon as any of the iterators are exhausted.

    >>> one = iter([1, 2])
    >>> six = iter([6, 7, 8])
    >>> zip(one, six)
    (stuff)
    >>> next(six)
    8

vs

    >>> zip_strict(one, six)
    (stuff)
    >>> next(six)
    (crickets)
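
To make the extra-element behavior concrete, here is a rough pure-Python sketch. It is not the C implementation being discussed: `zip_strict` and `leftover_after` are hypothetical names, and the error messages are made up for illustration. The only idea borrowed from the description above is that the length check runs in the exhaustion branch, so it is reached once, on the last step, rather than on every iteration.

```python
def zip_strict(*iterables):
    """Hypothetical pure-Python stand-in for a strict zip (illustration only)."""
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return
    while True:
        items = []
        for index, iterator in enumerate(iterators):
            try:
                items.append(next(iterator))
            except StopIteration:
                # This branch is only reached on the final step.
                if index > 0:
                    # An earlier iterator already yielded a value this round.
                    raise ValueError(f"argument {index + 1} is too short")
                # The first iterator is done, so every later one must be done
                # too. Probing them is what consumes the extra element.
                for position, later in enumerate(iterators[1:], start=2):
                    try:
                        next(later)
                    except StopIteration:
                        continue
                    raise ValueError(f"argument {position} is too long")
                return
        yield tuple(items)


def leftover_after(zipper):
    """Zip the two iterators from the example and report what is left in `six`."""
    one = iter([1, 2])
    six = iter([6, 7, 8])
    pairs = []
    try:
        for pair in zipper(one, six):
            pairs.append(pair)
    except ValueError:
        pass  # the strict variant complains about the mismatch
    return pairs, next(six, None)


print(leftover_after(zip))         # ([(1, 6), (2, 7)], 8)
print(leftover_after(zip_strict))  # ([(1, 6), (2, 7)], None)
```

With the builtin, `six` still holds 8 afterwards because `one` runs out first and `six` is never asked again. The strict sketch has to ask `six` once more in order to notice the mismatch, so the 8 is gone by the time the error is raised.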
I went ahead and ran some rough PGO/LTO benchmarks...
Can you do those with _pydecimal? If performance were an issue anywhere I would expect to see it with number crunching.

---

Paul Moore and Chris Angelico have made good arguments in favor of an itertools addition which haven't been answered yet.

Regardless, I think you've made the point that /a/ solution is very desirable. So the real debate is whether it should be a flag, a mode, or a separate function.

I am still -1 on the flag.

--
~Ethan~