
On Thu, Apr 23, 2020 at 09:10:16PM -0400, Nathan Schneider wrote:
> How, for example, to collate lines from 3 potentially large files while
> ensuring they match in length (without an external dependency)? The best
> I can think of is rather ugly:
>
>     with open('a.txt') as a, open('b.txt') as b, open('c.txt') as c:
>         for lineA, lineB, lineC in zip(a, b, c):
>             do_something_with(lineA, lineB, lineC)
>         assert next(a, None) is None
>         assert next(b, None) is None
>         assert next(c, None) is None
>
> Changing the zip() call to zip(a, b, c, strict=True) would remove the
> necessity of the asserts.

I think that the "correct" (simplest, easiest, most obvious, most flexible) way is:

    from itertools import zip_longest

    with open('a.txt') as a, open('b.txt') as b, open('c.txt') as c:
        for lineA, lineB, lineC in zip_longest(a, b, c, fillvalue=''):
            do_something_with(lineA, lineB, lineC)

and have `do_something_with` handle the empty string case, either by raising, or more likely, doing something sensible like treating it as a blank line rather than dying with an exception.

This matters especially if the files differ in how many newlines they end with. E.g. files a.txt and c.txt end with a newline, but b.txt ends without one, or ends with an extra blank line at the end.

File handling code ought to be resilient in the face of such meaningless differences, but zip_strict encourages us to be the opposite of resilient: an extra newline at the end of the file will kill the application with an unnecessary exception.

An alternate way to handle it:

    for t in zip_longest(a, b, c, fillvalue=''):
        if '' in t:
            process()  # raise if you insist
        else:
            do_something_with(*t)

So my argument is that anything you want zip_strict for is better handled with zip_longest -- including the case of just raising.

--
Steven
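
For the "just raising" case, a minimal sketch of how a strict zip might be built on top of zip_longest. The names `zip_equal` and `_MISSING` are hypothetical, chosen for illustration only. A private sentinel object is used instead of '' so that a genuinely empty item is not mistaken for a missing one.

```python
from itertools import zip_longest

# Hypothetical private sentinel; no real item can be identical to it.
_MISSING = object()


def zip_equal(*iterables):
    """Yield tuples like zip(), but raise ValueError if lengths differ."""
    for t in zip_longest(*iterables, fillvalue=_MISSING):
        # Identity check, so items with unusual __eq__ cannot confuse it.
        if any(item is _MISSING for item in t):
            raise ValueError("iterables have different lengths")
        yield t
```

In the file example this would read `for lineA, lineB, lineC in zip_equal(a, b, c):`. The error is raised only when the shortest input runs out, so every tuple before that point has already been processed.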