Using enumerate to get line-numbers with itertools grouper?
victorhooi at gmail.com
Wed Sep 2 14:03:17 CEST 2015
Hmm, are you sure that will work?
The indexes returned by enumerate will start from zero.
Also, I've realised line_number is a bit of a misnomer here - it's actually the index for the chunks that grouper() is returning.
So say I had a 10-line textfile, and I was using a _BATCH_SIZE of 50.
If I do:
    print(line_number * _BATCH_SIZE)
I'd just get (0 * 50) = 0 printed out 10 times.
Even if I add one:
    print((line_number + 1) * _BATCH_SIZE)
I will just get 50 printed out 10 times.
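To make the arithmetic concrete, here is a minimal sketch of how a per-line number could be recovered from the chunk index plus the position inside the chunk. It assumes an io.StringIO object as a stand-in for the 10-line file, and the names chunk_index, offset and numbered are hypothetical, introduced only for this illustration.

```python
import io
from itertools import zip_longest

_BATCH_SIZE = 50


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# Hypothetical stand-in for a 10-line text file.
f = io.StringIO("".join(f"line {i}\n" for i in range(1, 11)))

numbered = []
for chunk_index, chunk in enumerate(grouper(f, _BATCH_SIZE)):
    for offset, line in enumerate(chunk):
        if line is None:
            # zip_longest pads the short final chunk with the fillvalue,
            # so the first None marks the end of the real lines.
            break
        # Lines before this chunk, plus position in the chunk, 1-based.
        line_number = chunk_index * _BATCH_SIZE + offset + 1
        numbered.append((line_number, line.rstrip("\n")))
```

With a 10-line input and a batch size of 50 there is only one chunk (index 0), so the chunk index alone cannot tell the lines apart; the offset inside the chunk supplies the rest.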
My understanding is that the file handle f is passed to grouper, which in turn passes another iterable to enumerate. I'm just not sure of the best way to get the line numbers from the original iterable f and pass them through the chain.
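One way to carry the numbers through the chain might be to number the lines before grouping them, so that each chunk holds (line_number, line) pairs. This is only a sketch under assumptions: numbered_chunks is a hypothetical helper, and io.StringIO stands in for the real file.

```python
import io
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def numbered_chunks(lines, batch_size):
    """Yield lists of (line_number, line) pairs, at most batch_size per list.

    Hypothetical helper: enumerate() runs on the original iterable, so the
    numbers refer to lines, not to chunks.
    """
    for chunk in grouper(enumerate(lines, start=1), batch_size):
        # Drop the None padding that zip_longest adds to the last chunk.
        yield [pair for pair in chunk if pair is not None]


# Hypothetical stand-in for a 7-line text file, grouped in batches of 3.
f = io.StringIO("".join(f"line {i}\n" for i in range(1, 8)))
chunks = list(numbered_chunks(f, 3))
```

Each chunk then carries its own line numbers, so an error message can report chunk[0][0] as the first line of the batch, or the number attached to the specific line that failed.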
On Wednesday, 2 September 2015 20:37:01 UTC+10, Peter Otten wrote:
> Victor Hooi wrote:
> > I'm using grouper() to iterate over a textfile in groups of lines:
> > from itertools import zip_longest
> >
> > def grouper(iterable, n, fillvalue=None):
> >     "Collect data into fixed-length chunks or blocks"
> >     # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
> >     args = [iter(iterable)] * n
> >     return zip_longest(fillvalue=fillvalue, *args)
> > However, I'd also like to know the line-number that I'm up to, for
> > printing out in informational or error messages.
> > Is there a way to use enumerate with grouper to achieve this?
> > The below won't work, as enumerate will give me the index of the group,
> > rather than of the lines themselves:
> > _BATCH_SIZE = 50
> > with open(args.input_file, 'r') as f:
> >     for line_number, chunk in enumerate(grouper(f, _BATCH_SIZE)):
> >         print(line_number)
> > I'm thinking I could do something to modify grouper, maybe, but I'm sure
> > there's an easier way?
> print(line_number * _BATCH_SIZE)
> Eureka ;)