Using enumerate to get line-numbers with itertools grouper?
Victor Hooi
victorhooi at gmail.com
Wed Sep 2 08:03:17 EDT 2015
Hi Peter,
Hmm, are you sure that will work?
The indexes returned by enumerate will start from zero.
Also, I've realised line_number is a bit of a misnomer here - it's actually the index for the chunks that grouper() is returning.
So say I had a 10-line textfile, and I was using a _BATCH_SIZE of 50.
If I do:
    print(line_number * _BATCH_SIZE)
I'd just get (0 * 50) = 0 printed out 10 times.
Even if I add one:
    print((line_number + 1) * _BATCH_SIZE)
I will just get 50 printed out 10 times.
My understanding is that the file handle f is passed to grouper, which in turn passes another iterable to enumerate. I'm just not sure of the best way to get the line numbers from the original iterable f and pass them through the chain?
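One option that might work is to number the lines before they are grouped, i.e. wrap f in enumerate first and hand that to grouper, so each chunk carries (line_number, line) pairs. A rough sketch, where numbered_chunks is a hypothetical helper name and the list of lines again stands in for the file:

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def numbered_chunks(lines, n):
    """Yield lists of (line_number, line) pairs, at most n per chunk.

    Hypothetical helper: numbering happens before grouping, so the
    numbers refer to the original lines rather than to the chunks.
    """
    for chunk in grouper(enumerate(lines, start=1), n):
        # Drop the None fill values that pad the final chunk.
        yield [pair for pair in chunk if pair is not None]


# Hypothetical stand-in for a 10-line text file.
lines = ["line %d\n" % i for i in range(1, 11)]

chunks = list(numbered_chunks(lines, 4))
for chunk in chunks:
    first_number = chunk[0][0]
    last_number = chunk[-1][0]
    print("processing lines %d-%d" % (first_number, last_number))
```

With this arrangement the batch size only controls how the lines are grouped, and the numbering starts from 1 because of start=1.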
On Wednesday, 2 September 2015 20:37:01 UTC+10, Peter Otten wrote:
> Victor Hooi wrote:
>
> > I'm using grouper() to iterate over a textfile in groups of lines:
> >
> > from itertools import zip_longest
> >
> > def grouper(iterable, n, fillvalue=None):
> >     "Collect data into fixed-length chunks or blocks"
> >     # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
> >     args = [iter(iterable)] * n
> >     return zip_longest(fillvalue=fillvalue, *args)
> >
> > However, I'd also like to know the line-number that I'm up to, for
> > printing out in informational or error messages.
> >
> > Is there a way to use enumerate with grouper to achieve this?
> >
> > The below won't work, as enumerate will give me the index of the group,
> > rather than of the lines themselves:
> >
> > _BATCH_SIZE = 50
> >
> > with open(args.input_file, 'r') as f:
> >     for line_number, chunk in enumerate(grouper(f, _BATCH_SIZE)):
> >         print(line_number)
> >
> > I'm thinking I could do something to modify grouper, maybe, but I'm sure
> > there's an easier way?
>
> print(line_number * _BATCH_SIZE)
>
> Eureka ;)