[Python-ideas] Batching/grouping function for itertools

Serhiy Storchaka storchaka at gmail.com
Sun Dec 8 18:32:40 CET 2013


08.12.13 14:16, Steven D'Aprano wrote:
> On Sun, Dec 08, 2013 at 01:30:56PM +0200, Serhiy Storchaka wrote:
>> 08.12.13 11:25, Steven D'Aprano wrote:
>>> In the second case, there is a question about what to do with sequences
>>> that are not a multiple of the window size. Similar to zip(), there are
>>> two things one might do:
>>>
>>> - pad with some given object;
>>> - raise an exception
>>
>> 3) emit the last chunk incomplete;
>
> Given a window size of two, and input data [a, b, c], are you suggesting
> a variety that returns this?
>
> (a,b), (c,)
>
> There is no need for a separate function for that. Given a version that
> takes a pad object, if the pad argument is not given, return a partial
> chunk at the end.

In your previous message you proposed raising an exception when the pad
argument is not given. In any case, these are three different
behaviours, and the "absent argument" trick lets you combine only two of
them in one function.
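
A minimal sketch of what I mean, assuming a hypothetical function named
group and a private sentinel _MISSING (neither exists in itertools). The
single optional pad argument can select between padding and one other
behaviour, here the partial chunk; raising an exception would need a
separate function or an extra parameter:

```python
from itertools import islice

_MISSING = object()  # hypothetical sentinel: "pad was not passed at all"


def group(iterable, n, pad=_MISSING):
    """Yield tuples of n items (illustrative name, not a stdlib function)."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, n))
        if not chunk:
            return
        if len(chunk) < n:
            if pad is _MISSING:
                # The one behaviour the absent argument can stand for:
                # emit the short final chunk as it is. Raising ValueError
                # here would be the alternative, but not both at once.
                yield chunk
            else:
                # pad was given (even if it is None): fill the last chunk.
                yield chunk + (pad,) * (n - len(chunk))
            return
        yield chunk
```

With this sketch, list(group("abc", 2)) gives [('a', 'b'), ('c',)] and
list(group("abc", 2, pad=None)) gives [('a', 'b'), ('c', None)].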

>> 4) skip incomplete chunk.
>
> The very next sentence in my post references that:
>
> "If you want to just ignore extra items, just catch the exception and
> continue."
>
> There is no need for an extra function covering that case.

You can't use such a generator directly in an expression (e.g. as an
argument to list). You need a special wrapper that catches the
exception. That makes it a fourth variant. And if this is the variant
you need, why is it not in the stdlib?
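
To illustrate, here is a rough sketch with two hypothetical helpers:
group_strict, which raises ValueError on a short final chunk, and
skip_incomplete, the wrapper one would have to write to drop that chunk.
Both names and the choice of ValueError are my assumptions:

```python
from itertools import islice


def group_strict(iterable, n):
    """Hypothetical strict variant: raise if the last chunk is short."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, n))
        if not chunk:
            return
        if len(chunk) < n:
            raise ValueError("incomplete final chunk")
        yield chunk


def skip_incomplete(chunks):
    """Hypothetical wrapper: stop quietly when the strict variant raises."""
    try:
        for chunk in chunks:
            yield chunk
    except ValueError:
        # Note: this also hides a ValueError raised by the source iterable.
        return
```

list(group_strict("abc", 2)) raises, so the complete chunks are lost to
the caller as well, while list(skip_incomplete(group_strict("abc", 2)))
gives [('a', 'b')].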

>> There is also a question about the result's type. Sometimes you need an
>> iterator of subsequences (e.g. to split a string into equal-sized
>> chunks), sometimes an iterator of iterators is enough.
>
> None of the other itertools functions treat strings specially. Why
> should this one?

Because I need this idiom relatively often and almost never need the
general function for iterators. I'm sure a function that splits
sequences is enough in at least 90% of the cases where you need a
grouping function.
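
The idiom I have in mind is roughly the following, written as a
hypothetical helper named chunks. It relies on slicing, so it works only
for sequences (str, list, tuple, bytes), and each chunk has the same type
as the input:

```python
def chunks(seq, n):
    """Split a sequence into slices of length n (hypothetical helper).

    The last slice is shorter when len(seq) is not a multiple of n.
    """
    return (seq[i:i + n] for i in range(0, len(seq), n))
```

For example, list(chunks("some string", 3)) gives
['som', 'e s', 'tri', 'ng'], with no join step needed.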

> If you want to re-join them into strings, you can do so
> with a trivial wrapper:
>
> (''.join(elements) for elements in group("some string", 3, pad=' '))
>
> ought to do the trick, assuming group returns tuples or lists of
> characters.

This is too slow and verbose, and it kills the benefits of a grouping
function.


