
On 12/8/2013 7:16 AM, Steven D'Aprano wrote:
> On Sun, Dec 08, 2013 at 01:30:56PM +0200, Serhiy Storchaka wrote:
>> There is also a question about the result's type. Sometimes you need an iterator of subsequences (e.g. to split a string into equal-sized string chunks); sometimes an iterator of iterators is enough.
>
> None of the other itertools functions treat strings specially. Why should this one? If you want to re-join them into strings, you can do so with a trivial wrapper:
>
>     (''.join(elements) for elements in group("some string", 3, pad=' '))
A large fraction, perhaps over half, of the many requests for a chunker or grouper function are for sequences, not general iterables, as input, with the desired output type being the input type. For this, an iterator of *slices* is *far* more efficient. The same function could easily handle overlaps. (There are still the possible varieties of short-slice handling.)

*Untested*:

    def window(seq, size, advance=None, extra='skip'):
        '''Yield successive slices of len size of sequence seq.

        Move window advance items (default = size).
        Extra determines the handling of extra items.
        The options are 'skip' (default), 'keep', and 'raise'.
        '''
        if advance is None:
            advance = size
        i, j, n = 0, size, len(seq)
        while j <= n:
            yield seq[i:j]
            i += advance
            j += advance
        # True when the previous window end (j - advance) fell short of n.
        if j < n + advance:
            if extra == 'keep':
                yield seq[i:j]
            elif extra == 'raise':
                raise ValueError('extra items')
            elif extra != 'skip':
                raise ValueError('bad extra')

Having gotten this far, it would be possible to treat the above as a fast path for sequences, wrap it in try:except, and, if len or slicing fails, fall back to a general iterator version. The result could be a builtin rather than an itertool.

-- 
Terry Jan Reedy
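
For illustration only, here is one way the try:except fallback suggested above might look. This is a rough sketch, not a proposal for the actual API: the names chunked, _slices and _tuples are made up for this example, overlaps are left out, and a short final chunk is always kept. It assumes that an object supporting both len() and an empty slice is a sequence; anything else is treated as a general iterable and yields tuples.

```python
from itertools import islice


def _slices(seq, size):
    """Fast path: yield non-overlapping slices of a sequence."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]


def _tuples(iterable, size):
    """General path: yield tuples of up to `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, size))
        if not chunk:
            return
        yield chunk


def chunked(iterable, size):
    """Hypothetical wrapper: slices when possible, tuples otherwise."""
    if size < 1:
        raise ValueError('size must be positive')
    try:
        len(iterable)
        iterable[0:0]  # probe only; fails if slicing is unsupported
    except (TypeError, KeyError):
        # Iterators and sets raise TypeError; a dict may raise either,
        # depending on the Python version.
        return _tuples(iterable, size)
    return _slices(iterable, size)


print(list(chunked('abcdefg', 3)))
print(list(chunked(iter(range(5)), 2)))
```

With a string the chunks come back as strings, so no ''.join wrapper is needed; with an iterator they come back as tuples. Whether the output type should differ by input type like this is exactly the design question raised earlier in the thread.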