[Python-ideas] Looking for a "batch" function
Tal Einat
taleinat at gmail.com
Sat Jul 17 22:30:30 CEST 2010
On Sat, Jul 17, 2010 at 9:50 PM, Shane Hathaway wrote:
> On 07/17/2010 02:52 AM, Chris Rebert wrote:
>>
>> See the "grouper" recipe in itertools:
>> http://docs.python.org/library/itertools.html#recipes
>> It does almost exactly what you want:
>> grouper(3, 'ABCDEFG', 'x') --> ['A','B','C'], ['D','E','F'],
>> ['G','x','x']
>
> Interesting, but I have a few concerns with that answer:
>
> - It ignores the type of the container. If I provide a string as input, I
> expect an iterable of strings as output.
>
> - If I give a batch size of 1000000, grouper() is going to be rather
> inefficient. Even worse would be to allow users to specify the batch size.
>
> - Since grouper() is not actually in the standard library and it doesn't do
> quite what I need, it's rather unlikely that I'll use it.
>
> Another possible name for this functionality I am describing is packetize().
> Computers always packetize data for transmission, storage, and display to
> users. Packetizing seems like such a common need that I think it should be
> built in to Python.
This reminds me of discussions about a "flatten" function.
This kind of operation often has slightly different requirements in
different scenarios. It is very simple to implement a version of this
to meet your exact needs. Sometimes in these kinds of situations it is
better not to have a built-in generic function, to force programmers
to decide explicitly how they want it to work.
You mentioned efficiency; to do this kind of operation efficiently
one really needs to know what kind of sequence/iterator is being
"packetized", and implement accordingly.
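As a rough sketch of what "implement accordingly" can mean, here are two
variants. The names batch_sequence and batch_iterable are made up for
illustration; neither is in the standard library or the itertools recipes.
The first assumes a sliceable sequence, so a string input gives string
batches. The second assumes only a generic iterable and builds a list per
batch without padding the last one.

```python
from itertools import islice


def batch_sequence(seq, size):
    """Yield consecutive slices of a sequence (hypothetical helper).

    Slicing preserves the container type: str in, str out.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def batch_iterable(iterable, size):
    """Yield lists of up to `size` items from any iterable (hypothetical helper).

    Works for generators and files, but each batch is materialized as a list.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


print(list(batch_sequence('ABCDEFG', 3)))
print(list(batch_iterable(iter(range(7)), 3)))
```

Which of the two is appropriate, and whether the short final batch should
be padded, dropped, or kept, is exactly the kind of decision that differs
between use cases.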
- Tal Einat