If you understand what iterators do, the fact that itertools.groupby
collects contiguous elements is both obvious and necessary. Iterators
might be infinitely long... you cannot ask for every "A" that might
eventually occur in an infinite sequence of letters.
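
To make that concrete, here is a minimal sketch (the `letters` stream and the choice of four runs are only illustrative assumptions). Because groupby only ever looks at the current run, it can work lazily on an endless iterator:

```python
from itertools import cycle, groupby, islice

# An endless stream of letters: A A B A A B A A B ...
letters = cycle("AABAAB")

# groupby yields one contiguous run at a time, so we can take the
# first four runs without ever reaching the "end" of the stream.
first_runs = [(key, len(list(run))) for key, run in islice(groupby(letters), 4)]

print(first_runs)  # [('A', 2), ('B', 1), ('A', 2), ('B', 1)]
```

Collecting every "A" in the stream into a single group would instead require reading the whole iterator first, which never terminates here.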
On Sat, Jun 10, 2017 at 10:08 PM, Neal Fultz wrote:
Agreed to a degree about providing it as code, but it may also be worth mentioning that zlib itself implements RLE [1], and if there were ever a desire to go "Python all the way down", you would need an RLE somewhere anyway :)
That said, I'll be pretty happy with anything that replaces an hour of google/coding/testing/(hour later find out I'm an idiot from a random listserv) with 1 minute of googling. Again, my issue isn't that it was difficult to code, but it *was* hard to make the research-y jump from googling for "run length encoding python", where I knew *exactly* what algorithm I wanted, to "itertools.groupby" which appears to be more general purpose and needs a little tweaking. Adjusting the docs/recipes would probably solve that problem.
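
For reference, the "little tweaking" might look roughly like the sketch below. The names `rle_encode` and `rle_decode` are hypothetical helpers, not anything in the standard library; only `groupby`, `chain.from_iterable` and `repeat` are real itertools calls.

```python
from itertools import chain, groupby, repeat

def rle_encode(iterable):
    """Return a list of (value, run_length) pairs, one per contiguous run."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(iterable)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into a flat iterator of values."""
    return chain.from_iterable(repeat(value, count) for value, count in pairs)

encoded = rle_encode("AAABBCAA")
print(encoded)  # [('A', 3), ('B', 2), ('C', 1), ('A', 2)]

decoded = "".join(rle_decode(encoded))
print(decoded)  # AAABBCAA
```

Note that the trailing "AA" becomes its own pair rather than being merged with the leading run, which is exactly what RLE needs.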
-- To me this is roughly on the same level as googling for 'binary search python' and not having bisect show up.
However, the fact that `itertools.groupby` doesn't group over elements that are not contiguous is a bit surprising to me coming from SQL/pandas/R land (that is probably a large part of my disconnect here). This is actually explicitly called out in the current docs, but I wonder how many people search for one thing and find the other:
I googled for RLE and the solution was actually groupby, but probably a lot of other people who want a SQL-style group-by accidentally get an RLE and have to work around that... Then again, I don't know if you all can easily change the names of functions at this point.
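
A small sketch of the difference, using made-up sample data: plain groupby gives runs, while a SQL-style grouping needs either a sort first or a dict that accumulates by key.

```python
from collections import defaultdict
from itertools import groupby

data = "AABAAB"  # illustrative input only

# groupby on unsorted input: one group per contiguous run (RLE-like).
runs = [(key, len(list(group))) for key, group in groupby(data)]
print(runs)  # [('A', 2), ('B', 1), ('A', 2), ('B', 1)]

# Sorting first makes equal keys adjacent, which gives SQL-style groups.
sql_style = [(key, len(list(group))) for key, group in groupby(sorted(data))]
print(sql_style)  # [('A', 4), ('B', 2)]

# A dict-based grouping avoids the sort and works in a single pass.
counts = defaultdict(int)
for letter in data:
    counts[letter] += 1
print(dict(counts))  # {'A': 4, 'B': 2}
```

Both workarounds need the whole input in memory, so neither applies to an unbounded iterator.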
-Neal
[1] https://github.com/madler/zlib/blob/master/deflate.c#L2057
On Sat, Jun 10, 2017 at 9:39 PM, Greg Ewing wrote:
In my experience, RLE isn't something you often find on its own. Usually it's used as part of some compression scheme that also has ways of encoding verbatim runs of data and maybe other things.
So I'm skeptical that it can be usefully provided as a library function. It seems more like a design pattern than something you can capture in a library.
-- Greg
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.