[Python-ideas] Run length encoding

David Mertz mertz at gnosis.cx
Sun Jun 11 01:19:00 EDT 2017


If you understand what iterators do, the fact that itertools.groupby
collects contiguous elements is both obvious and necessary.  Iterators
might be infinitely long... you cannot ask for every "A" that might
eventually occur in an infinite sequence of letters.
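
A minimal sketch of that point, using only the standard library: because
groupby looks at one contiguous run at a time, it can work lazily on a
stream that never ends. The helper name `endless_letters` is made up for
illustration.

```python
from itertools import cycle, groupby, islice

def endless_letters():
    # Hypothetical infinite source: A A A B B A A A B B ...
    return cycle("AAABB")

# Take the first four runs; the source itself is never exhausted.
# Each group is consumed (via list) before groupby advances to the next.
runs = [(key, len(list(group)))
        for key, group in islice(groupby(endless_letters()), 4)]
print(runs)  # [('A', 3), ('B', 2), ('A', 3), ('B', 2)]
```

A grouping that collected every "A" in the stream could never return its
first result here.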

On Sat, Jun 10, 2017 at 10:08 PM, Neal Fultz <nfultz at gmail.com> wrote:

> Agreed to a degree about providing it as code, but it may also be worth
> mentioning that zlib itself implements RLE [1], and if there were ever
> a desire to go "Python all the way down" you would need an RLE somewhere
> anyway :)
>
> That said, I'll be pretty happy with anything that replaces an hour of
> google/coding/testing/(hour later find out I'm an idiot from a random
> listserv) with 1 minute of googling.  Again, my issue isn't that it was
> difficult to code, but it *was* hard to make the research-y jump from
> googling for "run length encoding python", where I knew *exactly* what
> algorithm I wanted, to "itertools.groupby", which appears to be more
> general purpose and needs a little tweaking.  Adjusting the docs/recipes
> would probably solve that problem.
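
For reference, the "little tweaking" might look roughly like the sketch
below. The function names `rle_encode` and `rle_decode` are illustrative
only; they are not part of itertools or any existing recipe.

```python
from itertools import chain, groupby, repeat

def rle_encode(iterable):
    # Each contiguous run becomes a (value, run_length) pair.
    return [(key, sum(1 for _ in group)) for key, group in groupby(iterable)]

def rle_decode(pairs):
    # Expand each (value, run_length) pair back into a run of values.
    return chain.from_iterable(repeat(value, count) for value, count in pairs)

encoded = rle_encode("AAAABBBCCDAA")
decoded = "".join(rle_decode(encoded))
print(encoded)  # [('A', 4), ('B', 3), ('C', 2), ('D', 1), ('A', 2)]
print(decoded)  # AAAABBBCCDAA
```

Note that "A" appears twice in the encoded output, once per run, which is
exactly what an RLE needs.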
>
>  -- To me this is roughly on the same level as googling for 'binary search
> python' and not having bisect show up.
>
> However, the fact that `itertools.groupby` doesn't group over elements
> that are not contiguous is a bit surprising to me coming from SQL/pandas/R
> land (that is probably a large part of my disconnect here). This is
> actually explicitly called out in the current docs, but I wonder how many
> people search for one thing and find the other:
>
> I googled for RLE and the solution was actually groupby, but probably a
> lot of other people who want a SQL group-by accidentally get an RLE and
> have to work around that... Then again, I don't know if you all can easily
> change names of functions at this point.
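
A small sketch of the difference, and of two common workarounds for
getting SQL-style grouping on a finite input. The sample data and
variable names are assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import groupby

letters = "ABAACB"

# groupby alone: contiguous runs only, so "A" and "B" each appear twice.
runs = [(key, len(list(group))) for key, group in groupby(letters)]

# Workaround 1: sort first so that equal elements become contiguous.
sorted_counts = [(key, len(list(group)))
                 for key, group in groupby(sorted(letters))]

# Workaround 2: accumulate into a dict; no sort, but the whole
# (finite) input must be consumed before any result is complete.
counts = defaultdict(int)
for letter in letters:
    counts[letter] += 1

print(runs)           # [('A', 1), ('B', 1), ('A', 2), ('C', 1), ('B', 1)]
print(sorted_counts)  # [('A', 3), ('B', 2), ('C', 1)]
print(dict(counts))   # {'A': 3, 'B': 2, 'C': 1}
```

Both workarounds have to see the entire input, which is why neither can
be the default behavior for an iterator-based tool.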
>
> -Neal
>
> [1] https://github.com/madler/zlib/blob/master/deflate.c#L2057
>
>
>
> On Sat, Jun 10, 2017 at 9:39 PM, Greg Ewing <greg.ewing at canterbury.ac.nz>
> wrote:
>
>> In my experience, RLE isn't something you often find on its own.
>> Usually it's used as part of some compression scheme that also
>> has ways of encoding verbatim runs of data and maybe other
>> things.
>>
>> So I'm skeptical that it can be usefully provided as a library
>> function. It seems more like a design pattern than something
>> you can capture in a library.
>>
>> --
>> Greg
>>
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>
>


-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.