On 19/06/17 02:47, David Mertz wrote:
As an only semi-joke, I have created a module on GH that meets the needs of this discussion (using the spellings I think are most elegant):
It's a shame you have to build that list when encoding. I tried to work out a way to get the number of items in an iterable without having to capture all the values (on the understanding that if the iterable is already an iterator, it would be consumed).
The best I came up with so far (not general purpose, but it works in this scenario) is:
from itertools import groupby
from operator import countOf

def rle_encode(it):
    # countOf(g, k) counts the members of group g that compare equal to the key k
    return ((k, countOf(g, k)) for k, g in groupby(it))
In your test code, this speeds things up quite a bit over building the list, but that's presumably only because both groupby() and countOf() use the standard comparison operator methods, which in the case of ints will short-circuit with a C-level pointer comparison first.
For user-defined classes with complicated comparison methods, getting the length of the group by comparing the items will probably be worse.
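To make that concern concrete, here is a rough sketch that counts how often __eq__ is invoked by each approach. The Noisy class and the helper names (rle_list, rle_countof, eq_calls_for) are hypothetical, made up purely for illustration. The exact call counts depend on groupby() internals, so only the relative difference is meaningful.

```python
from itertools import groupby
from operator import countOf


class Noisy:
    """Hypothetical value type whose __eq__ counts how often it is called."""
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Noisy.eq_calls += 1
        return isinstance(other, Noisy) and self.value == other.value


def rle_list(it):
    # List-building version: the group length comes from len().
    return ((k, len(list(g))) for k, g in groupby(it))


def rle_countof(it):
    # countOf version: each group member is compared against the key again.
    return ((k, countOf(g, k)) for k, g in groupby(it))


def eq_calls_for(encoder):
    # Distinct objects, so the identity shortcut only helps for the
    # first member of each group (which is the key object itself).
    data = [Noisy(v) for v in (1, 1, 1, 2, 2)]
    Noisy.eq_calls = 0
    counts = [n for _, n in encoder(data)]
    return counts, Noisy.eq_calls


list_counts, list_calls = eq_calls_for(rle_list)
countof_counts, countof_calls = eq_calls_for(rle_countof)
print(list_counts, list_calls)
print(countof_counts, countof_calls)
```

Both versions report the same run lengths, but the countOf version makes additional __eq__ calls on top of those groupby() already makes, which is where an expensive comparison method would hurt.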
Is there a better way of implementing a general-purpose "ilen()"? I tried a couple of other things, but they all required at least one lambda function and slowed things down by about 50% compared to the list-building version.
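One lambda-free candidate, offered only as a sketch: pair the iterable with an itertools.count() and drain the zip() into a zero-length deque, so nothing is stored and no comparisons are made. The names ilen and rle_encode_ilen are assumptions for illustration, and no timing against the test code in this thread is claimed.

```python
from collections import deque
from itertools import count, groupby


def ilen(it):
    """Return the number of items in an iterable, consuming it if it is an iterator."""
    counter = count()
    # zip() pulls from `it` first, so `counter` advances once per item only.
    # A deque with maxlen=0 discards everything it is fed.
    deque(zip(it, counter), maxlen=0)
    return next(counter)


def rle_encode_ilen(it):
    # Same shape as the countOf version, but counts without comparing items.
    return ((k, ilen(g)) for k, g in groupby(it))


print(list(rle_encode_ilen("aaabccdddd")))
```

Because ilen() never compares the items, it should not care how costly a class's comparison methods are, unlike the countOf() approach.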
(I agree this is sort of a joke, but it's still an interesting puzzle ...).