On 11 June 2017 at 13:27, Joshua Morton email@example.com wrote:
David: You're absolutely right, s/2/3 in my prior post!
Neal: As for why zip (at first I thought you meant the zip function, not the zip compression scheme) is included and rle is not: zip is (or was), I believe, used as part of Python's packaging infrastructure. Hopefully someone else can correct me if that's untrue.
There are a variety of reasons why things end up in the standard library:
"zip" and the other compression libraries check a couple of those boxes: they're broadly useful and they're needed in other parts of the standard library (e.g. lots of network protocols include compression support, we support importing from zip archives, and we support generating them through distutils, shutil, and zipapp)
Run-length encoding, on the other hand, is one of those things where the algorithm is pretty simple, and you're almost always going to be able to get the best results by creating an implementation tailored specifically to your use case, rather than working through a general purpose abstraction like the iterator protocol. Even when that isn't the case, implementing your own utility function is still going to be competitive time-wise with finding and using a standard implementation.
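To illustrate how little code a tailored version needs, here is a minimal sketch of a hand-written encoder that only handles str input. The name rle_string and the choice of returning a list of (char, count) pairs are assumptions made for this example, not anything proposed in the thread:

```python
def rle_string(text):
    """Run-length encode a str into a list of (char, count) pairs."""
    runs = []
    if not text:
        return runs
    # Track the character of the current run and how long it is so far.
    current, count = text[0], 1
    for ch in text[1:]:
        if ch == current:
            count += 1
        else:
            # The run ended: record it and start a new one.
            runs.append((current, count))
            current, count = ch, 1
    # Record the final run, which the loop never closes.
    runs.append((current, count))
    return runs


print(rle_string("aaabccdd"))  # [('a', 3), ('b', 1), ('c', 2), ('d', 2)]
```

A version like this can be adapted freely (for example, to emit bytes or to cap run lengths), which is the kind of use-case-specific tuning a generic iterator-based helper can't easily offer.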
I suspect the main long term value of offering a recommended implementation as an itertools recipe wouldn't be in using it directly, but rather in making it easier for people to:
The one itertools building block I do sometimes wish we had was a counted_len helper that:
* used len(obj) if the given object implemented __len__
* fell back to sum(1 for __ in obj) otherwise
Otherwise you have to make the choice between:
* len(obj), and only support sequences
* len(list(obj)) and potentially make a pointless copy
* sum(1 for __ in obj) and ignore the possible O(1) fast path
* writing your own helper:
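    def counted_len(iterable):
        # Use the O(1) fast path when the object supports len()
        try:
            return len(iterable)
        except TypeError:
            pass
        # Otherwise consume the iterable and count the items
        return sum(1 for __ in iter(iterable))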
If there was an itertools.counted_len function then the obvious option would be "use itertools.counted_len". Such a function would also become the obvious way to consume an iterator when you don't care about the individual results - you'd just process them all, and get the count of how many iterations happened.
Given such a helper, the recipe for run-length-encoding would then be:
    def run_length_encode(iterable):
        return ((k, counted_len(g)) for k, g in groupby(iterable))
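For anyone wanting to try this out, here is a self-contained sketch that puts the pieces together. Note that counted_len is not an actual itertools function; it is defined locally here using the try/except approach described above:

```python
from itertools import groupby


def counted_len(iterable):
    # Hypothetical helper (not part of itertools): prefer len() when available.
    try:
        return len(iterable)
    except TypeError:
        pass
    # Fall back to consuming the iterable and counting its items.
    return sum(1 for __ in iter(iterable))


def run_length_encode(iterable):
    # Each groupby group is an iterator without __len__, so the
    # counting fallback is what actually runs for each group here.
    return ((k, counted_len(g)) for k, g in groupby(iterable))


print(list(run_length_encode("aaabccdd")))  # [('a', 3), ('b', 1), ('c', 2), ('d', 2)]
print(counted_len([1, 2, 3]))               # 3, via len()
print(counted_len(x for x in range(5)))     # 5, via the counting fallback
```

Since run_length_encode returns a lazy generator, each group is counted (and therefore consumed) before groupby advances to the next key, which is what keeps the counts correct.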
-- Nick Coghlan | firstname.lastname@example.org | Brisbane, Australia