<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Mar 14, 2017 at 2:38 AM, Marco Cognetta <span dir="ltr">&lt;<a href="mailto:cognetta.marco@gmail.com" target="_blank">cognetta.marco@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">1) Addition of a Counter.least_common method:<br>This was addressed in <a href="https://bugs.python.org/issue16994" rel="noreferrer" target="_blank">https://bugs.python.org/<wbr>issue16994</a>, but it was<br>
never resolved and is still open (since Jan. 2013). This is a small<br>
change, but I think that it is useful to include in the stdlib.</blockquote><div><br></div><div>-1 on adding this. I read the issue and did not find a convincing use case that is common enough to merit a new method. As some people noted in the issue, the "least common" elements are really the infinitely many keys not in the collection at all.</div><div><br></div><div>But I can imagine an occasional need to, e.g., "find outliers." However, that is not hard to spell as `mycounter.most_common()[-1*N:]`. Or, if your program does this often, write a utility function `find_outliers(...)`.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">2) Undefined behavior when using Counter.most_common:<br>'c', 'c']), when calling c.most_common(3), there are more than 3 "most<br>
common" elements in c and c.most_common(3) will not always return the<br>
same list, since there is no defined total order on the elements in c.<br></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Should this be mentioned in the documentation?<br></blockquote><div><br></div><div>+1. I'd definitely support adding this point to the documentation.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Additionally, perhaps there is room for a method that produces all of<br>
the elements with the n highest frequencies in order of their<br>
frequencies. For example, in the case of c = Counter([1, 1, 1, 2, 2,<br>
3, 3, 4, 4, 5]) c.aforementioned_method(2) would return [(1, 3), (2,<br>
2), (3, 2), (4, 2)] since the two highest frequencies are 3 and 2.<br></blockquote><div><br></div><div>-0 on this. I can see wanting this, but I'm not sure it comes up often enough to add to the promises of the class. The utility function to do this would be somewhat less trivial to write than `find_outliers(...)` but not enormously hard. I think I'd be +0 on adding a recipe to the documentation for a utility function.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">3) Addition of a collections.Frequency or collections.Proportion class<br>
derived from collections.Counter:<br>
<br>
This is sort of discussed in <a href="https://bugs.python.org/issue25478" rel="noreferrer" target="_blank">https://bugs.python.org/<wbr>issue25478</a>.<br>
The idea behind this would be a dictionary that, instead of returning<br>
the integer frequency of an element, would return its proportional<br>
representation in the iterable.</blockquote><br>One could write a subclass easily enough. The essential feature in my mind would be to keep an attribute Counter.total around to perform the normalization. I'm +1 on adding that to collections.Counter itself.<br><br>I'm not sure if this would be better as an attribute kept directly or as a property that calls `sum(self.values())` when accessed. I believe that having `mycounter.total` would provide the right normalization in a clean API, and also expose easy access to other questions one would naturally ask (e.g. "How many observations were made?")<div> </div></div><br clear="all"><div><br></div>-- <br><div class="gmail_signature">Keeping medicines from the bloodstreams of the sick; food <br>from the bellies of the hungry; books from the hands of the <br>uneducated; technology from the underdeveloped; and putting <br>advocates of freedom in prisons. Intellectual property is<br>to the 21st century what the slave trade was to the 16th.<br></div>
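<div>P.S. A rough sketch of the property variant of `total` discussed above, written as a subclass. The names `TotalCounter` and `proportion` are my own inventions for illustration, not anything in collections, and this is only one of the two options (computed on access rather than stored):</div>

```python
from collections import Counter


class TotalCounter(Counter):
    """Hypothetical Counter subclass exposing a `total` property."""

    @property
    def total(self):
        # Recomputed on every access, so it cannot go stale after updates.
        return sum(self.values())

    def proportion(self, key):
        # Share of all observations accounted for by `key` (0.0 if empty).
        total = self.total
        return self[key] / total if total else 0.0


c = TotalCounter([1, 1, 1, 2, 2, 3, 3, 4, 4, 5])
print(c.total)          # 10
print(c.proportion(1))  # 0.3
```

<div>The trade-off is that each access costs a pass over the values; a stored attribute would be cheaper to read but would have to be kept in sync by every mutating method.</div>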
</div></div>
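<div>P.P.S. For points 1 and 2, here is roughly what I have in mind for the utility functions. Both are sketches; `find_outliers` is the name used above, while `with_top_frequencies` is a made-up name for the "n highest frequencies" idea. The order among elements with equal counts is whatever `most_common()` happens to produce, which is exactly the caveat worth documenting:</div>

```python
from collections import Counter


def find_outliers(counter, n):
    # The n least common (element, count) pairs, spelled as in the email.
    # n == 0 is special-cased because [-0:] would return the whole list.
    return counter.most_common()[-1 * n:] if n else []


def with_top_frequencies(counter, n):
    # Every (element, count) pair whose count is one of the n highest
    # distinct counts, in most_common() order.
    top = set(sorted(set(counter.values()), reverse=True)[:n])
    return [pair for pair in counter.most_common() if pair[1] in top]


c = Counter([1, 1, 1, 2, 2, 3, 3, 4, 4, 5])
print(find_outliers(c, 1))         # [(5, 1)]
print(with_top_frequencies(c, 2))  # [(1, 3), (2, 2), (3, 2), (4, 2)]
```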