[Python-ideas] Support floating-point values in collections.Counter
Joel Croteau
jcroteau at gmail.com
Tue Dec 19 22:09:07 EST 2017
Well here is some code I wrote recently to build a histogram over a
weighted graph, before becoming aware that Counter existed (score is a
float here):
from collections import defaultdict

import numpy as np

total_score_by_depth = defaultdict(float)
total_items_by_depth = defaultdict(int)
num_nodes_by_score = defaultdict(int)
num_nodes_by_log_score = defaultdict(int)
num_edges_by_score = defaultdict(int)
for state in iter_graph_components():
    try:
        # There is probably some overlap here
        ak = state['ak']
        _, c = ak.score_paths(max_depth=15)
        for edge in state['graph'].edges:
            num_edges_by_score[np.ceil(20.0 * edge.score) / 20.0] += 1
        for node in c.nodes:
            total_score_by_depth[node.depth] += node.score
            total_items_by_depth[node.depth] += 1
            num_nodes_by_score[np.ceil(20.0 * node.score) / 20.0] += 1
            num_nodes_by_log_score[np.ceil(-np.log10(node.score))] += 1
        num_nodes_by_score[0.0] += len(state['graph'].nodes) - len(c.nodes)
        num_nodes_by_log_score[100.0] += len(state['graph'].nodes) - len(c.nodes)
    except MemoryError:
        print("Skipped massive.")
Without going too much into what this does, note that I could replace the
other defaultdicts with Counters, but I can't do the same with
total_score_by_depth, at least not without violating the documented API. I
would suggest that, with a name like Counter, treating the class as a
counter should be the more common use case. If it's meant to be a multiset,
we should call it a Multiset. Here is an example from Stack Overflow of
someone else also wanting a float counter, where the only suggestion was to
use defaultdict:
https://stackoverflow.com/questions/10900207/any-way-to-tackle-float-counter-values-in-python
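To make the distinction concrete, here is a minimal sketch of the pattern, using made-up (depth, score) pairs rather than the real graph. The names samples and mean_score_by_depth are hypothetical and not part of the code above. The integer tally fits Counter as documented, while the float total stays in defaultdict(float):

```python
from collections import Counter, defaultdict

# Hypothetical (depth, score) pairs standing in for the graph nodes.
samples = [(1, 0.5), (1, 0.25), (2, 0.125), (2, 0.125), (2, 0.5)]

total_score_by_depth = defaultdict(float)  # float totals per depth
total_items_by_depth = Counter()           # integer counts per depth

for depth, score in samples:
    total_score_by_depth[depth] += score
    total_items_by_depth[depth] += 1

# Average score at each depth, combining the two mappings.
mean_score_by_depth = {
    depth: total_score_by_depth[depth] / total_items_by_depth[depth]
    for depth in total_score_by_depth
}
print(mean_score_by_depth)
```

Both mappings are filled by the same kind of += update; the only difference is whether the accumulated value is an int or a float.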
On Tue, Dec 19, 2017 at 3:08 AM Paul Moore <p.f.moore at gmail.com> wrote:
> On 18 December 2017 at 23:51, Joel Croteau <jcroteau at gmail.com> wrote:
> > It would be useful in many scenarios for values in collections.Counter
> > to be allowed to be floating point.
>
> Do you have any evidence of this? Code examples that would be
> significantly improved by such a change? I can't think of any myself.
>
> I might consider writing
>
>     totals = defaultdict(float)
>     for ...:
>         totals[something] = calculation(something)
>
> but using a counter is neither noticeably easier, nor clearer...
>
> One way of demonstrating such a need would be if your proposed
> behaviour were available on PyPI and getting used a lot - I'm not
> aware of any such module if it is.
>
> Paul
>