collections.Counter surprisingly slow

Joshua Landau joshua at
Mon Jul 29 13:49:53 CEST 2013

On 29 July 2013 07:25, Serhiy Storchaka <storchaka at> wrote:

> 28.07.13 22:59, Roy Smith wrote:
>> The input is an 8.8 Mbyte file containing about 570,000 lines (11,000
>> unique strings).
> Repeat your tests with totally unique lines.

Counter is about ½ the speed of defaultdict in that case (as opposed to ⅓).

>> The full profiler dump is at the end of this message, but the gist of
>> it is:
> The profiler affects execution time. In particular, it slows down the
> Counter implementation, which uses more function calls. For real-world
> measurements, use a different approach.

Having re-timed them, it seems that his original figures for defaultdict,
exception and Counter were about right. I haven't timed the other.
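
For anyone wanting to repeat the comparison without the profiler in the way,
here is a rough sketch using timeit. The function names, the synthetic data
and the sizes are my own stand-ins (the original input file isn't available
here), so treat the numbers it prints as indicative only:

```python
import timeit
from collections import Counter, defaultdict


def count_defaultdict(lines):
    # Missing keys start at int() == 0, then get incremented.
    counts = defaultdict(int)
    for line in lines:
        counts[line] += 1
    return counts


def count_exception(lines):
    # Plain dict; a KeyError marks the first sighting of a line.
    counts = {}
    for line in lines:
        try:
            counts[line] += 1
        except KeyError:
            counts[line] = 1
    return counts


def count_counter(lines):
    return Counter(lines)


def time_all(lines, repeat=3):
    """Return the best-of-`repeat` wall time in seconds for each approach."""
    results = {}
    for func in (count_defaultdict, count_exception, count_counter):
        timer = timeit.Timer(lambda: func(lines))
        results[func.__name__] = min(timer.repeat(repeat=repeat, number=1))
    return results


if __name__ == "__main__":
    # Assumed stand-ins for the real file: same line count, once with
    # about 11,000 distinct values and once with every line unique.
    repeated = [str(i % 11000) for i in range(570000)]
    unique = [str(i) for i in range(570000)]
    for label, data in (("repeated", repeated), ("unique", unique)):
        for name, seconds in sorted(time_all(data).items()):
            print("{:8s} {:18s} {:.3f}s".format(label, name, seconds))
```

Taking the minimum of a few repeats is the usual way to cut down on noise from
other processes; the ratios will still vary with the Python version.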

>> Why is count() [i.e. collections.Counter] so slow?
> Feel free to contribute a patch which fixes this "wart". Note that Counter
> shouldn't be slowed down on mostly unique data.

I find it hard to agree that Counter should be optimised for the
unique-data case, as surely it's much more oft used when there's a point to

Also, couldn't Counter just extend from defaultdict?
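
To make the question concrete, here is a minimal sketch of what I have in
mind. DefaultDictCounter is a hypothetical name, this is not how the stdlib
Counter is written, and it only covers counting plus most_common:

```python
from collections import defaultdict


class DefaultDictCounter(defaultdict):
    """Hypothetical counter built on defaultdict(int); not the stdlib Counter."""

    def __init__(self, iterable=()):
        super().__init__(int)
        self.update(iterable)

    def update(self, iterable):
        # Count each element; unlike dict.update, this does not merge mappings.
        for item in iterable:
            self[item] += 1

    def most_common(self, n=None):
        # Highest counts first; ties keep first-seen order (sorted is stable).
        ranked = sorted(self.items(), key=lambda pair: pair[1], reverse=True)
        return ranked if n is None else ranked[:n]


words = DefaultDictCounter(["spam", "eggs", "spam"])
print(words.most_common(1))
```

One difference to be aware of: looking up a missing key in a defaultdict
inserts it, whereas Counter returns 0 without storing anything, so the two
would not be drop-in replacements for each other.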