collections.Counter surprisingly slow

Serhiy Storchaka storchaka at gmail.com
Mon Jul 29 08:25:49 CEST 2013


28.07.13 22:59, Roy Smith wrote:
>   The input is an 8.8 Mbyte file containing about 570,000 lines (11,000
> unique strings).

Repeat your tests with totally unique lines.

> The full profiler dump is at the end of this message, but the gist of
> it is:

A profiler affects execution time. In particular, it slows down the
Counter implementation, which makes more function calls. For real-world
measurements, use a different approach.
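
A minimal sketch of one such approach, using timeit instead of the
profiler. The helper names and the generated input are hypothetical
stand-ins (the original file is not available here), and the
defaultdict loop is only an assumed baseline for comparison, not
necessarily the code from the original test. It times both a repetitive
input and a totally unique one:

```python
import timeit
from collections import Counter, defaultdict


def count_with_counter(lines):
    # Let Counter.update() consume the whole iterable in one call.
    counts = Counter()
    counts.update(lines)
    return counts


def count_with_defaultdict(lines):
    # Plain loop over a defaultdict(int), as an assumed baseline.
    counts = defaultdict(int)
    for line in lines:
        counts[line] += 1
    return counts


def make_lines(total, unique):
    # Hypothetical stand-in for the input file: `total` lines drawn
    # cyclically from `unique` distinct strings.
    return ["line %d" % (i % unique) for i in range(total)]


def best_time(func, lines, repeat=3):
    # Take the minimum of several single runs; no profiler is involved,
    # so per-call profiling overhead does not distort the comparison.
    return min(timeit.repeat(lambda: func(lines), number=1, repeat=repeat))


if __name__ == "__main__":
    total = 50000
    for label, unique in (("repetitive", 1000), ("all unique", total)):
        lines = make_lines(total, unique)
        for func in (count_with_counter, count_with_defaultdict):
            print("%-12s %-24s %.4f s"
                  % (label, func.__name__, best_time(func, lines)))
```

The absolute numbers depend on the Python version and the machine, so
only the relative difference between the two inputs is of interest.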

> Why is count() [i.e. collections.Counter] so slow?

Feel free to contribute a patch which fixes this "wart". Note that 
Counter shouldn't be slowed down on mostly unique data.




