[New-bugs-announce] [issue6370] Bad performance of collections.Counter at initialisation from an iterable
SilentGhost
report at bugs.python.org
Mon Jun 29 16:56:28 CEST 2009
New submission from SilentGhost <michael.mischurow+bpo at gmail.com>:
I'm comparing initialisation of Counter from an iterable with the
following function:
def unique(seq):
    """Dict of unique values (keys) & their counts in original sequence"""
    out_dict = dict.fromkeys(set(seq), 0)
    for i in seq:
        out_dict[i] += 1
    return out_dict
iterable = list(range(43)) + list(range(43, 0, -1))
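A minimal sketch of how such a comparison could be timed. The repetition count and the use of timeit.timeit with lambdas are assumptions made for illustration, not the exact invocation behind the figures in this report, and absolute numbers will vary by machine and Python version.

```python
import timeit
from collections import Counter


def unique(seq):
    """Dict of unique values (keys) & their counts in original sequence"""
    out_dict = dict.fromkeys(set(seq), 0)
    for i in seq:
        out_dict[i] += 1
    return out_dict


iterable = list(range(43)) + list(range(43, 0, -1))

# Both approaches must agree on the counts before the timings mean anything.
assert unique(iterable) == dict(Counter(iterable))

number = 1000  # arbitrary repetition count, chosen only for illustration
t_unique = timeit.timeit(lambda: unique(iterable), number=number)
t_counter = timeit.timeit(lambda: Counter(iterable), number=number)

print("unique():  %.4f s" % t_unique)
print("Counter(): %.4f s" % t_counter)
```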
According to timeit, Counter takes about four (4) times as long as
unique() on this input. As is obvious from comparing my function with
lines 429-430 of collections.py, the only difference is that my function
preallocates the final dictionary. When line 430 of collections.py is
replaced with:
    self[elem] = self.get(elem, 0) + 1
I get about a 25% improvement in run time (I assume because __missing__
is bypassed). I hope the implementation can be improved even further.
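The two counting loops can be compared in isolation with a sketch like the one below. MissingCounter and GetCounter are hypothetical stand-ins, not the code in collections.py. The sketch assumes the original loop body is `self[elem] += 1` relying on a `__missing__` that returns 0, which this report does not quote.

```python
class MissingCounter(dict):
    """Hypothetical stand-in: a missing key is resolved through __missing__."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        return 0

    def count(self, iterable):
        for elem in iterable:
            # Assumed form of the original loop body (not quoted in the report).
            self[elem] += 1


class GetCounter(dict):
    """Hypothetical stand-in: uses dict.get, so __missing__ is never needed."""

    def count(self, iterable):
        for elem in iterable:
            # The replacement line proposed in this report.
            self[elem] = self.get(elem, 0) + 1


iterable = list(range(43)) + list(range(43, 0, -1))

a = MissingCounter()
a.count(iterable)
b = GetCounter()
b.count(iterable)

# Both variants produce identical counts; only the lookup path differs.
assert dict(a) == dict(b)
```

Timing a.count and b.count separately with timeit would show whether the lookup path alone accounts for the difference on a given interpreter.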
----------
components: Library (Lib)
messages: 89846
nosy: SilentGhost
severity: normal
status: open
title: Bad performance of collections.Counter at initialisation from an iterable
type: performance
versions: Python 3.1
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6370>
_______________________________________