[issue2607] why is (default)dict so slow on tuples???

Fri Apr 11 04:46:15 CEST 2008

Andreas Eisele <eisele at dfki.de> added the comment:

Sorry for not giving a good example in the first place.
The problem seems to appear only in the presence of
sufficiently many distinct tuples. With enough of them,
the total insertion time grows roughly like O(n*n).
Here is an example that shows the problem:

>>> from time import clock
>>> d = {}
>>> t0 = clock()
>>> for i in range(5):
...     for j in range(i*1000000, (i+1)*1000000):
...         d[str(j), str(j)] = j
...     print clock() - t0
...
13.04
39.51
81.86
134.18
206.66
>>>

The same example with str(j)+str(j) as the key (a single
string instead of a tuple) works fine.
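
For anyone who wants to compare the two key shapes on a
current interpreter, here is a minimal sketch of the same
measurement for Python 3. It is not the original script:
time.clock no longer exists there (it was removed in
Python 3.8), so the sketch uses time.perf_counter, and the
helper names (time_inserts, tuple_key, concat_key) and the
smaller default batch size are assumptions made for
illustration. It only measures; it does not explain the
slowdown.

```python
from time import perf_counter


def time_inserts(make_key, batches=5, batch_size=100_000):
    """Insert batches * batch_size keys into a fresh dict.

    Returns the dict and the cumulative elapsed time (in seconds)
    recorded after each batch, mirroring the printout in the report.
    """
    d = {}
    elapsed = []
    t0 = perf_counter()
    for i in range(batches):
        for j in range(i * batch_size, (i + 1) * batch_size):
            d[make_key(j)] = j
        elapsed.append(perf_counter() - t0)
    return d, elapsed


def tuple_key(j):
    # Key shape from the slow case: a tuple of two strings.
    return (str(j), str(j))


def concat_key(j):
    # Key shape from the fast case: one concatenated string.
    return str(j) + str(j)


if __name__ == "__main__":
    for name, make_key in [("tuple", tuple_key), ("concat", concat_key)]:
        _, elapsed = time_inserts(make_key)
        print(name, " ".join("%.2f" % t for t in elapsed))
```

Raising batch_size to 1000000 matches the scale of the
session above. The timings depend on the interpreter
version and the machine, so they may not reproduce the
numbers reported here.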

Sorry if this turns out to be a non-issue. For me it is a
reason to implement in C or Perl functionality
that I would really love to write in Python.
I would call such a thing a performance bug, but
maybe I'm just too demanding...

Best regards,
Andreas

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2607>
__________________________________