Speed of string += string

Alex Martelli aleax at aleax.it
Sun Apr 13 18:15:59 CEST 2003

Martin v. Löwis wrote:

> Alex Martelli <aleax at aleax.it> writes:
>> The crucial difference is that Python dicts are hashtables, while the
>> std::map in the standard library of C++ can't be, because it's based
>> on an order relationship, NOT on hashing.  So, std::map has an order,
>> but its O(...) performance just can't be as good as a dict's.
> Nit-pickers (like myself) will point out that the complexity of a
> hash-table depends on the distribution of the hash function (in
> addition to depending on the complexity of the hash function).  With a

"To a nitpicker, a nitpicker squared": nothing stops you (in C++) from
writing an operator< whose complexity is just as high as you can have
for a __hash__ method in Python.  In this sense, dependency on the
complexity of some auxiliary function is just the same in the two
languages (actually, the dependency is stronger in C++, since for each
insertion or memebership test operator< will typically be called O(log N)
times, while, in Python, for such an operation hash is called O(1) time).

> carefully-chosen hash O(1) function, all dictionary operations will be
> O(n). For std::map, O(log n) is the worst-case performance.

Yes, it's technically true that (once some fairy godmother ensures
that the auxiliary function is O(1)) worst-case behavior is potentially
worse for a hash table than it is for a sufficently smart order-based
container (and the C++ standard library does constrain its container
to be sufficiently smart in this sense).  I think Tim's post has already
been quite eloquent on the relative care Python places on
worst-case-guaranteed vs typical-case-expected behavior.


More information about the Python-list mailing list