Auto sort in Dictionaries???

Steve Horne sh at ttsoftware.co.uk
Mon Jul 23 05:20:28 EDT 2001


On 22 Jul 2001 23:35:19 -0700, phillip at transwitch.co.za (phillip)
wrote:

>Hi,
>
>I put 'name,values' into a dictionary.
>But when I get the keys and display the values they come out in a
>different order that the order I inserted them in.
>
>Is there a way maintain the oringinal order?
>
>Phill

Not really

Internally, Python uses hashing to implement dictionaries. What this
means is that python objects - which may represent large values, such
as long strings or nested tuples - are translated to a representative
simple value to use as a subscript into an array. The advantage of
doing this is that it is a very efficient (and with a few catches)
very reliable way of handling dictionaries. Unfortunately, it is very
difficult to directly read the keys in order - the representative
simple value will not relate in any sensible way to the ordering of
the values. This is deliberate in hashing - if the keys you use for
your values are clustered around a few common values (which quite
often happens in real applications), you still want them to be
distributed evenly in the hash table or things start getting very
awkward very quickly.

You can get the keys in order if you want - just use the sort
function. This is a little annoying, but given that Python would have
to do the same thing to return the keys in order - inefficient for
applications where that is not needed - it is better that it continues
to work the way it does.

Of course, hashing isn't the *only* way to implement dictionaries. It
is much better than the red-black trees (a tweak of binary trees) used
in the C++ STL for std::map etc - it avoids doing lots of slow string
comparisons for insertions and searches - but it could be argued that
ternary trees are better than hashing. Ternary trees can detect items
not found in the dictionary much faster than hashing, perform
comparably (maybe a little slower) for insertions, deletions and
searches that do find an item, and they *do* support in-order
iteration of the keys. Whether ternary trees use more memory than hash
tables depends very much on how the hashing is implemented.

Hashing is popular mainly for the same reason I treat it as suspicious
- effective hashing is more a matter of experience, judgement and
sometimes luck than of deterministic logic, which of course makes it
interesting to play with and write papers about.

Pythons hashing works well, though, so changing it for an alternative
that - in truth - has its downsides as well as benefits would be a bad
idea.

-- 
Steve Horne
Home : steve at lurking.demon.co.uk
Work : sh at ttsoftware.co.uk



More information about the Python-list mailing list