Sorted dictionary

Jan Kaliszewski zuo at chopin.edu.pl
Thu Jan 21 13:43:45 EST 2010


Dnia 21-01-2010 o 08:49:22 Chris Rebert <clp2 at rebertia.com> napisał(a):

> On Wed, Jan 20, 2010 at 5:50 PM, Steven D'Aprano
> <steven at remove.this.cybersource.com.au> wrote:
>> On Thu, 21 Jan 2010 02:02:02 +0100, Jan Kaliszewski wrote:

>>> http://code.activestate.com/recipes/576998/

>> What's the advantage of that over sorting the keys as needed?
>>
>>
>> E.g.
>>
>> for key in sorted(dict):
>>    print key
>>
>>
>> works.
>
> Well, it does spread out the cost of sorting the key list over each of
> the N distinct key assignments, versus sorting the entire list all at
> once, on-demand at the end of the process. And you avoid having to
> traverse the M buckets in the dictionary to locate the N keys in the
> first place. So it has different performance characteristics, which
> could theoretically matter depending on the use case; e.g. it could
> avoid a GUI application hanging for several seconds while it sorts a
> large list of keys.

Generally, I see 3 possible approaches:

1. What Steven wrote about: simply sort all keys when you need to get them  
sorted (it needs iteration over whole a dict).  In many cases it's good  
enough (so is the best, because of simplicity) -- e.g. when a dict is not  
very long (very common case), or when that sorting it is a rare operation.  
Sort always when a key is added/deleted, maintaining an auxiliary list of  
sorted keys (sufficient especially when you modify a dict rarely and get  
 from it often).

2. Sort all keys on-demand -- but only if a dict is marked as "modified".  
Mark a dict as "modified" when you add/delete any key; and mark a dict as  
"unmodified" when sorting. (Of course, it can be automatized, see e.g.:  
http://pypi.python.org/pypi/sorteddict). Nice, if your use-pattern is:  
add/del a lot, *then* only get/iter a lot...

3. Continually (at each set/del of a key) maintain auxiliary sorted list  
of keys -- using binary search (as I did), heap queue (see:  
http://pypi.python.org/pypi/HeapDict) or such an algorithm. I think this  
approach is the best when you have quite long dict, and you add/del fairly  
often and get/iter very often.

Regards,
*j

-- 
Jan Kaliszewski (zuo) <zuo at chopin.edu.pl>



More information about the Python-list mailing list