Sorted and reversed on huge dict ?

Fredrik Lundh fredrik at
Fri Nov 3 20:25:08 CET 2006

vd12005 at wrote:

> i would like to sort(ed) and reverse(d) the result of many huge
> dictionaries (a single dictionary will contain ~ 150000 entries). Keys
> are words, values are count (integer).

not sure 150k entries qualify as huge, though, unless you're short on 

> i'm wondering if i can have a 10s of these in memory, or if i should
> proceed one after the other.

why not just try it out?

> but moreover i'm interested in saving theses as values, keys sorted and
> reversed (ie most frequent first), i can do it with sort from unix
> command but i wonder how i should do it with python to be memory
> friendly.
> can it be done by using :
> from itertools import izip
> pairs = izip(d.itervalues(), d.iterkeys())
> for v, k in reversed(sorted(pairs)):
>     print k, v
> or will it be the same as building the whole list ?

sorted() needs access to all data, so it'll build a list, even if you 
feed it a generator.  you will have to test it yourself, but I suspect 
that the most memory-efficient way to do what you want could be:

     items = d.items()
     items.sort(key=operator.itemgetter(1), reverse=True)

the items list would require a couple of megabytes for 150k dictionary 
entries, or so.  the key map needs some memory too, but the rest of the 
sort is done in place.


More information about the Python-list mailing list