Sorted and reversed on huge dict ?

Paul Rubin
Fri Nov 3 14:21:51 EST 2006


vd12005 at yahoo.fr writes:
> i would like to sort(ed) and reverse(d) the result of many huge
> dictionaries (a single dictionary will contain ~ 150000 entries). Keys
> are words, values are count (integer).
> 
> i'm wondering if i can have a 10s of these in memory, 

Depends on how much memory your machine has.

> or if i should proceed one after the other.

Obviously that's more memory friendly if you can do it that way.

> from itertools import izip
> pairs = izip(d.itervalues(), d.iterkeys())
> for v, k in reversed(sorted(pairs)):
>     print k, v
> 
> or will it be the same as building the whole list ?

I think the above is pretty good.  sorted necessarily builds and
returns a full list, but itervalues/iterkeys, izip, and reversed just
build small iterator objects, so the sorted list is the only large
temporary.
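
For anyone reading this on Python 3, where izip, itervalues, and
iterkeys no longer exist, roughly the same idea might look like the
sketch below.  The name sorted_by_count and the sample dictionary are
made up for illustration; the one full-size temporary is still the
list that sorted builds.

```python
def sorted_by_count(d):
    """Yield (word, count) pairs from d, highest count first.

    Hypothetical helper for illustration, not from the original post.
    """
    # The generator expression is lazy; it builds no list of its own.
    pairs = ((count, word) for word, count in d.items())
    # sorted() is the step that materializes one list of all pairs.
    # reverse=True sorts in descending order directly.
    for count, word in sorted(pairs, reverse=True):
        yield word, count


counts = {"spam": 3, "eggs": 7, "ham": 3}  # made-up sample data
result = list(sorted_by_count(counts))
for word, count in result:
    print(word, count)
```

Since the keys of a dict are unique, every (count, word) tuple is
distinct, so sorting with reverse=True gives the same order as
reversed(sorted(pairs)).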

If the lists are really large, your next step after the above is
probably to use an external sort, but 150000 entries is not that many,
and anyway if sorting is a strain on available memory, then having the
dicts in memory at all will probably also be a strain.  Maybe you
should start looking into storing the dict contents externally, such
as in a database.
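
As one possible reading of "storing the dict contents externally",
here is a minimal sketch using the sqlite3 module from the standard
library.  The table name word_counts, the column names, and both
helper functions are assumptions made up for this example, and the
in-memory database is only there so the snippet runs as is; a real
file path would be needed to actually get the data out of RAM.

```python
import sqlite3


def store_counts(conn, counts):
    """Write a {word: count} dict into a table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS word_counts "
        "(word TEXT PRIMARY KEY, n INTEGER NOT NULL)"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO word_counts (word, n) VALUES (?, ?)",
            counts.items(),
        )


def iter_by_count(conn):
    """Yield (word, count) rows, highest count first, sorted by SQLite."""
    cur = conn.execute(
        "SELECT word, n FROM word_counts ORDER BY n DESC, word DESC"
    )
    for word, n in cur:
        yield word, n


# ":memory:" keeps the example self-contained; pass a file name instead
# to keep the data on disk.
conn = sqlite3.connect(":memory:")
store_counts(conn, {"spam": 3, "eggs": 7, "ham": 3})  # made-up sample data
rows = list(iter_by_count(conn))
for word, n in rows:
    print(word, n)
```

With this approach the sorting is done by the database, and the rows
are read back through a cursor rather than as one big Python list.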


