Sorted and reversed on huge dict ?
Fredrik Lundh
fredrik at pythonware.com
Fri Nov 3 14:25:08 EST 2006
vd12005 at yahoo.fr wrote:
> i would like to sort(ed) and reverse(d) the result of many huge
> dictionaries (a single dictionary will contain ~ 150000 entries). Keys
> are words, values are count (integer).
not sure 150k entries qualify as huge, though, unless you're short on
memory.
> i'm wondering if i can have a 10s of these in memory, or if i should
> proceed one after the other.
why not just try it out?
> but moreover i'm interested in saving theses as values, keys sorted and
> reversed (ie most frequent first), i can do it with sort from unix
> command but i wonder how i should do it with python to be memory
> friendly.
>
> can it be done by using :
>
> from itertools import izip
> pairs = izip(d.itervalues(), d.iterkeys())
> for v, k in reversed(sorted(pairs)):
> print k, v
>
> or will it be the same as building the whole list ?
sorted() needs access to all data, so it'll build a list, even if you
feed it a generator. you will have to test it yourself, but I suspect
that the most memory-efficient way to do what you want could be:
items = d.items()
items.sort(key=operator.itemgetter(1), reverse=True)
the items list would require a couple of megabytes for 150k dictionary
entries, or so. the key map needs some memory too, but the rest of the
sort is done in place.
</F>
More information about the Python-list
mailing list