iterating through humongously large key lists!!
Tim Peters
tim.one at home.com
Wed Apr 25 21:16:50 EDT 2001
[Mark blobby Robinson]
> I needed a non-destructive method so popitem was not a viable option
> ...
[Alex Martelli]
> In many cases it should not be a problem to 'restore' into _another_
> dictionary. Or do you have many references to the starting dict
> which it would be a problem to update...? Then you can do the
> same trick again -- sure, it's gonna be slower, but...
>
> The general idea would be:
>
> newdic = {}
> while len(oldict):
>     key, value = oldict.popitem()
>     process(key, value)
>     newdic[key] = value
>
> and now in newdic you have all you earlier had in oldict, and may
> pour it back if need be. Of course, not thread-safe, not fastest, &c,
> but still it may be useful.
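For concreteness, here's that trick as a self-contained, runnable sketch
(modern Python 3 syntax; process() is a hypothetical stand-in for the
real per-entry work), including the "pour it back" step Alex mentions:

    def process(key, value):
        # Hypothetical stand-in for whatever per-entry work is needed.
        pass

    oldict = {i: i * i for i in range(10)}

    # Drain oldict destructively, rebuilding into newdic as we go ...
    newdic = {}
    while oldict:
        key, value = oldict.popitem()
        process(key, value)
        newdic[key] = value

    # ... then pour everything back, so existing references to oldict
    # see the original contents again.
    oldict.update(newdic)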
Well, if he was concerned about the memory burden of materializing the full
list of keys before, *this* one is gonna kill him: the memory consumed by
oldict does not decrease during this loop, so in the end you have two dicts
of the original size, and dicts consume about 4-5x the memory of a list with
the same # of entries.
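Under a modern CPython you can watch both effects directly --
sys.getsizeof postdates this thread by years, and the exact numbers
are version-dependent, so treat this as a sketch:

    import sys

    keys = list(range(100_000))
    d = dict.fromkeys(keys)

    # A dict still costs several times what a list of the same length
    # costs (the exact ratio varies by CPython version).
    print(sys.getsizeof(d) / sys.getsizeof(keys))

    # Now run the drain-and-rebuild loop from above ...
    newdic = {}
    while d:
        key, value = d.popitem()
        newdic[key] = value

    # ... d is empty, but its table was never released, so the process
    # is now holding two full-size dicts at once.
    print(sys.getsizeof(d), sys.getsizeof(newdic))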
BTW, I'm describing an accident of the implementation here, not an inherent
feature of dicts, so this may change: under all implementations of Python
dicts to date, a dict never resizes except possibly in response to
__setitem__. You can *delete* things from a dict until the sun goes nova (or
Win95 crashes, whichever comes first <wink>), but the dict won't shrink
internally before you start adding keys to it again. This is partly
intentional: (a) to avoid pointless thrashing for ping-pong dicts, and (b) to
minimize the number of cycles burned checking *whether* to resize.
A curious fact under 2.0 and earlier is that a dict can resize even when
merely replacing the value for a key that already exists. We stopped that in
2.1, though.
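A quick sanity check of the post-2.1 behavior under any modern CPython
(again via sys.getsizeof, which didn't exist back then):

    import sys

    d = dict.fromkeys(range(100_000))
    before = sys.getsizeof(d)

    # Replacing values adds no new entries, so it can never trigger a
    # resize.  (Replacing values during iteration is fine; adding or
    # deleting keys mid-loop would not be.)
    for key in d:
        d[key] = 42

    assert sys.getsizeof(d) == before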
pragmatically y'rs - tim