Proper deletion of selected items during map iteration in for loop: Thanks to all
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Sat Apr 26 22:14:02 EDT 2014
On Sat, 26 Apr 2014 12:25:27 -0700, Charles Hixson wrote:
> On 04/25/2014 10:53 AM, Charles Hixson wrote:
>> What is the proper way to delete selected items during iteration of a
>> map? What I want to do is:
>>
>> for (k, v) in m.items():
>> if f(k):
>> # do some processing of v and save result elsewhere
>> del m[k]
>>
>> But this gives (as should be expected):
>> RuntimeError: dictionary changed size during iteration
>> In the past I've accumulated the keys to be deleted in a separate list,
>> but this time there are likely to be a large number of them, so is
>> there some better way?
>>
>>
> Going over the various responses, it looks like saving the "to be
> deleted" keys to a list, and then iterating over that to delete is the
> best answer.
I don't think that there is any one "best" answer that always applies.
"My dict has three items, and I'll be deleting most of them" is likely to
have different performance characteristics from "My dict has three
billion items, and I'll be deleting two [or two billion] of them".
So how much time do you want to spend tuning this for optimum performance
(or memory use, or programmer time)? Or is "good enough" good enough?
I think the two obviously good enough approaches are:
- save a "to be deleted" list, then delete those keys;
- copy the "not to be deleted" items into a new dict
My intuition is that the second will probably be faster, unless your dict
is truly monstrously big. Not millions, but tens or hundreds or thousands
of millions of items, depending on how much memory your computer has.
You've already got a code snippet using the "to be deleted" list, here's
how I would do the other way:
new = {}
for k, v in m.items():
if f(k):
process(v)
else:
new[k] = v
m.clear()
m.update(new)
del new
If you don't care about modifying the existing dict, but can afford to
write in a more functional style, you don't even need to bother doing
that m.clear(), m.update(new). Just return the new dict, stop using the
old one (it will be garbage collected), and use the new one.
Oh, in case you're wondering, this will be more efficient than it may
seem, because the actual data in the dict isn't copied. The only things
being copied are references to the keys and values, not the keys and
values themselves.
--
Steven D'Aprano
http://import-that.dreamwidth.org/
More information about the Python-list
mailing list