Weak dict iterators are fragile

Hello,
In py3k, the weak dict methods keys(), values() and items() have been changed to return iterators (they returned lists in 2.x). However, it turns out that this makes these methods quite fragile: a GC collection can occur at any point during iteration, destroy one of the weakref'ed objects, and trigger a resizing of the underlying dict, which in turn raises an exception ("RuntimeError: dictionary changed size during iteration").
This has just triggered (assuming the diagnosis is correct) a hang in test_multiprocessing: http://bugs.python.org/issue7060
I would like to propose some possible solutions:
1. Add the safe methods listkeys(), listitems(), listvalues(), which would behave as the keys(), etc. methods from 2.x.
2. Make it so that keys(), items(), values() atomically build a list of items internally, which makes them more costly for large weak dicts, but robust.
What do you think?
Regards
Antoine.
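The failure mode described above can be illustrated without involving the garbage collector at all. The following is only a minimal sketch: it uses a plain dict, and the function simulated_weakref_callback is a hypothetical stand-in for the callback a weak dict runs when one of its referents is destroyed. It is not the weakref module's actual code.

```python
# A plain dict standing in for the underlying storage of a weak dict.
registry = {"a": 1, "b": 2, "c": 3}


def simulated_weakref_callback(key):
    # Hypothetical stand-in: in a weak dict, a removal like this runs
    # whenever a referent is collected, possibly in the middle of a loop.
    del registry[key]


error_message = None
try:
    for key in registry:
        # Simulate a collection happening during the first iteration.
        simulated_weakref_callback("c")
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)
```

The dict iterator notices the size change on its next step and raises, which is the same exception quoted in the message above.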

On Sun, Oct 11, 2009 at 10:50 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
In py3k, the weak dict methods keys(), values() and items() have been changed to return iterators (they returned lists in 2.x). However, it turns out that this makes these methods quite fragile: a GC collection can occur at any point during iteration, destroy one of the weakref'ed objects, and trigger a resizing of the underlying dict, which in turn raises an exception ("RuntimeError: dictionary changed size during iteration").
Ouch! The iterator from __iter__ is also affected.
1. Add the safe methods listkeys(), listitems(), listvalues() which would behave as the keys(), etc. methods from 2.x
2. Make it so that keys(), items(), values() atomically build a list of items internally, which makes them more costly for large weak dicts, but robust.
-1 on 1. +0 on 2. It'd be nice if we could postpone the resize if there are active iterators, but I don't think there's a clean way to track the iterators.
--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>
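For reference, the trade-off behind option 2 (build a list up front) can be sketched from the caller's side. This is an illustration under assumptions, not the proposed change itself: Node, build_cache and snapshot_items are hypothetical names, and the example runs against today's weakref.WeakValueDictionary rather than the 2009 py3k code.

```python
import weakref


class Node:
    """Hypothetical value type (plain object() instances cannot be weakly referenced)."""

    def __init__(self, name):
        self.name = name


def snapshot_items(weak_dict):
    # list() copies every (key, value) pair up front. The list holds strong
    # references, so no entry can be collected while the snapshot is alive.
    return list(weak_dict.items())


def build_cache(names):
    nodes = [Node(name) for name in names]
    cache = weakref.WeakValueDictionary()
    for node in nodes:
        cache[node.name] = node
    return cache, nodes


cache, nodes = build_cache(["a", "b", "c"])
snapshot = snapshot_items(cache)
del nodes  # the snapshot now holds the only strong references

visited = sorted(key for key, node in snapshot)
entries_while_snapshot_alive = len(cache)
```

The cost is the one named in option 2: the snapshot is O(n) in memory and keeps every value alive until the list is dropped, in exchange for a loop that cannot be disturbed by removals.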

Daniel Stutzbach <daniel <at> stutzbachenterprises.com> writes:
-1 on 1. +0 on 2. It'd be nice if we could postpone the resize if there are active iterators, but I don't think there's a clean way to track the iterators.
I've started experimenting, and it seems reasonably possible using a simple guard class and a set of weakrefs.
Regards
Antoine.
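One way such a guard could look is sketched below. Everything here is an assumption made for illustration: the names IterationGuard and DeferredRemovalDict, the remove() method standing in for a weakref callback, and the exact bookkeeping are hypothetical and may differ from the experiment mentioned above. The idea shown is that the guard holds a weak reference to the container, the container tracks the set of active guards, and removals are queued while that set is non-empty.

```python
import weakref


class IterationGuard:
    """Context manager marking a container as being iterated (hypothetical name)."""

    def __init__(self, container):
        # Weak reference, so an abandoned iterator does not keep the container alive.
        self._container = weakref.ref(container)

    def __enter__(self):
        container = self._container()
        if container is not None:
            container._active_guards.add(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        container = self._container()
        if container is not None:
            container._active_guards.discard(self)
            if not container._active_guards:
                # Last iterator finished: apply the queued removals now.
                container._commit_removals()
        return False


class DeferredRemovalDict:
    """Toy mapping that postpones deletions while any iteration is active."""

    def __init__(self, data):
        self._data = dict(data)
        self._active_guards = set()
        self._pending = []

    def remove(self, key):
        # Stand-in for the callback that fires when a referent dies.
        if self._active_guards:
            self._pending.append(key)
        else:
            self._data.pop(key, None)

    def _commit_removals(self):
        while self._pending:
            self._data.pop(self._pending.pop(), None)

    def keys(self):
        with IterationGuard(self):
            for key in self._data:
                yield key

    def __len__(self):
        return len(self._data)


d = DeferredRemovalDict({"a": 1, "b": 2, "c": 3})
seen = []
for key in d.keys():
    d.remove("b")  # simulated collection in the middle of the loop
    seen.append(key)
```

In this toy version the loop still sees "b", because the entry is only dropped once the last guard exits; a real weak dict would additionally have to skip entries whose referent is already dead.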

Antoine Pitrou <solipsis <at> pitrou.net> writes:
Daniel Stutzbach <daniel <at> stutzbachenterprises.com> writes:
-1 on 1. +0 on 2. It'd be nice if we could postpone the resize if there are active iterators, but I don't think there's a clean way to track the iterators.
I've started experimenting, and it seems reasonably possible using a simple guard class and a set of weakrefs.
Open issue and proposed patch at http://bugs.python.org/issue7105
Antoine.

Antoine Pitrou <solipsis <at> pitrou.net> writes:
1. Add the safe methods listkeys(), listitems(), listvalues() which would behave as the keys(), etc. methods from 2.x
2. Make it so that keys(), items(), values() atomically build a list of items internally, which makes them more costly for large weak dicts, but robust.
And a third one (a bit more complicated implementation-wise):
3. Delay weak dict removals until any iteration has finished.
Regards
Antoine.
Participants (2): Antoine Pitrou, Daniel Stutzbach