set and dict iteration

Fri Aug 17 14:37:26 EDT 2012

On Thursday, August 16, 2012 9:24:44 PM UTC-5, Steven D'Aprano wrote:
> On Thu, 16 Aug 2012 19:11:19 -0400, Dave Angel wrote:
>
> > On 08/16/2012 05:26 PM, Paul Rubin wrote:
> >> Dave Angel <d at davea.name> writes:
> >>> Everything else is implementation defined.  Why should an
> >>> implementation be forced to have ANY extra data structure to detect a
> >>> static bug in the caller's code?
> >> For the same reason the interpreter checks for type errors at runtime
> >> and raises TypeError, instead of letting the program go into the weeds.
> >
> > There's an enormous difference between type errors, which affect the low
> > level dispatch, and checking for whether a dict has changed and may have
> > invalidated the iterator.  If we were really going to keep track of what
> > iterators are tracking a given dict or set, why stop there?  Why not
> > check if another process has changed a file we're iterating through?  Or
> > ...
>
> Which is why Python doesn't do it -- because it is (claimed to be)
> excessively expensive for the benefit that you would get.
>
> Not because it is a matter of principle that data integrity is
> unimportant. Data integrity *is* important, but in the opinion of the
> people who wrote these particular data structures, the effort required to
> guarantee correct iteration in the face of mutation is too expensive for
> the benefit.
>
> Are they right? I don't know. I know that the list sort method goes to a
> lot of trouble to prevent code from modifying lists while they are being
> sorted. During the sort, the list temporarily appears to be empty to
> anything which attempts to access it. So at least sometimes, the Python
> developers spend effort to ensure data integrity.
>
> Luckily, Python is open source. If anyone thinks that sets and dicts
> should include more code protecting against mutation-during-iteration,
> they are more than welcome to come up with a patch. Don't forget unit and
> regression tests, and also a set of timing results which show that the
> slow-down isn't excessive.
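
For reference, here is a small sketch of the checks under discussion as CPython behaves today. The function names are made up for illustration, and other implementations may behave differently: a dict or set complains when its size changes during iteration, and a list looks empty to a key function while it is being sorted.

```python
# Behaviour observed on CPython; treat it as an implementation detail.

def mutate_dict_while_iterating():
    """Add a key inside the loop and return the error message, if any."""
    d = {"a": 1, "b": 2}
    try:
        for key in d:
            d[key + "_copy"] = 0  # adding a key changes the dict's size
    except RuntimeError as exc:
        return str(exc)
    return None


def mutate_set_while_iterating():
    """Add an element inside the loop and return the error message, if any."""
    s = {1, 2, 3}
    try:
        for item in s:
            s.add(item + 100)  # adding an element changes the set's size
    except RuntimeError as exc:
        return str(exc)
    return None


def list_length_seen_during_sort():
    """Record len(data) as seen from inside the key function during sort()."""
    data = [3, 1, 2]
    seen = []

    def key(x):
        seen.append(len(data))  # the list being sorted is inspected here
        return x

    data.sort(key=key)
    return seen, data
```

Note that the dict and set checks only look at the size, which is the gap this thread is about.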

I contributed a patch some time ago.  It wasn't accepted.  However, this thread seems to show a moderately more favorable sentiment than that one did.

Is there a problem with hacking on the Beta?  Or should I wait for the Release?  Does anyone want to help me with the changes?  Perhaps P. Rubin could contribute the variation he suggested as well.
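
To make the idea concrete for anyone who wants to help, below is a rough pure-Python sketch of one way to detect mutation during iteration: a modification counter that each iterator compares against. This is only an illustration, not the patch mentioned above and not anything in CPython; the class name CheckedDict and its attributes are hypothetical.

```python
class CheckedDict:
    """Hypothetical dict wrapper that detects mutation during iteration."""

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)
        self._version = 0  # bumped on every write, including overwrites

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value
        self._version += 1

    def __delitem__(self, key):
        del self._data[key]
        self._version += 1

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        expected = self._version
        # Iterate over a snapshot so the check below is the only guard.
        for key in list(self._data):
            yield key
            # Runs when the caller asks for the next key: any write made
            # in the loop body since the last yield is caught here.
            if self._version != expected:
                raise RuntimeError("CheckedDict mutated during iteration")


d = CheckedDict(a=1, b=2)
try:
    for k in d:
        d[k] = 0  # same size, but still a mutation
except RuntimeError as exc:
    print("caught:", exc)
```

The sketch is stricter than the built-in check, since it also rejects writes that leave the size unchanged. The price is an extra counter per object and an increment on every write, which is exactly the kind of overhead the timing results would need to measure.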