set iteration order
Hi, I just traced a change in behaviour between Python 3.1 and 3.2 to a possible change in the iteration order of sets caused by the optimization in issue #8685. Of course, this order shouldn't be relied on in the first place, but the side effect of the optimization might be worth mentioning in "What's new", maybe also pointing out that the old behaviour can be simulated with {x for x in a if x not in b} in place of "a-b". Cheers, Hagen
On 2/26/2011 4:09 AM, Hagen Fürstenau wrote:
Hi,
I just traced a change in behaviour between Python 3.1 and 3.2 to a possible change in the iteration order of sets caused by the optimization in issue #8685. Of course, this order shouldn't be relied on in the first place, but the side effect of the optimization might be worth mentioning in "What's new", maybe also pointing out that the old behaviour can be simulated with {x for x in a if x not in b} in place of "a-b".
-1. Code with any dependence on the iteration order of unordered collections (other than the guarantee that d.keys() and d.values() match at any given time as long as d is unchanged) is buggy. It is not the place of What's new to cater to buggy code. Besides which, there is no guarantee that the 'for x in a' part of the suggestion will remain the same from version to version or even from run to run. -- Terry Jan Reedy
Code with any dependence on the iteration order of unordered collections (other than the guarantee that d.keys() and d.values() match at any given time as long as d is unchanged) is buggy.
It's not a matter of dependence on iteration order, but of reproducibility (in my case there were minor numerical differences due to different iteration orders). I think we also warn about changes in pseudorandom number sequences, although you could argue that no code should depend on specific pseudorandom numbers. Cheers, Hagen
It's not a matter of dependence on iteration order, but of reproducibility (in my case there were minor numerical differences due to different iteration orders).
Can you give a code example? I don’t understand your case.
It's a bit involved (that's why it took me a while to locate the difference in behavior), but it boils down to a (learning) algorithm that in principle should not care about order of input data, but will in practice show slightly different numerical behavior. I ran into the problem when trying to exactly reproduce previously published experimental results. Of course, I should have anticipated this and fixed some arbitrary order in the first place. I just thought a note about this change might save someone in a similar situation some confusion. Cheers, Hagen
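A minimal sketch of the kind of effect described here, not the actual learning algorithm from the message: floating-point addition is not associative, so accumulating the same values in a different order can change the last bits of the result. The helper name `accumulate` and the sample values are invented for illustration. Sorting before iterating is one way to "fix some arbitrary order".

```python
def accumulate(values):
    """Add values left to right, so every rounding step happens in order."""
    total = 0.0
    for v in values:
        total += v
    return total


weights = [0.1, 0.2, 0.3]

forward = accumulate(weights)
backward = accumulate(reversed(weights))
print(forward, backward)    # the two totals differ in the last digit
print(forward == backward)  # False

# Iterating over a set directly leaves the order up to the implementation.
# Sorting first pins the order, so the total no longer depends on set layout.
data = {0.3, 0.1, 0.2}
stable = accumulate(sorted(data))
print(stable == forward)    # True: sorted(data) is [0.1, 0.2, 0.3]
```

The same idea applies to any order-sensitive computation fed from a set or dict: choose the order explicitly rather than inheriting it from the container.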
Hagen Fürstenau wrote:
Code with any dependence on the iteration order of unordered collections (other than the guarantee that d.keys() and d.values() match at any given time as long as d is unchanged) is buggy.
It's not a matter of dependence on iteration order, but of reproducibility (in my case there were minor numerical differences due to different iteration orders).
If those differences are insignificant to you, then why do you care? If they are significant enough that (say) tests were failing, then your results depend on the iteration order of a set, and your code is buggy and should be fixed. Or perhaps your tests are too strict.
I think we also warn about changes in pseudorandom number sequences, although you could argue that no code should depend on specific pseudorandom numbers.
The random module provides an API for repeating sequences of pseudorandom numbers: the seed. So you *can* depend on specific numbers, if you need to. Sets and dicts do not provide any such API. The order even changes with the history of the object: two equal sets can have different iteration orders. Personally, I don't care whether or not we mention that set iteration order has changed. It seems too trivial to worry much about it. +0 -- Steven
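The contrast drawn above can be sketched as follows. The seed value and the set contents are arbitrary choices for illustration. Whether the two sets actually iterate differently depends on the implementation, so the last line prints the orders without asserting anything about them.

```python
import random

# Two generators created with the same seed repeat the same sequence.
rng_a = random.Random(12345)
rng_b = random.Random(12345)
seq_a = [rng_a.random() for _ in range(5)]
seq_b = [rng_b.random() for _ in range(5)]
print(seq_a == seq_b)  # True

# Sets have no comparable knob. These two sets are equal, but they were
# built through different histories (different insertion order).
s1 = set()
s1.add(0)
s1.add(8)

s2 = set()
s2.add(8)
s2.add(0)

print(s1 == s2)            # True
print(list(s1), list(s2))  # the orders may or may not match
```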
On Sat, 26 Feb 2011 10:09:33 +0100, Hagen Fürstenau wrote:
I just traced a change in behaviour between Python 3.1 and 3.2 to a possible change in the iteration order of sets caused by the optimization in issue #8685. Of course, this order shouldn't be relied on in the first place, but the side effect of the optimization might be worth mentioning in "What's new", maybe also pointing out that the old behaviour can be simulated with {x for x in a if x not in b} in place of "a-b".
I'm against such a mention. It would give the impression that we support some semblance of reproducibility in iteration order, which is not true. If you use sets or dicts, you must deal with the fact that the iteration order will be totally random from your (the programmer's) POV. Regards Antoine.
On Feb 26, 2011, at 4:09 PM, Antoine Pitrou wrote:
On Sat, 26 Feb 2011 10:09:33 +0100, Hagen Fürstenau wrote:
I just traced a change in behaviour between Python 3.1 and 3.2 to a possible change in the iteration order of sets caused by the optimization in issue #8685. Of course, this order shouldn't be relied on in the first place, but the side effect of the optimization might be worth mentioning in "What's new", maybe also pointing out that the old behaviour can be simulated with {x for x in a if x not in b} in place of "a-b".
I'm against such a mention. It would give the impression that we support some semblance of reproducibility in iteration order, which is not true. If you use sets or dicts, you must deal with the fact that the iteration order will be totally random from your (the programmer's) POV.
I concur with Antoine. Also, it wasn't the iteration order that changed; sets still iterate in the same order. What changed was the algorithm for creating a new set using a set difference operation. Raymond
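To illustrate the distinction between iteration order and construction algorithm, here is a sketch of two ways a set difference could be built. Both function names are hypothetical, and neither is claimed to be what CPython does internally. The point is only that different construction strategies yield equal sets whose internal layout, and hence iteration order, is not promised to match.

```python
def difference_by_filtering(a, b):
    """Keep each element of a that is absent from b."""
    return {x for x in a if x not in b}


def difference_by_discarding(a, b):
    """Copy a, then discard each element of b from the copy."""
    result = set(a)
    for x in b:
        result.discard(x)
    return result


a = set(range(20))
b = set(range(0, 20, 3))

filtered = difference_by_filtering(a, b)
discarded = difference_by_discarding(a, b)

# All three spellings produce the same set by equality.
print(filtered == discarded == a - b)  # True

# The iteration orders are printed, not asserted: nothing guarantees them.
print(list(filtered))
print(list(discarded))
```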
participants (6)
- Antoine Pitrou
- Hagen Fürstenau
- Raymond Hettinger
- Steven D'Aprano
- Terry Reedy
- Éric Araujo