On 12/15/19 8:25 PM, Raymond Hettinger wrote:
On Dec 15, 2019, at 6:48 PM, Larry Hastings <larry@hastings.org> wrote:
As of 3.7, dict objects are guaranteed to maintain insertion order.  But set objects make no such guarantee, and AFAIK in practice they don't maintain insertion order either.  Should they?
I don't think they should.  

Several thoughts:

* The corresponding mathematical concept is unordered and it would be weird to impose such as order.

I disagree--I assert it's no more or less weird than imposing order on dicts, which we eventually decided to do.


* You can already get membership testing while retaining insertion ordering by running dict.fromkeys(seq).

I'm not sure where you're going with this.  The argument "you can do this with other objects" didn't preclude us from, for example, adding ordering to the dict object.  We already had collections.OrderedDict, and we decided to add the functionality to the dict object.


* Set operations have optimizations that preclude giving a guaranteed order (for example, set intersection loops over the smaller of the two input sets).

* To implement ordering, set objects would have to give-up their current membership testing optimization that exploits cache locality in lookups (it looks at several consecutive hashes at a time before jumping to the next random position in the table).

* The ordering we have for dicts uses a hash table that indexes into a sequence.  That works reasonably well for typical dict operations but is unsuitable for set operations where some common use cases make interspersed additions and deletions (that is why the LRU cache still uses a cheaply updated doubly-linked list rather that deleting and reinserting dict entries).

These are all performance concerns.  As I mentioned previously in this thread, in my opinion we should figure out what semantics we want for the object, then implement those, and only after that should we worry about performance.  I think we should decide the question "should set objects maintain insertion order?" literally without any consideration about performance implications.


* This idea has been discussed a couple times before and we've decided not to go down this path.  I should document prominently because it is inevitable that it will be suggested periodically because it is such an obvious thing to consider.

Please do!

In the email quoted by Tim Peters earlier in the thread, you stated that set objects "lose their flexibility" if they maintain order.  Can you be more specific about what you meant by that?


/arry

p.s. To be clear, I'm not volunteering to add this feature to the set object implementation--that's above my pay grade.  But I'm guessing that, if we decided we wanted these semantics for the dict object, someone would volunteer to implement it.