
On Tue, Nov 29, 2022 at 02:07:34AM +0000, Oscar Benjamin wrote:
> Let's split this into two separate questions:
Let's not. Your first question about non-deterministic set order being "innately good" is a straw man: as we've already discussed, set order is not non-deterministic (except in the informal sense of "hard to predict") and I don't think anyone is arguing in favour of keeping set order unpredictable even if there are faster, more compact, simpler implementations which preserve order.

Talking about determinism is just muddying the waters. Sets are deterministic: their order is determined by the implementation, the set history, and potentially any environmental sources of entropy used in address randomisation. Deterministic does not mean predictable.

If we want to get this discussion onto a more useful path, we should start with a security question:

The hash of None changes because of address randomisation. Address randomisation is enabled as a security measure. Are there any security consequences of giving None a constant hash?

I can't see any, but then I couldn't see the security consequences of predictable string hashes until they were pointed out to me. So it would be really good to have some security experts comment on whether this is safe or not.
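To make the "set history" point above concrete, here is a minimal sketch. The names `a` and `b` are purely illustrative, and whether the two iteration orders actually differ depends on the interpreter and version, so treat the printed order as an implementation detail, not a guarantee.

```python
# Two sets holding the same elements, built with different insertion histories.
a = set()
for x in (0, 8, 16):
    a.add(x)

b = set()
for x in (16, 8, 0):
    b.add(x)

# Equality ignores iteration order, so this is always True.
print(a == b)

# The iteration order may or may not match between a and b; that is an
# implementation detail which depends on how collisions were resolved as
# each set was built.
print(list(a), list(b))

# Iterating the same unmodified set twice gives the same order both times:
# deterministic, even if hard to predict.
print(list(a) == list(a))
```

The order is fully determined by the implementation and the sequence of operations on each set, yet code that only sees the final contents cannot rely on it.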
> why are we even asking about "set order" rather than the benefits of determinism in general?
Because this entire discussion is motivated by the OP who wants consistent set order across multiple runs of his Python application. That's what he needs; having None hash to a constant value is just a means to that end.

His sets contain objects whose hashes depend on `Optional[int]`, which means sometimes they include None, and when he runs under an interpreter built with address randomisation, the hash of None can change.

Even if we grant None a constant hash, that still does not guarantee consistent set order across runs. At best, we might get such consistent order as an undocumented and changeable implementation detail, until we change the implementation of hashing, or of sets, or of something seemingly unrelated like address randomisation.

-- Steve