ordered set 2
Thank you for your detailed response, and I'm sorry about the late reply.
All of your points make sense to me. My implementation has not been battle-tested yet.
As I wrote in a previous mail, only one benchmark in pyperformance was affected significantly. (My implementation was 13% faster.)
After that, I wrote some microbenchmarks. Some run faster on the current implementation and some run faster on the ordered implementation. https://github.com/methane/sandbox/tree/master/python/ordered-set
For example, the current implementation is up to 20% faster at creating a new set from consecutive integers (e.g. `set(range(N))`). As you know, this is because hash(int) values are consecutive too.
On the other hand, the ordered implementation is up to 17% faster at creating a new set from string values.
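The integer case is easy to check directly (a quick sketch; the exact hash modulus is a CPython implementation detail of 64-bit builds):

```python
# In CPython, hash(i) == i for small non-negative ints (the hash is the
# value modulo 2**61 - 1 on 64-bit builds), so set(range(N)) fills the
# open-addressing table with consecutive, collision-free probes.
assert all(hash(i) == i for i in range(10_000))

# String hashes are randomized per process (PYTHONHASHSEED), so a set of
# strings scatters across the table instead.
print(hash("spam"))  # varies between interpreter runs
```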
But I don't know how these microbenchmark results translate to real-world set workloads. pyperformance doesn't have enough variety of set workloads.
Would you please tell me how you gathered such varied set workloads?
- To get the same lookup performance, the density of index table would need to go down to around 25%. Otherwise, there's no way to make up for the extra indirection and the loss of cache locality.
Yes. Currently I chose a capacity ratio of 50%, and I removed the 4x resize on small sets, so density is about 25–50% for now. Performance of simple lookups is 5–8% slower.
A smaller capacity ratio may improve lookup performance a bit, but it tends to worsen memory efficiency once the index table uses 32-bit or 64-bit entries.
- There was a small win on iteration performance because it's cheaper to loop over a dense array than a sparse array (fewer memory accesses and elimination of the unpredictable branch). This is nice because iteration performance matters in some key use cases.
Yes. And it can be a huge performance benefit in extreme cases. (e.g. https://bugs.python.org/issue32846)
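A pure-Python toy illustrates the point above: scanning a sparse table has to test every slot for emptiness, and with pseudo-random occupancy that branch is hard to predict, while the dense entries array is just walked straight through. (The sizes and 25% density here are made up for illustration, not taken from either implementation.)

```python
import random
import timeit

random.seed(0)
n = 1000
table_size = n * 4                       # ~25% density, like a sparse table
sparse = [None] * table_size
for slot, value in zip(random.sample(range(table_size), n), range(n)):
    sparse[slot] = value                 # pseudo-randomly occupied slots
dense = list(range(n))                   # compact entries array

def iter_sparse():
    # must test every slot for emptiness (the unpredictable branch)
    return sum(x for x in sparse if x is not None)

def iter_dense():
    # just walk the dense array
    return sum(dense)

assert iter_sparse() == iter_dense()
print("sparse:", timeit.timeit(iter_sparse, number=2000))
print("dense: ", timeit.timeit(iter_dense, number=2000))
```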
- I gave up on ordering right away. If we care about performance, keys can be stored in the order added; but no effort should be expended to maintain order if subsequent deletions occur. Likewise, to keep set-to-set operations efficient (i.e. looping over the smaller input), no order guarantee should be given for those operations. In general, we can let order happen but should not guarantee it, work to maintain it, or slow down essential operations to make them ordered.
I agree. Maybe set shouldn't guarantee preserving insertion order, unlike dict. That loses almost half of the benefit of the new implementation :-( But the order is stable in most cases anyway, regardless of hash randomization. Stable pyc files can be compiled without PYTHONHASHSEED=0, and sets in logs are mostly stable and readable.
- Compacting does make sets a little smaller but does cost an indirection and incurs a cost for switching index sizes between 1-byte arrays, 2-byte arrays, 4-byte arrays, and 8-byte arrays. Those don't seem very expensive; however, set lookups are already very cheap when the hash values are known (when they're not, the cost of computing the hash value tends to dominate anything done by the setobject itself).
Yes. The ordered implementation makes the simple cases slower. But the memory efficiency is very different if an application uses tons of small sets.
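Some rough back-of-the-envelope arithmetic illustrates the small-set savings. All the constants here are assumptions for illustration: 16 bytes per hash/pointer pair on a 64-bit build, a 60% load factor for the classic table, a 50% capacity ratio for the compact one, and a hypothetical signed 1/2/4/8-byte index selection.

```python
ENTRY = 16  # assumed: hash + PyObject* per slot/entry on a 64-bit build

def classic_bytes(n, load=0.6):
    # classic open addressing: one big table of (hash, pointer) slots
    table = 8
    while table * load < n:
        table *= 2
    return table * ENTRY

def index_width(table):
    # hypothetical 1/2/4/8-byte index selection (signed, -1 marks empty)
    if table <= 0x7f:
        return 1
    if table <= 0x7fff:
        return 2
    if table <= 0x7fffffff:
        return 4
    return 8

def compact_bytes(n, ratio=0.5):
    # compact layout: dense entries array + sparse index table
    table = 8
    while table * ratio < n:
        table *= 2
    return n * ENTRY + table * index_width(table)

for n in (5, 100, 10_000, 1_000_000):
    print(n, classic_bytes(n), compact_bytes(n))
```

Under these assumptions the compact layout is much smaller for small sets, while the gap narrows for large sets as the index table switches to 4- and 8-byte entries.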
- I couldn't find any existing application that would notice the benefit of making sets a bit smaller. Most applications use dictionaries (directly or indirectly) everywhere, so compaction was an overall win. Sets tend to be used more sparsely (no pun intended) and tend to be only a small part of overall memory usage. I had to consider this when bumping the load factor down to 60%, prioritizing speed over space.
You're right. The number of set objects in the Python interpreter is much smaller than the number of dicts. And the ordered set is not memory efficient for large sets.
Maybe we can't find a clear net win from this set implementation. I will stop this work at some point.
-- Inada Naoki email@example.com