
Sorry! A previous attempt to reply got sent before I typed anything :-( Very briefly:
timeit.timeit("set(i for i in range(1000))", number=100_000)
[and other examples using a range of integers] The collision resolution strategy for sets evolved to be fancier than for dicts, to reduce cache misses. This is important for sets because the _only_ interesting thing about an element wrt a set is whether or not the set contains it. Lookup speed is everything for sets. If you use a contiguous range of "reasonable" integers for keys, the integer hash function is perfect: there's never a collision. So any such test misses all the work Raymond did to speed set lookups. String keys have sufficiently "random" hashes to reliably create collisions, though. Cache misses can, of course, have massive effects on timing.
Add (much faster for dicts):
timeit.timeit("s = set(); s.add(0)", number=100_000_000) 13.330938750001224 timeit.timeit("d = {}; d[0] = None", number=100_000_000) 5.788865385999088
In the former case you're primarily measuring the time to look up the "add" method of sets (that's more expensive than adding 0 to the set). A better comparison would, e.g., move `add = s.add` to a setup line, and use plain "add(0)" in the loop. That's it!