I wrote some (better than the previously shared) benchmarks for this change a while ago. They are run on cPython with a patch applied that implements dict_views __getitem__() using a method similar to `lookdict` to perform indexing on keys/values/etc.
Irrespective of where in the api this logic should exist, the implementation won't be algorithmically different, (I think, even with a `.ordered` view, as the view would have to cope with changes to the underlying dictionary over its lifetime, and external tracking of changes to dicts is not, afaik, feasible. Unlike for-loop constructs which are inherently scoped, I feel like you wouldn't get away with forbidding modifying a dict() if there's a view on keys/values/items still alive, as these things are first-class objects that can be stored/passed around)
Therefore, all index based lookups would have to use a variant of this logic (unless someone can come up with a magic O(1) solution ;) Or explicit compaction is used (If anyone has a patch that adds tracking 'compactness' over the dict_keys, I can run the tests using it, to measure the impact - However I'm not personally sure yet if the overheads of this more invasive change are justified just for enabling indexing).
The tl/dr from my perspective is that these results make the change challenging to continue proposing without a better implementation than the obvious one. (I was weakly +1 before these results). Personally, I'm happy that the numbers give good evidence for this change being more complex than it at-first seems.
Some notes about the benchmarks, I've adapted an existing, not-related, test runner for this, so there may be some compromises. I've tried to be reasonable about capturing OK timing data, but the intent here isn't to spot single-% changes in performance, rather it's looking at significant changes in runtime performance over vastly varying sizes of dicts. The repo including the test runner, patches and makefile are here: https://github.com/stestagg/dict_index
and I'm accepting issues/PRs there if anyone feels that there's an omission or error that's worth correcting.
The numbers are raw, and *do not* have any interpretation layered on them, there are many snippets of code that are not best-practice or ideal ways of achieving things, this is as much because I wanted to see what the impact of these non-optimal patterns would be on common operations, please take the time to understand the exact test (check the full source if you need) before making any meaningful decisions based on this data.
Graphs are, by default, plotted on log-log axes, so beware when just looking at the shapes of the lines that the real-world difference in run-time is much larger than the absolute line differences suggest. The solution that uses direct indexing is always coloured green in the graphs.
As the code involves a tight loop over a simple structure which is very CPU dependent (and because I can), I've run the benchmarks on a Raspberry pi4 (ARMv7l), and on an AMD pc: