On Wed, Jul 08, 2020 at 05:44:00PM +0100, Stestagg wrote:
> So, use-case discussions aside, you get some pretty convincing performance wins even on small-ish dictionaries if direct indexing is used.
This reminds me of an anecdote my wife tells me about the days when she was in a band in the USA. She and the roadies were travelling from a gig in one city to another when she noticed a highway sign:
"Aren't we travelling the wrong way?"
To which the roadie driving answered "Who cares, we're making fantastic time!"
Ah, the sixties.
Who cares if there are no practical use-cases for this feature, it's really fast! *wink*
Assuming your timing tests are accurate (a lot of people don't time their code well, and consequently get very inaccurate numbers) and representative (timing is very sensitive to the combination of Python version, OS, hardware, and load on the local machine) your results are micro-benchmarks, the least helpful benchmarks.
For a dict with 15000 items, let us accept that your suggested:
    mydict.items()[i]
is 5000 times faster than:
    list(mydict.items())[i]
Conceded! But if you're only calling it *once*, who cares? It's still only 3 milliseconds. My (hypothetical) script takes half a second to run, I'm not going to notice 3ms.
What if I call it a thousand times?
Then I ought to make a list *once*, which gives me constant-time random access to the items, and index that list repeatedly rather than keep making and throwing away the list:
    L = list(mydict.items())
    for i in indices:
        L[i]
and I expect that list access is probably going to be faster than mydict.items()[i] access. So now the cost of building the list is amortized over a thousand calls, and becomes more or less invisible.
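To make the amortization argument concrete, here is a rough sketch that compares rebuilding the list on every lookup with building it once. The dict contents, the choice of 1000 indices, and the function names are all made up for illustration, and the timings it prints are themselves only micro-benchmarks that will vary by machine and Python version.

```python
import timeit

# Hypothetical data: a 15000-item dict and 1000 positions to look up.
mydict = {f"key{i}": i for i in range(15000)}
indices = range(0, 15000, 15)

def rebuild_each_time():
    # Builds and throws away a 15000-element list on every single lookup.
    return [list(mydict.items())[i] for i in indices]

def build_once():
    # Builds the list once, then indexes it repeatedly in constant time.
    L = list(mydict.items())
    return [L[i] for i in indices]

# Both approaches must return the same items; only the cost differs.
assert rebuild_each_time() == build_once()

print("rebuild each time:", timeit.timeit(rebuild_each_time, number=3))
print("build once:       ", timeit.timeit(build_once, number=3))
```

The one-off cost of list(mydict.items()) is paid a single time in build_once(), so it is spread over all thousand lookups.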
In my opinion, the reasons why this proposal is not compelling include:
1. lack of a strong use-case;
2. inelegant semantic design that adds sequence-like behaviour to views which are (imperfectly) designed to be set-like;
3. and concern that adding this to views would constrain future performance improvements to the underlying dict.
(There may be others.)
Unless I have missed any others, we've only seen three use-cases:
(a) The original post wanted a full sequence API for dictionaries, with the ability to insert keys at a specific index, not just get the N-th key. Not going to happen.
(b) You've suggested "get any item", but that's probably better written as `next(iter(mydict.items()))`.
(c) And `random.choice(mydict.items())` which seems to be lacking any *concrete* use-case -- under what circumstances would we want this and care about its performance *enough* to add this to builtin dict views?
(It seems more like toy code we might write to illustrate the use of indexed access than something with a concrete use-case behind it.)
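For reference, use-cases (b) and (c) can already be spelled with today's views, as in the sketch below. The dict contents and variable names are invented for the example. It relies on two facts about current Python 3: a view is iterable but is not itself an iterator, and `random.choice` needs something it can index.

```python
import random

mydict = {"spam": 1, "eggs": 2, "cheese": 3}  # hypothetical example data

# (b) "Get any item": the view is iterable but not an iterator,
# so it has to be wrapped in iter() before calling next().
first_item = next(iter(mydict.items()))

# (c) random.choice() indexes its argument, and a dict view does not
# support indexing, so passing the view directly raises TypeError.
try:
    random.choice(mydict.items())
except TypeError:
    view_is_indexable = False
else:
    view_is_indexable = True

# Copying the view into a list first works, at O(n) cost per copy.
random_item = random.choice(list(mydict.items()))
```

Whether that O(n) copy matters in practice is exactly the question a concrete use-case would have to answer.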
If there were a really strong use-case, points 2 and 3 could be overruled. But lacking a good use-case, the conservative safe choice here is to keep the status quo and reject the proposal.
So I think that if you really want to champion this proposal, you would be better off looking for concrete, practical use-cases and macro-benchmarks, not micro.