
To add some numbers to this discussion. Using the patch I provided earlier, I tried running some performance comparisons against different call patterns & dictionary sizes: These tests have two versions: direct_index: uses the __getitem__ method in the patch (O(n)) on the view list_copy: converts the view into list(), then does an index lookup Each score is the fastest result of running the test 10 times. Accessing the last item [-1] using the __getitem__ approach is the *worst-case scenario* for the __getitem__ method. d.keys()[-1] vs list(d.keys())[-1] Items in dict: 1 direct_index: took 1us list_copy: took 2us Speedup: 1.85x Items in dict: 26 direct_index: took 1us list_copy: took 2us Speedup: 2.18x Items in dict: 650 direct_index: took 1us list_copy: took 6us Speedup: 7.68x Items in dict: 15,600 direct_index: took 6us list_copy: took 119us Speedup: 19.38x Items in dict: 358,800 direct_index: took 129us list_copy: took 6065us Speedup: 47.12x Items in dict: 7,893,600 direct_index: took 2827us list_copy: took 210419us Speedup: 74.43x random.choice(d.items()) vs random.choice(list(d.items())) Items in dict: 1 direct_index: took 3us list_copy: took 3us Speedup: 1.23x Items in dict: 26 direct_index: took 3us list_copy: took 5us Speedup: 1.75x Items in dict: 650 direct_index: took 5us list_copy: took 19us Speedup: 4.23x Items in dict: 15,600 direct_index: took 7us list_copy: took 2771us Speedup: 377.55x Items in dict: 358,800 direct_index: took 112us list_copy: took 76626us Speedup: 682.33x Items in dict: 7,893,600 direct_index: took 1667us list_copy: took 1602864us Speedup: 961.37x d.items()[0] vs list(d.items())[0] Items in dict: 1 direct_index: took 1us list_copy: took 1us Speedup: 1.75x Items in dict: 26 direct_index: took 1us list_copy: took 2us Speedup: 1.84x Items in dict: 650 direct_index: took 1us list_copy: took 19us Speedup: 30.87x Items in dict: 15,600 direct_index: took 1us list_copy: took 2867us Speedup: 4777.73x Items in dict: 358,800 direct_index: took 1us list_copy: took 73268us Speedup: 112719.85x Items in dict: 7,893,600 direct_index: took 1us list_copy: took 1604230us Speedup: 2765900.12x So, use-case discussions aside, you get some pretty convincing performance wins even on small-ish dictionaries if direct indexing is used. Steve On Wed, Jul 8, 2020 at 3:11 PM Stephen J. Turnbull < turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Jim Baker writes:
We should keep the most heavily accessed object type in Python as lightweight as possible, and then build interesting structures around it, just like we always do.
+1
But we're not talking about the dict itself in this subthread. We're talking about views, and the implementation of comparing items containing non-hashable values for equality will be done in a view method which is already worst-case O(n), not a dict method.
Note that the current implementation means that an equality comparison can raise, and that's a very bad thing. If you look at the thread Inada-san cited a couple posts back, you'll see some very strong statements in support of the proposition that == should *never* raise. I find that persuasive.
I'm still against the implementation of "values view as sequence", since all the alleged use cases are of the form "maybe you'd want this someday for a really big dict so you don't want to listify so you can sort/index/etc." Show me real code, preferably production (ie, used at work -- might even be a throwaway script) and I'll concede there's an argument. But nothing abstract is gonna move me even one Planck unit from -1. (That's just my opinion, as usual YMMV etc) _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/A6XVKF... Code of Conduct: http://python.org/psf/codeofconduct/