[New-bugs-announce] [issue46713] Provide a C implementation of collections.abc.KeysView and friends

Joshua Bronson report at bugs.python.org
Thu Feb 10 14:19:45 EST 2022

New submission from Joshua Bronson <jabronson at gmail.com>:

As suggested by @rhettinger in https://bugs.python.org/msg409443, I'm creating a feature request for C implementations of collections.abc.KeysView, ValuesView, and ItemsView.

Because these do not currently benefit from C speedups, they're a lot slower than their dict_keys, dict_values, and dict_items counterparts. As a result, libraries that implement custom Mapping types that are backed by dicts are incentivized to override the implementations of keys(), values(), and items() they inherit from collections.abc.Mapping to instead return their backing dicts' mapping views, causing a potential abstraction leak.

An example can be found in https://github.com/jab/bidict, which implements bidirectional mapping types that wrap a forward and an inverse dict which are kept in sync with one another.

>>> from bidict import *
>>> bi = bidict({1: 'one', 2: 'two'})
>>> bi.items()  # Overridden for performance:
dict_items([(1, 'one'), (2, 'two')])

Ditto for OrderedBidict:

>>> OrderedBidict(bi).keys()
_OrderedBidictItemsView(OrderedBidict([(1, 'one'), (2, 'two')]))

(The _OrderedBidictItemsView is a custom view whose __iter__ uses the implementation inherited by its collections.abc.ItemsView base class so that the correct order is respected, but proxies other method calls through to the backing dict's dict_items object: https://github.com/jab/bidict/blob/2ab42a/bidict/_orderedbidict.py#L90-L150)

Here is a microbenchmark of calling __eq__ on an _OrderedBidictItemsView vs. a collections.abc.ItemsView, to estimate the performance impact (using Python 3.10):

❯ set setup '
    from collections.abc import ItemsView
    from bidict import OrderedBidict
    d = dict(zip(range(9999), range(9999)))
    ob = OrderedBidict(d)'

❯ python -m pyperf timeit -s $setup 'ob.items() == d.items()' -o 1.json

❯ python -m pyperf timeit -s $setup 'ItemsView(ob) == d.items()' -o 2.json

❯ pyperf compare_to 2.json 1.json
Mean +- std dev: [2] 4.21 ms +- 1.10 ms -> [1] 168 us +- 6 us: 25.13x faster

This demonstrates a potentially significant speedup. Similar microbenchmarks for ItemsView vs. dict_items, as well as KeysView vs. both dict_keys and _OrderedBidictKeysView, also indicate similarly significant potential.

Note that the performance benefits of this may propagate to other code as well. For example, bidicts' __eq__ methods are implemented in terms of their itemsviews (see https://github.com/jab/bidict/blob/2ab42a/bidict/_base.py#L285-L286), so speeding up bidict.items().__eq__ speeds up bidict.__eq__ commensurately.

messages: 413020
nosy: jab
priority: normal
severity: normal
status: open
title: Provide a C implementation of collections.abc.KeysView and friends

Python tracker <report at bugs.python.org>

More information about the New-bugs-announce mailing list