[Python-ideas] Dictionary views are not entirely 'set like'

Wed Apr 6 05:38:35 EDT 2016

Let's start with a quick quiz: what is the result of each of the following
(on Python 3.5.x)?

    {}.keys() | set()  # 1
    set() | []  # 2
    {}.keys() | []  # 3
    set().union(set())  # 4
    set().union([])  # 5
    {}.keys().union(set())  # 6

If your answer was set(), TypeError, set(), set(), set(),
AttributeError, then you were correct. That, to me, is incredibly
unintuitive.
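
For anyone who wants to check the quiz locally, here is a small sketch
that evaluates each expression and records either its value or the name
of the exception it raises. The `outcome` helper is a name made up for
this illustration, not anything from the standard library.

```python
def outcome(expr):
    """Evaluate expr; return its value, or the name of the exception raised."""
    try:
        return eval(expr)
    except Exception as exc:
        return type(exc).__name__

quiz = [
    "{}.keys() | set()",       # 1
    "set() | []",              # 2
    "{}.keys() | []",          # 3
    "set().union(set())",      # 4
    "set().union([])",         # 5
    "{}.keys().union(set())",  # 6
]

results = [outcome(expr) for expr in quiz]
for number, (expr, result) in enumerate(zip(quiz, results), start=1):
    print(number, expr, "->", repr(result))
```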

Next up:

    {}.keys() == {}.keys()  # 7
    {}.items() == {}.items()  # 8
    {}.values() == {}.values()  # 9
    d = {}; d.values() == d.values()  # 10

True, True, False, False.
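
A short sketch of the same comparisons on a non-empty dict. As far as I
can tell, values views do not define their own equality, so they fall
back to identity: two separately created views never compare equal, even
over the same dict, while a single view object still equals itself.

```python
d = {"a": 1}

keys_equal = d.keys() == d.keys()        # set-like comparison of keys
items_equal = d.items() == d.items()     # set-like comparison of pairs
values_equal = d.values() == d.values()  # two distinct view objects

same_view = d.values()
identity_equal = same_view == same_view  # one object compared with itself

# Comparing by content requires converting to a concrete sequence first.
values_as_lists = list(d.values()) == list(d.values())

print(keys_equal, items_equal, values_equal, identity_equal, values_as_lists)
```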

Numbers 1, 2, 4, and 5 are expected behavior. 3 and 6 are not, and 7-10
are up for debate.[1]

First things first: the behavior exhibited by #3 is a bug (or at least it
probably should be), and one I'd be happy to fix. However, before doing
that I felt it would be good to pose some questions and suggestions for
how that might be done.

There are, as far as I can tell, two reasons to use a MappingView: memory
efficiency and auto-updating (a view will continue to mirror changes in
the underlying object). I'll focus on the first because the second can
conceivably be solved in other ways.
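
To make the auto-updating point concrete, a minimal sketch contrasting a
view with a copied set (the variable names are just for illustration):

```python
d = {"a": 1}
keys = d.keys()    # a view: no copy, stays tied to d
snapshot = set(d)  # an independent copy of the keys

d["b"] = 2         # mutate the dict after both were created

print(sorted(keys))      # the view reflects the new key
print(sorted(snapshot))  # the copy does not
```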

Currently, if I want to union a dictionary's keys with a non-set iterable,
I can do `{}.keys() | []`. This is a bug[2]: the set `|` operator should
only work on another set. That said, fixing this bug removes the only
efficient way to union keys with another iterable, leaving
`set({}.keys()).update([])` for efficiency, or `set({}.keys()).union([])`
for clarity.
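
Spelled out, the two fallbacks look roughly like the sketch below. Note
that `set.update` mutates in place and returns None, so the `update` form
needs a name bound to the copied set first; the example data is made up.

```python
d = {"a": 1, "b": 2}
extra = ["b", "c"]

# Clear: copy the keys into a set, then build the result with union().
combined = set(d.keys()).union(extra)

# In place: copy once, then extend that copy without a second set.
in_place = set(d.keys())
in_place.update(extra)

# The one-line spelling discards the set, because update() returns None.
one_liner = set(d.keys()).update(extra)

print(sorted(combined), sorted(in_place), one_liner)
```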

Fixing this is simply a matter of adding a `.union` method to KeysView
(and possibly to the Set ABC as well), although that may not be something
that is wanted. The issue, as far as I can tell, is whether we want people
converting from MappingViews to "primitives" as soon as possible, or if we
want to encourage people to use the views for as long as possible.
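
For context, the pure-Python Set ABC shows the same split: it supplies
the operators, which accept any iterable, but none of the named methods
such as `.union`. A minimal sketch, where `TinySet` is a hypothetical
class written only for this illustration:

```python
from collections.abc import Set

class TinySet(Set):
    """Hypothetical minimal Set subclass; only the abstract methods are written."""

    def __init__(self, iterable=()):
        self._items = frozenset(iterable)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

# The mixin __or__ inherited from the ABC accepts a plain list.
merged = TinySet("ab") | ["c"]

# The ABC defines operators but no union() method.
has_union = hasattr(Set, "union")

print(sorted(merged), has_union)
```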

There are arguments on both sides: encouraging people to use the views
causes these objects to become more complex, introducing more surface area
for bugs, more code to be maintained, etc. Further, there is currently one
obvious way to do things if you're doing any complex action: convert to a
primitive. On the other hand, making MappingViews more like the primitives
they represent has positives for performance, simplifies user code, and
would probably make testing these objects easier, since many tests could
be stolen from test_set.py.

My opinion is that the operators on MappingViews should be no more
permissive than their primitive counterparts. A KeysView is in various ways
more restrictive than a set, so having it also be occasionally less
restrictive is surprising and, in my opinion, bad. Tightening the operators
loses an efficient way to union a dict's keys with a list (among other
operations), so I'd then add .union, .intersection, etc. to remedy this.

This solution would bring the existing objects more in line with their
primitive counterparts, while still allowing efficient actions on large
dictionaries.
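
A rough sketch of what this proposal could look like, written as a
subclass of `collections.abc.KeysView` rather than as a change to the real
dict view. `StrictKeysView` and its method bodies are assumptions made for
illustration only, not an actual patch.

```python
from collections.abc import KeysView, Set

class StrictKeysView(KeysView):
    """Hypothetical view: strict operators, permissive named methods."""

    def __or__(self, other):
        # Like set: refuse operands that are not sets.
        if not isinstance(other, Set):
            return NotImplemented
        return super().__or__(other)

    __ror__ = __or__  # keep the reflected operator equally strict

    def __and__(self, other):
        if not isinstance(other, Set):
            return NotImplemented
        return super().__and__(other)

    __rand__ = __and__

    def union(self, *others):
        # Like set.union: any iterable is accepted.
        result = set(self)
        for other in others:
            result.update(other)
        return result

    def intersection(self, *others):
        result = set(self)
        for other in others:
            result.intersection_update(other)
        return result

d = {"a": 1, "b": 2}
view = StrictKeysView(d)

united = view.union(["b", "c"])
try:
    view | ["c"]
    operator_result = "no error"
except TypeError:
    operator_result = "TypeError"

print(sorted(united), operator_result)
```

With this shape, `view | {"c"}` still works because the operand is a set,
while a list has to go through `.union`, mirroring how `set` behaves.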

In short:

 - Is #3 intended behavior?
 - Should it (and the others) be?
 - As a related aside, should .union and other frozen ops be added to the
   Set interface?
 - If so, should the fix solely be a bugfix, should I do what I proposed,
   or something else entirely?
 - More generally, should there be a guiding principle when it comes to
   MappingViews and similar special case objects?

[1]: There's some good conversation on this issue in a prior thread:
https://mail.python.org/pipermail/python-ideas/2015-December/037472.html
The consensus seemed to be that making ValuesViews comparable by value is
technically infeasible (O(n^2) worst case), while making them comparable
based on the underlying dictionary is a possibility. That discussion was
about OrderedDict, although many of the same arguments apply to a normal
dictionary.

[2]: Well, it probably should be a bug; it's explicitly tested for (
https://github.com/python/cpython/blob/master/Lib/test/test_dictviews.py#L109),
whereas sets are explicitly tested for the opposite behavior (
https://github.com/python/cpython/blob/master/Lib/test/test_set.py#L92).

Thanks, I'm looking forward to the feedback,
Josh