[Python-ideas] Dictionary views are not entirely 'set like'

Wed Apr 6 16:33:29 EDT 2016

On Apr 6, 2016 4:39 AM, "Joshua Morton" <joshua.morton13 at gmail.com> wrote:
>
> Let's start with a quick quiz: what is the result of each of the
following (on Python 3.5.x)?
>
>     {}.keys() | set()  # 1
>     set() | []  # 2
>     {}.keys() | []  # 3
>     set().union(set())  # 4
>     set().union([])  # 5
>     {}.keys().union(set())  # 6
>
> If your answer was set(), TypeError, set(), set(), set(),
AttributeError, then you were correct. That, to me, is incredibly
unintuitive.
>
> Next up:
>
>     {}.keys() == {}.keys()  # 7
>     {}.items() == {}.items()  # 8
>     {}.values() == {}.values()  # 9
>     d = {}; d.values() == d.values()  # 10
>
> True, True, False, False.
>
> Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 are up
for debate.[1]
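
For anyone who wants to check the quiz locally, here is a minimal sketch that
evaluates each expression and records either the result or the name of the
exception raised. The helper name `outcome` and the `results` dict are my own,
not anything from the stdlib or from the original post.

```python
def outcome(expr):
    """Evaluate expr; return its value, or the exception type name if it raises."""
    try:
        return eval(expr)
    except Exception as exc:
        return type(exc).__name__

quiz = [
    "{}.keys() | set()",           # 1
    "set() | []",                  # 2
    "{}.keys() | []",              # 3
    "set().union(set())",          # 4
    "set().union([])",             # 5
    "{}.keys().union(set())",      # 6
    "{}.keys() == {}.keys()",      # 7
    "{}.items() == {}.items()",    # 8
    "{}.values() == {}.values()",  # 9
]
results = {expr: outcome(expr) for expr in quiz}

# 10: two views over the *same* dict still compare unequal, because
# values views fall back to identity comparison.
d = {}
results["d.values() == d.values()"] = d.values() == d.values()

for expr, res in results.items():
    print(expr, "->", res)
```
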
>
> First thing first, the behavior exhibited by #3 is a bug (or at least it
probably should be), and one I'd be happy to fix. However, before doing that
I felt it would be good to propose some questions and suggestions for how
that might be done.
>
> There are, as far as I can tell, two reasons to use a MappingView, memory
efficiency or auto-updating (a view will continue to mirror changes in the
underlying object).

A third reason:
Mapping and MutableMapping do not assume that all of the data is buffered
into RAM.
* e.g. on top of a DB
* https://docs.python.org/3/library/collections.abc.html#collections.abc.MutableMapping
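
To make the DB case concrete, here is a rough sketch of a MutableMapping
layered over SQLite. The class name `SQLiteDict`, the table name `kv`, and the
text-only schema are all assumptions for illustration, not an existing library.
The point is only that the inherited keys() view iterates by streaming rows
from the cursor, so nothing requires the whole key set to sit in RAM.

```python
import sqlite3
from collections.abc import MutableMapping

class SQLiteDict(MutableMapping):
    """Hypothetical mapping that keeps its items in a SQLite table."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)"
        )

    def __getitem__(self, key):
        row = self._db.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def __setitem__(self, key, value):
        self._db.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )

    def __delitem__(self, key):
        cur = self._db.execute("DELETE FROM kv WHERE k = ?", (key,))
        if cur.rowcount == 0:
            raise KeyError(key)

    def __iter__(self):
        # Keys are yielded one row at a time from the cursor.
        for (key,) in self._db.execute("SELECT k FROM kv"):
            yield key

    def __len__(self):
        return self._db.execute("SELECT COUNT(*) FROM kv").fetchone()[0]

db = SQLiteDict()
db["a"] = "1"
db["b"] = "2"

keys = db.keys()       # a collections.abc.KeysView wrapping db
merged = keys | ["c"]  # the ABC's Set.__or__ accepts any iterable
```

Note that the `|` inherited from collections.abc.Set takes any iterable as
well, so the permissive behavior is not specific to the built-in dict views.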

> I'll focus on the first because the second can conceivably be solved in
other ways.
>
> Currently, if I want to union a dictionary's keys with a non-set
iterable, I can do `{}.keys() | []`. This is a bug[2]; the set `|` operator
should only work on another set. That said, fixing this bug removes the
ability to efficiently union keys with another iterable, leaving
`set({}.keys()).update([])` for efficiency, or `set({}.keys()).union([])`
for clarity.
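
A small sketch of the three spellings side by side, using a made-up dict `d`
and list `extra`. One thing worth noting: set.update() mutates in place and
returns None, so the update form needs the intermediate set bound to a name
first.

```python
d = {"a": 1, "b": 2}
extra = ["b", "c"]

# Operator form: accepted today on a keys view even though `extra` is a list.
via_operator = d.keys() | extra

# update() form: build the set once, then extend it in place.
via_update = set(d)
via_update.update(extra)

# union() form: returns a new set and accepts any iterable.
via_union = set(d).union(extra)

# Written as a single expression, the update form evaluates to None.
one_liner = set(d.keys()).update(extra)
```
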
>
> Fixing this is simply a matter of adding a `.union` method to the
KeysView (and possibly to the Set abc as well), although that may not be
something that is wanted. The issue, as far as I can tell, is whether we
want people converting from MappingViews to "primitives" as soon as
possible, or if we want to encourage people to use the views for as
long as possible.
>
> There are arguments on both sides: encouraging people to use the views
causes these objects to become more complex, introducing more surface area
for bugs, more code to be maintained, etc. Further, there is
currently one obvious way to do things: convert to a primitive if you're
doing any complex action. On the other hand, making MappingViews more like
the primitives they represent has positives for performance, simplifies
user code, and would probably make testing these objects easier, since many
tests could be stolen from test_set.py.
>
> My opinion is that the operators on MappingViews should be no more
permissive than their primitive counterparts. A KeysView is in various ways
more restrictive than a set, so having it also be occasionally less
restrictive is surprising and, in my opinion, bad. Tightening the operators
causes the loss of an efficient way to union a dict's keys with a list (among
other methods). I'd then add .union, .intersection, etc. to remedy this.
>
> This solution would bring the existing objects more in line with their
primitive counterparts, while still allowing efficient actions on large
dictionaries.
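
As I read it, the proposal amounts to something like the sketch below. It is
written against collections.abc.KeysView rather than the C dict views, and the
class name `StrictKeysView` is hypothetical; it only illustrates "strict
operators, permissive named methods" and is not a patch.

```python
from collections.abc import KeysView, Set

class StrictKeysView(KeysView):
    """Hypothetical view: operators require a Set, methods take any iterable."""

    def __or__(self, other):
        if not isinstance(other, Set):
            return NotImplemented  # lets Python raise TypeError, as set does
        return set(self) | set(other)

    __ror__ = __or__

    def __and__(self, other):
        if not isinstance(other, Set):
            return NotImplemented
        return set(self) & set(other)

    __rand__ = __and__

    def union(self, *others):
        # Mirrors set.union: any number of arbitrary iterables.
        result = set(self)
        for other in others:
            result.update(other)
        return result

    def intersection(self, *others):
        # Mirrors set.intersection: any number of arbitrary iterables.
        result = set(self)
        for other in others:
            result.intersection_update(other)
        return result

view = StrictKeysView({"a": 1, "b": 2})
with_list = view.union(["c"])             # method form accepts a list
common = view.intersection(["a", "z"])
with_set = view | {"c"}                   # operator form accepts a set

try:
    view | ["c"]                          # operator form rejects a list
    rejected = False
except TypeError:
    rejected = True
```
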
>
> In short:
>
>  - Is #3 intended behavior?
>  - Should it (and the others) be?
>   - As a related aside, should .union and other frozen ops be added to
the Set interface?
>  - If so, should the fix solely be a bugfix, should I do what I proposed,
or something else entirely?
>  - More generally, should there be a guiding principle when it comes to
MappingViews and similar special case objects?
>
>
> [1]: There's some good conversation in this prior thread on this issue
https://mail.python.org/pipermail/python-ideas/2015-December/037472.html.
The consensus seemed to be that making ValuesViews comparable by value is
technically infeasible (O(n^2) worst case), while making it comparable
based on the underlying dictionary is a possibility. This would be for
OrderedDict, although many of the same arguments apply for a normal
dictionary.
>
> [2]: Well, it probably should be a bug; it's explicitly tested for (
https://github.com/python/cpython/blob/master/Lib/test/test_dictviews.py#L109),
whereas sets are explicitly tested for the opposite functionality (
https://github.com/python/cpython/blob/master/Lib/test/test_set.py#L92)
>
>
> Thanks, I'm looking forward to the feedback,
> Josh