[Python-3000] Thoughts on dictionary views

Guido van Rossum guido at python.org
Wed Feb 21 00:46:11 CET 2007


On 2/20/07, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> The Java concept of dictionary views seems to have caught-on here while I wasn't
> looking.  At the risk of covering some old ground, I would like to re-open the
> question.

Because it's coming from you I am reopening the discussion; didn't
mean to exclude you. But it is a bit of a pain that you weren't
looking while this was discussed. If we decide to roll it back it will
be painful (depending on the shape of the roll-back).

> Here are a few thoughts on the subject to kick-off the discussion:
>
> * Maintaining a live (self-updating) view is a bit tricky from an implementation
> point-of-view.  While it is clearly doable for dictionaries, it is not clear
> that it is a good idea for a general mapping API which can be wrapped around
> dbms, shelves, elementtrees, b-trees, and other wrascally rabbits.  I doubt that
> the underlying structures of other mapping types support the observer pattern
> necessary to keep views updated -- this is doubly true if the underlying data is
> on disk and can be updated by other processes, threads, etc.

No observer pattern is required. See the (updated) PEP 3106.

> * One of the purported benefits is to provide set-like behavior without the
> expense of copying to a new set object.  FWIW, I've updated the set
> implementation to be more interoperable with dictionaries so that the conversion
> costs are negligible (about the same as a dict resize operation -- one pass, no
> calls to PyObject_Hash, insertion into a presized, sparse table with very few
> collisions).

But it is still O(N) in time and space. Creating a dict view is O(1) in both.

> * A dict is also one of Python's most basic APIs (along with lists).  Ideally,
> we should keep those two APIs as simple as possible (getting rid of setdefault()
> and unneeded methods is a step in the right direction).  IMO, the views will be
> the hardest part of the API to explain and interact with when learning the
> language -- to learn about dicts and lists, you already have to learn about
> mutability and hashability -- it doesn't help this situation if you then need to
> learn about self-updating views that can be deleted, have modified values, but
> cannot be added, and that have their own set-like operations but aren't really
> sets . . .

Perhaps it will be more palatable now that the views aren't mutable?
Also, I think you may have the wrong semantic model -- it's not a
self-updating view, it's just a different way to look at the same
underlying mapping. (Did you see PEP 3106? Since you don't quote it
this is not clear.)

> * ISTM that views offer three benefits:  re-iterability, set behavior, and
> self-updates.  IMO, the first is not commonly needed and is trivially served by
> writing list(mydict.items()) or somesuch.  The second is best served by an
> explicit conversion to a set or frozenset type -- those two types have been
> enormously successful in that they seem to offer a near zero learning curve --
> people seem to intuitively know how to use them right out of the box.

Yes, Greg Wilson did a super job on the API design.

(Though I keep having to remind Googlers about sets; they seem to have
lived in a world limited to Python 2.2 for too long. :-( )

> As long
> as that conversion is fast, I think the explicit conversion is the way to go --
> it is the way you would do it with any other Python type where you wanted set
> behavior.  Adding a handful of set methods to dict views would only complicate
> an otherwise simple situation and introduce unnecessary complexity (i.e. what
> should isinstance(d.d_keys, set) return?).

This I hope to address by introducing Abstract Base Classes.
Unfortunately that proposal isn't at all worked out, the best we have
is a wiki page by Bill Janssen (), but that is quite far removed from
what I would like to see.

> The third benefit (self-updates) is
> more interesting and does not have a direct analog with existing python tools,
> so the question is how valuable is self-updating behavior and are there
> compelling use cases that warrant a more complex API?

Just because it's new doesn't make it suspect does it? It's been very
well received in Java.

> My recommendation is to take a more conservative route.  Let's make dicts as
> simple as possible

I'd like to see a concrete proposal here before I can judge which is
the better proposal.

>  and then introduce a new collections module entry with the
> views bells and whistles.  If the collections version proves itself as
> enormously popular, useful, understandable, and without a good equivalent, then
> it can ask for a promotion.  The collections module is nice place to put in
> alternate datatypes that meet the more demanding needs of advanced users who
> know exactly what they want/need in terms of special behaviors or performance.

But that's not what dict views are about. They ar about making the
mapping API easier for *all* users. (Anyway, Greg Ewing already shot
this down.)

> And, if we take the collections module route, there is no reason that it cannot
> be put into Py2.6 where people will either flock to it or ignore it, with either
> result providing us with good guidance for Py3.0.

Dict views can easily be added to 2.6 by using different method names
that can be automatically converted by the 2to3 converter. E.g.
d.viewkeys(), d.viewitems(), d.viewvalues(). The implementation should
plug right in. (Anthony and Thomas also have some more advanced ideas
on how to make keys/items/values return views when used in a module
declaring "from __future__ import dict_views".)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


More information about the Python-3000 mailing list