[Python-3000] Thoughts on dictionary views

Tue Feb 20 09:55:57 CET 2007

The Java concept of dictionary views seems to have caught on here while I wasn't
looking.  At the risk of covering some old ground, I would like to re-open the
question.  Here are a few thoughts on the subject to kick off the discussion:

* Maintaining a live (self-updating) view is a bit tricky from an implementation 
point-of-view.  While it is clearly doable for dictionaries, it is not clear 
that it is a good idea for a general mapping API which can be wrapped around 
dbms, shelves, elementtrees, b-trees, and other wrascally rabbits.  I doubt that 
the underlying structures of other mapping types support the observer pattern 
necessary to keep views updated -- this is doubly true if the underlying data is 
on disk and can be updated by other processes, threads, etc.
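
To make the distinction above concrete, here is a rough sketch of a snapshot
versus a live view over a plain dict.  The LiveKeys class is hypothetical and
for illustration only, not a proposed API.  It works here only because the view
can cheaply re-read the in-memory dict on every access; a mapping backed by a
file that other processes modify has no comparably cheap way to do so.

```python
class LiveKeys:
    """Hypothetical live view of a mapping's keys (illustration only)."""

    def __init__(self, mapping):
        self._mapping = mapping      # keep a reference, not a copy

    def __iter__(self):
        return iter(self._mapping)   # re-reads the mapping on every iteration

    def __len__(self):
        return len(self._mapping)

    def __contains__(self, key):
        return key in self._mapping


d = {"a": 1, "b": 2}
snapshot = list(d)        # copied once; unaffected by later changes to d
live = LiveKeys(d)        # always reflects the current state of d

d["c"] = 3
snapshot_after = sorted(snapshot)   # still only the original two keys
live_after = sorted(live)           # includes the newly added key
```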

* One of the purported benefits is to provide set-like behavior without the 
expense of copying to a new set object.  FWIW, I've updated the set 
implementation to be more interoperable with dictionaries so that the conversion 
costs are negligible (about the same as a dict resize operation -- one pass, no 
calls to PyObject_Hash, insertion into a presized, sparse table with very few 
collisions).
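
For reference, the explicit conversion mentioned above looks roughly like this.
The dictionaries and variable names are made up for the example; the only calls
used are the set and frozenset constructors and the ordinary set operators.

```python
d = {"red": 1, "green": 2, "blue": 3}
e = {"green": 20, "blue": 30, "yellow": 40}

keys_d = set(d)            # explicit conversion of the keys to a set
keys_e = frozenset(e)      # immutable variant, built the same way

common = keys_d & keys_e       # keys present in both dicts
only_in_d = keys_d - keys_e    # keys present in d but not in e
```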

* A dict is also one of Python's most basic APIs (along with lists).  Ideally, 
we should keep those two APIs as simple as possible (getting rid of setdefault() 
and unneeded methods is a step in the right direction).  IMO, the views will be 
the hardest part of the API to explain and interact with when learning the 
language -- to learn about dicts and lists, you already have to learn about 
mutability and hashability -- it doesn't help this situation if you then need to 
learn about self-updating views whose entries can be deleted or have their
values modified but cannot be added, and that have their own set-like
operations but aren't really sets . . .

* ISTM that views offer three benefits:  re-iterability, set behavior, and 
self-updates.  IMO, the first is not commonly needed and is trivially served by 
writing list(mydict.items()) or somesuch.  The second is best served by an 
explicit conversion to a set or frozenset type -- those two types have been 
enormously successful in that they seem to offer a near zero learning curve --  
people seem to intuitively know how to use them right out of the box.  As long 
as that conversion is fast, I think the explicit conversion is the way to go --  
it is the way you would do it with any other Python type where you wanted set 
behavior.  Adding a handful of set methods to dict views would only complicate 
an otherwise simple situation and introduce unnecessary complexity (e.g. what
should isinstance(d.d_keys, set) return?).  The third benefit (self-updates) is
more interesting and does not have a direct analog with existing Python tools,
so the question is how valuable is self-updating behavior and are there 
compelling use cases that warrant a more complex API?
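
As a small illustration of the re-iterability point, assume items() handed back
a one-shot iterator (emulated below with iter(), which is my stand-in and not
part of any proposal).  Wrapping the result in list() is all it takes to get
something that can be looped over more than once.

```python
mydict = {"x": 1, "y": 2}

one_shot = iter(mydict.items())   # stand-in for a one-shot iterator
first_pass = sorted(one_shot)     # consumes the iterator
second_pass = sorted(one_shot)    # nothing left to iterate over

pairs = list(mydict.items())      # explicit copy; can be re-iterated freely
first_copy_pass = sorted(pairs)
second_copy_pass = sorted(pairs)
```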

My recommendation is to take a more conservative route.  Let's make dicts as 
simple as possible and then introduce a new collections module entry with the 
views bells and whistles.  If the collections version proves itself as 
enormously popular, useful, understandable, and without a good equivalent, then 
it can ask for a promotion.  The collections module is a nice place to put in
alternate datatypes that meet the more demanding needs of advanced users who 
know exactly what they want/need in terms of special behaviors or performance. 
And, if we take the collections module route, there is no reason that it cannot 
be put into Py2.6 where people will either flock to it or ignore it, with either 
result providing us with good guidance for Py3.0.

my-two-cents,

Raymond