[Python-3000] Thoughts on dictionary views
Guido van Rossum
guido at python.org
Wed Feb 21 04:32:19 CET 2007
On 2/20/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> I was "eh, why bother?" prior to reading the updated PEP 3106, but now
> can see the benefit to keys(), values(), and items() returning views.
> I'm not sure I would use the added features (I don't believe I've ever
> compared the equalities of keys or values of two dictionaries separately,
> and I tend to stick to .iter*() methods), but I can also see how it
> would be useful to some users.
Thanks for the (relative) vote of confidence. Was it an update I made
to the PEP, or did you not read it at all before?
> "Guido van Rossum" <guido at python.org> wrote:
> > On 2/20/07, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > > The speed costs may become negligible, but I believe the main concern
> > > here is memory consumption (minimising memory usage is certainly the
> > > only reason I've ever made sure to use the dict.iter* methods).
> > As I said, it's still O(N) time and space, vs. O(1) for creating a view.
> But that's only if the .keys(), .values(), .items() produced actual sets
> versus producing a view.
Which (producing an actual set) is Nick's (and Raymond's) proposal.
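To make the O(1)-view versus O(N)-set distinction concrete, here is a minimal sketch of how keys views ended up behaving in Python 3: creating the view copies nothing, the view tracks later mutations, and an explicit set() call is the O(N) snapshot. (The variable names are illustrative, not from the PEP.)

```python
d = {'a': 1, 'b': 2}
ks = d.keys()          # a view: O(1) to create, no copying
d['c'] = 3             # later mutations show through the view
assert 'c' in ks

# keys views are set-like, so set operations work directly on them
assert ks & {'a', 'c'} == {'a', 'c'}

# an explicit set copy is O(N) and is a frozen snapshot
snapshot = set(d.keys())
d['d'] = 4
assert 'd' in ks and 'd' not in snapshot
```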
> In the case of .values(), I don't see how one
> can do *any* better than O(n) for a.values() == b.values() (or
> really O(nlogn) for comparable objects, and O(n^2) when they are not).
The PEP's algorithm for comparing values is O(N**2); I'm not sure it's
worth attempting to optimize it, since I'm not aware of any use case;
but it still seems better to do this than to compare values views by
object identity.
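One way the O(N**2) values comparison under discussion could look, assuming only that elements support == (no hashing, no ordering), is a pairwise multiset match. This is a sketch of the idea, not code from the PEP; the function name is invented for illustration:

```python
def values_equal(a, b):
    """Multiset equality using only ==: O(n**2) worst case.

    Each element of a must be matched against a distinct,
    not-yet-consumed element of b.
    """
    if len(a) != len(b):
        return False
    remaining = list(b)
    for x in a:
        for i, y in enumerate(remaining):
            if x == y:
                del remaining[i]   # consume the matched element
                break
        else:
            return False           # x has no match left in b
    return True

# same values under different keys compare equal as multisets
assert values_equal({'a': 1, 'b': 2}.values(),
                    {'x': 2, 'y': 1}.values())
```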
> There are going to be special cases that ruin performance with all 3
> options (use Python 2.x equivalent .iter*(), use a view, use a set
> variant). While I can *see* the use of views, I can also see the
> benefit with just renaming .iter*() as .*() .
Name a special case that ruins performance with either PEP 3106 or
renaming .iter*() to .*()?
Methinks that both views and iterators can be optimal, at least for
.keys() and .items(), certainly in terms of O() notation; there may
be edge cases where many, many lookups are done, where making a copy
into a set could be faster, but that requires the total # of lookups
to be much larger than N.
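As a sketch of how these comparisons settled out in today's CPython (not something decided in this thread): keys and items views compare with set semantics in O(len) expected time, while values views, which got no rich comparison of their own, fall back to object identity, so two views with equal contents still compare unequal.

```python
a = {'x': 1, 'y': 2}
b = {'y': 20, 'x': 10}

# keys and items views compare by set semantics
assert a.keys() == b.keys()
assert a.items() != b.items()

# values views compare by identity in CPython, so even equal
# contents (or two views of the same dict) are not ==
assert a.values() != b.values()
assert a.values() != a.values()
```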
--Guido van Rossum (home page: http://www.python.org/~guido/)