Dictionary .keys() and .values() should return a set [with Python 3000 in mind]

Nick Vatamaniuc vatamane at gmail.com
Sat Jul 1 17:05:06 EDT 2006


> 1 - golden handcuffs.  Breaking old code is bad 90% of the time
I agree with you on  that, mostly for code that counted on list methods
of result of keys() - like you later show with sort.   But there is a
side note:  old code that assumed a particular ordering of the keys or
values is broken anyway.  So even if ks={1:0,2:0,3:0}.keys() returns
[1,2,3] on my machine I should not do something like
'my_savings_account + ks[0]' That code should be fixed anyway, since on
a different machine it might produce different values for ks[0].

> 2 - creating a set MAY be slower.
Creating a set from the dictionary's keys should not be a lot slower
because the keys are already unique, there is no need to check each key
against the other keys just return them as a set.  I assume this is
what you meant by "make the set hashmap have the same dimensions and
rules as the dictionary one".  Perhaps a dictionary would internally
just copy its  keys to the set and return it rather than construct as
set from scratch (with duplication checks and all).

>3 - using a set is sometimes slower
Again, depending how it is used.  In your case you argue that you
usually sort the keys anyway so a list is more convinient.  But
different use cases can call for differnent operations on the keys()
after they have been retrieved. What if someone wants to do an
intersection to find common keys with another dictionary, then a set
would be more appropriate. The intent of the set() type was to not temp
anyone into assuming an ordering of keys() just because a list is
indexable. And eliminate the need for a long footnote in the
documentation of the dict type that talks about 'an arbitrary
non-random ordering' - it takes while just to grasp what that means...

In general I believe that a small performance penalty is acceptable in
order to have a correct and consistent data type, especially for Python
i.e. I might not argue the same for Perl or C.

-Nick V.

cmdrrickhunter at yaho.com wrote:
> There's a few good reasons.
> 1 - golden handcuffs.  Breaking old code is bad 90% of the time
> 2 - creating a set MAY be slower.
>
> Python's sets seem to imply to that they will always be a hash map.  in
> this case, some creative hash map "mapping" could allow one to create a
> set without calculating hash codes (make the set hashmap have the same
> dimentions and rules as the dictionary one).
> If there was intent to allow Python implementations to use trees for
> the set, then a list is far faster to create (O(n) time instead of
> O(nlogn)).
>
> 3 - using a set is sometimes slower (just as using a list is sometimes
> slower)
> I can't speak for your code, but this is the most common use of keys in
> my coding:
> # d is some dictionary
> keys = d.keys()
> keys.sort()
> for k in keys:
>   #blah
>
> sets cannot be sorted, while lists can.  If keys() returned a set, then
> I'd have to turn it into a list every time.
>
> There's potential to add "views" to python (a key view of a dictionary
> being a frozenset containing the dictionary's keys, which is updated
> whenever the dictionary updates), but I think thats annother topic
> which is out of the scope of your origional idea.




More information about the Python-list mailing list