[Python-3000] PEPs 3106 and 3119

Levi levi at gis.net
Wed Aug 20 21:45:24 CEST 2008


Recently I've taken some interest in PEP 3106 (revamping dict.keys, 
etc.) and wrote a rough draft of an experimental implementation in 
Python. Along the way I noticed a few things that I think need some 
discussion. A lot of them have to do with the interaction with PEP 3119 
(Abstract Base Classes).

While 3119 clearly incorporates most of the ideas in 3106, the reverse 
isn't the case (which I suppose makes sense, as 3106 is older). However, 
there are some inconsistencies I've noticed.

In particular, 3119 specifies that Mapping.values() should return a 
Sized, Iterable, Container, but Guido's code in 3106 suggests that 
equality (and possibly other operations) should be defined as well. In 
this particular case, I think there really ought to be a more clearly 
defined collection interface for the result of Mapping.values, 
especially if operations beyond those of Sized, Iterable, and Container 
are to be supported.

Another thing I noticed is that 3106 assumes that set, frozenset, and 
the various dictionary view objects are the only set-like objects 
dictionary views will need to interact with, while 3119 implies that 
they really ought to play nice with any Set. To that end I implemented a 
base Set class that defines the set operations in terms of only __len__, 
__iter__, and __contains__. The keys and items classes inherit this 
behavior, so they can interact with anything supporting the Set interface.
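
To give a rough idea of the approach, here is a minimal sketch (not the 
actual code from my implementation). The names SimpleSet and KeysView are 
made up for illustration, and using collections.abc.Set just to recognise 
set-like operands is an assumption of this sketch:

```python
from collections.abc import Set  # used only to recognise set-like operands


class SimpleSet:
    """Hypothetical base class: subclasses supply __len__, __iter__, __contains__."""

    def __eq__(self, other):
        if not isinstance(other, Set):
            return NotImplemented  # let the other operand have a go
        return len(self) == len(other) and all(x in other for x in self)

    def __le__(self, other):
        if not isinstance(other, Set):
            return NotImplemented
        return len(self) <= len(other) and all(x in other for x in self)

    def __ge__(self, other):
        if not isinstance(other, Set):
            return NotImplemented
        return len(self) >= len(other) and all(x in self for x in other)

    def __and__(self, other):
        if not isinstance(other, Set):
            return NotImplemented
        return {x for x in self if x in other}

    def __or__(self, other):
        if not isinstance(other, Set):
            return NotImplemented
        return {*self, *other}

    # Intersection and union are symmetric, so the reflected forms can be shared.
    __rand__ = __and__
    __ror__ = __or__


# Make SimpleSet (and its subclasses) count as a Set for isinstance checks.
Set.register(SimpleSet)


class KeysView(SimpleSet):
    """Hypothetical keys view: only the three primitive methods are defined."""

    def __init__(self, mapping):
        self._mapping = mapping

    def __len__(self):
        return len(self._mapping)

    def __iter__(self):
        return iter(self._mapping)

    def __contains__(self, key):
        return key in self._mapping
```

Everything beyond the three primitives comes from the base class, so a 
view written this way never needs to know the concrete type of the other 
operand.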

However, there is one major wrinkle with this solution. The builtin set 
and frozenset types don't play nice with others. In particular, __eq__ 
and other such operations don't return NotImplemented when they ought 
to. Therefore, even though my view objects have suitably generic methods 
that could be used instead, they won't get called (in some cases) 
because set returns improper results. This leads to some strange 
behavior, such as this:

 >>> d = dict(one=1, two=2, three=3)
 >>> s = set(('one', 'two', 'three'))
 >>> d.keys() == s
True
 >>> s == d.keys()
False

Now, I understand that set doesn't return NotImplemented to avoid having 
its __cmp__ method called, but what I don't get is why it has a __cmp__ 
method at all. I thought the entire point of set and co. using the rich 
comparison operators is that Sets only define a partial ordering and so 
shouldn't define __cmp__, which implies a total ordering. So why define 
a __cmp__ method that only raises an error at the expense of breaking 
the rich comparison operators?
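
The dispatch problem can be shown in isolation. The sketch below uses 
three made-up classes (StrictBag, PoliteBag and GenericView are 
hypothetical names, not real types): one whose __eq__ answers False for 
foreign types, one that returns NotImplemented, and a view with a 
duck-typed __eq__:

```python
class StrictBag:
    """Hypothetical container whose __eq__ answers False for foreign types."""

    def __init__(self, items):
        self.items = frozenset(items)

    def __eq__(self, other):
        if not isinstance(other, StrictBag):
            return False  # swallows the comparison; the other side is never asked
        return self.items == other.items


class PoliteBag:
    """Hypothetical container that defers to the other operand."""

    def __init__(self, items):
        self.items = frozenset(items)

    def __eq__(self, other):
        if not isinstance(other, PoliteBag):
            return NotImplemented  # Python then tries the reflected other.__eq__
        return self.items == other.items


class GenericView:
    """Hypothetical view with a duck-typed __eq__."""

    def __init__(self, items):
        self.items = frozenset(items)

    def __eq__(self, other):
        try:
            return self.items == frozenset(other.items)
        except AttributeError:
            return NotImplemented


view = GenericView("abc")
strict = StrictBag("abc")
polite = PoliteBag("abc")

view_first = (view == strict)     # GenericView.__eq__ runs and compares contents
strict_first = (strict == view)   # StrictBag.__eq__ answers on its own
polite_first = (polite == view)   # falls through to GenericView.__eq__
```

Only the strict variant produces the asymmetric result, which is the 
same shape as the d.keys() versus s example above.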

While a dict's keys view is guaranteed to only have hashable elements 
and can return a set (or frozenset) from its set operations (union et 
al), the items view cannot. The solution (or at least the semantics 
thereof) supplied by Guido in PEP 3106 is to construct a new dict and 
return its items view. What I did instead was to create a (poor, list 
based) Set type object to return instead. I think that the final 
implementation should do something like this; however, the returned type 
should be implemented as efficiently as is reasonable and be a new 
standard builtin type. What said type should be called and how to 
implement it in a performant manner, I'm not so sure about. Any suggestions?

Additionally, while dict's keys are Hashable, another Mapping type's 
keys may not be, so such a Set type would also be useful in that case.
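
For reference, the kind of list-based Set type described above could look 
roughly like the following. This is a sketch only: the name ListSet is 
hypothetical, it is not the class from my implementation, and every 
membership test is linear, which is exactly the performance problem 
mentioned:

```python
class ListSet:
    """Hypothetical list-backed set: elements need only support ==, not hashing."""

    def __init__(self, iterable=()):
        self._items = []
        for item in iterable:
            if item not in self._items:  # linear scan instead of a hash lookup
                self._items.append(item)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

    def __contains__(self, item):
        return item in self._items

    def __eq__(self, other):
        if not isinstance(other, ListSet):
            return NotImplemented
        return len(self) == len(other) and all(x in other for x in self)

    def __and__(self, other):
        # Keep the elements of self that also occur in other.
        return ListSet(x for x in self if x in other)

    def __or__(self, other):
        # The constructor drops duplicates, so concatenation is enough.
        return ListSet(list(self) + list(other))

    def __repr__(self):
        return "ListSet(%r)" % (self._items,)


# Items whose values are lists cannot go into a builtin set.
a = {"x": [1], "y": [2]}
b = {"y": [2], "z": [3]}
common = ListSet(a.items()) & ListSet(b.items())
combined = ListSet(a.items()) | ListSet(b.items())
```

A faster version would presumably need some ordering or partial hashing 
of the elements, which is where I'd welcome ideas.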

While implementing my (pseudo) PEP 3119 Set base class, I noticed 
something strange. The specification states that issuperset and issubset 
aren't included, because they're basically just synonyms for __ge__ and 
__le__. However, no mention is made of union and the other named 
operations, which are basically in the same boat. The asymmetry seems 
a little odd to me.

The actual semantic difference between the implicit methods and the 
explicit ones is that the explicit ones will coerce any (suitable) 
Iterable into the required Set type, while the implicit ones will not. 
Whether such operations should be required is up for debate, but I find 
it odd that some of the named operations (is...set) are removed while 
the others (union etc.) are not. What is the reasoning for this?
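
The coercion difference is easy to see with the builtin set. The snippet 
below is just an illustration using a plain list as the non-Set iterable; 
the variable names are arbitrary:

```python
s = {1, 2}

# Named methods accept any iterable and coerce it.
union_result = s.union([2, 3])
subset_result = s.issubset([1, 2, 3])

# The operator forms insist on a real set on both sides.
try:
    s | [2, 3]
    operator_union_ok = True
except TypeError:
    operator_union_ok = False

try:
    s <= [1, 2, 3]
    operator_subset_ok = True
except TypeError:
    operator_subset_ok = False
```

So issubset and union differ from their operator forms in the same way, 
which is why dropping one family but not the other looks inconsistent.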

Finally, you can check out my implementation at 
"http://gis.net/~levi/code/py3k/". It's not the most presentable code at 
the moment, and no doubt has all sorts of missing functionality, poor 
performance, and insufficient test coverage, but it's something at least. 
It provides an implementation of most of PEP 3106 and a tiny subset of 
3119 and a small (unittest based) test suite. Unfortunately it's 
undocumented, but it should be pretty easy to follow. I'll be working on 
improving it sometime soon. In the meantime any suggestions, critiques, 
or comments will be greatly appreciated.

Thanks in Advance,
       Levi Aho


