[Python-Dev] API for binary operations on Sets
debatem1 at gmail.com
Thu Sep 30 06:11:50 CEST 2010
On Wed, Sep 29, 2010 at 8:50 PM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
> I would like to solicit this group's thoughts on how to reconcile the Set
> abstract base class with the API for built-in set objects
> (see http://bugs.python.org/issue8743 ). I've been thinking about this
> issue for a good while and the RightThingToDo(tm) isn't clear.
> Here's the situation:
> Binary operators for the built-in set object restrict their "other" argument
> to instances of set, frozenset, or one of their subclasses. Otherwise,
> they return NotImplemented. This design was intentional (i.e. it was part of
> the original pure Python version, it is unit-tested behavior, and it is a
> documented restriction). It allows other classes to "see" the
> NotImplemented and have a chance to take over using __ror__, __rand__, etc.
> Also, by not accepting any iterable, it prevents little coding
> atrocities or possible mistakes like "s | 'abc'". This is a break with what
> is done for lists (Guido has previously lamented that list.__add__ accepting
> any iterable is one of his "regrets"). This design has been in place for
> several years and so far everyone has been happy with it (no bug reports,
> feature requests, or discussions on the newsgroup, etc). If someone needed
> to process a non-set iterable, the named set methods (like intersection,
> update, etc) all accept any iterable value and this provides an immediate,
> usable alternative.
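
A small sketch of the behaviour described in the paragraph above, as I understand it. The Tagged class is hypothetical and exists only to show the reflected-operator hand-off; it is not part of any library.

```python
s = {1, 2}

# The operator method declines a non-set operand rather than raising itself.
assert s.__or__("abc") is NotImplemented

# str has no reflected method, so the operator form ends in TypeError.
try:
    s | "abc"
except TypeError:
    operator_rejected = True
else:
    operator_rejected = False

# The named method accepts any iterable.
union_result = s.union("abc")


class Tagged:
    """Hypothetical class that takes over through the reflected operator."""

    def __init__(self, items):
        self.items = set(items)

    def __ror__(self, other):
        # Runs only because set.__or__ returned NotImplemented.
        return Tagged(self.items | set(other))


combined = s | Tagged([3])
```

The last line is the case the NotImplemented return value is meant to protect: the right-hand class gets to decide what the result is.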
> In contrast, the Set and MutableSet abstract base classes in Lib/_abcoll.py
> take a different approach. They specify that something claiming to be
> set-like will accept any iterable for a binary operator (IOW, the builtin
> set object does not comply). The provided mixins (such as __or__, __and__,
> etc) are implemented that way and it works fine. Also, the Set and
> MutableSet APIs do not provide named methods such as update, intersection,
> difference, etc. They aren't really needed because the operator methods
> already provide the functionality and because it keeps the Set API to a
> reasonable minimum.
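
To make the ABC side concrete, here is a minimal sketch. It assumes today's import path, collections.abc, rather than the Lib/_abcoll.py module named above, and ListBasedSet is just an illustrative class that supplies the three abstract methods.

```python
from collections.abc import Set


class ListBasedSet(Set):
    """Illustrative set-like class; the operators come from the Set mixins."""

    def __init__(self, iterable=()):
        self.elements = []
        for value in iterable:
            if value not in self.elements:
                self.elements.append(value)

    def __iter__(self):
        return iter(self.elements)

    def __contains__(self, value):
        return value in self.elements

    def __len__(self):
        return len(self.elements)


a = ListBasedSet([1, 2, 3])

# The mixin operators take any iterable, not just other sets.
union = a | [3, 4]
common = a & "abc"

# The ABC does not supply named methods such as union.
has_named_union = hasattr(a, "union")
```

Here the mixin operators accept a plain list and a string without complaint, which is exactly where this differs from the builtin set.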
> All of this is well and good, but the two don't interoperate. You can't get
> an instance of the Set ABC to work with a regular set, nor do regular sets
> comply with the ABC. These are problems because they defeat some of the
> design goals for ABCs.
> We have a few options:
> 1. Liberalize setobject.c binary operator methods to accept anything
> registered to the Set ABC and add a backwards incompatible restriction to
> the Set ABC binary operator methods to only accept Set ABC instances (they
> currently accept any iterable).
> This approach has a backwards incompatible tightening of the Set ABC, but
> that will probably affect very few people. It also has the disadvantage of
> not providing a straightforward way to handle general iterable arguments
> (either the implementer needs to write named binary methods like update,
> difference, etc for that purpose or the user will need to cast the
> iterable to a set before operating on it). The positive side of this
> option is that it keeps the current advantages of the setobject API and its
> NotImplemented return value.
> 1a. Liberalize setobject.c binary operator methods, restrict SetABC
> methods, and add named methods (like difference, update, etc) that accept
> any iterable.
> 2. We could liberalize builtin set objects to accept any iterable as an
> "other" argument to a binary set operator. This choice is not entirely
> backwards compatible because it would break code depending on being able to run
> __ror__, __rand__, etc after a NotImplemented value is returned. That being
> said, I think it unlikely that such code exists. The real disadvantage is
> that it replicates the problems with list.__add__ and Guido has said before
> that he doesn't want to do that again.
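
As a rough illustration of what option 1a could look like from the ABC side, here is a sketch. StrictSet and its union method are hypothetical names of my own, not anything proposed in the issue: the operator only accepts Set instances, and the named method accepts any iterable.

```python
from collections.abc import Set


class StrictSet(Set):
    """Hypothetical set-like class following the option 1a idea."""

    def __init__(self, iterable=()):
        self._items = []
        for value in iterable:
            if value not in self._items:
                self._items.append(value)

    def __iter__(self):
        return iter(self._items)

    def __contains__(self, value):
        return value in self._items

    def __len__(self):
        return len(self._items)

    def __or__(self, other):
        # Decline anything that is not registered as a Set.
        if not isinstance(other, Set):
            return NotImplemented
        return self.union(other)

    __ror__ = __or__

    def union(self, iterable):
        # Named method: any iterable is acceptable here.
        return self._from_iterable(e for src in (self, iterable) for e in src)


merged = StrictSet([1, 2]) | {2, 3}       # builtin set counts as a Set
reflected = {2, 3} | StrictSet([1, 2])    # set declines, StrictSet.__ror__ runs
named = StrictSet([1]).union("ab")        # plain iterable via the named method

try:
    StrictSet([1]) | "ab"
except TypeError:
    operator_rejected = True
else:
    operator_rejected = False
```

This keeps the NotImplemented hand-off and still leaves a way to combine with arbitrary iterables, at the cost of a larger API for implementers.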
> I was leaning towards #1 or #1a and the guys on IRC thought #2 would be
> better. Now I'm not sure and would like additional input so I can get this
> bug closed for 3.2. Any thoughts on the subject would be appreciated.
I'm not clear on what the issues with list.__add__ were, but my first
impression is to lean towards #2. What am I missing?
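
For reference, this is the list behaviour as far as I can reproduce it in current CPython: plain + refuses a non-list, while the in-place form (+=, i.e. list.__iadd__) extends from any iterable. I'm assuming the second case is the kind of accident being referred to.

```python
items = [1, 2]

# Plain + insists on another list.
try:
    items + "ab"
except TypeError:
    plus_rejected = True
else:
    plus_rejected = False

# The in-place form quietly extends the list from any iterable.
items += "ab"
```

After the last line, items holds the two characters as separate elements, which is rarely what was intended.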
> P.S. I also encountered a small difficulty in implementing #2 that would
> still need to be resolved if that option is chosen.
What's the issue, if you don't mind me asking?