[Python-3000] Need help completing ABC pep

Jim Jewett jimjjewett at gmail.com
Thu Apr 26 02:19:03 CEST 2007


On 4/25/07, Guido van Rossum <guido at python.org> wrote:
> On 4/21/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> [PEP 3119]
> > > > * "Should we also implement the issubset and issuperset methods found
> > > > on the set type in Python 2? As these are just aliases for __le__ and
> > > > __ge__, I'm tempted to leave these out."
> [Brett]
> > > > Leave them out.  Not terribly needed plus it is better to start out
> > > > small.  They can easily be added later if called for.

> > I think the names are more sensible than repurposing the numeric
> > operators.  The "concrete" implementation can just forward to the
> > other name, so it doesn't cost much.

> But the numeric operators are what everybody uses. So it makes more
> sense to define these, and only these.

ahh... because you want to ensure that implementors override the correct one.
If you defined

    def union(self, other):  return self.__or__(other)

and I overrode union but not __or__, things would be inconsistent.
But that's still true even if you don't define union in the ABC.
Maybe add a sanity check in the actually-doing-something default
implementation?

    def __or__(self, other):
        # If they overrode union but not __or__, honor the override.
        # Compare via the class; a bound method is a fresh object on each access.
        if type(self).union is not __this_class__.union:
            return self.union(other)
        new = set(self)
        new.update(other)
        return frozenset(new)
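
A runnable sketch of that sanity check, assuming Python 3 semantics.
`__this_class__` is not an existing name, so the sketch spells out the
class explicitly; `SetSketch` and `Tagged` are hypothetical names used
only for illustration.  The comparison goes through `type(self)` because
a bound method such as `self.union` is a new object on every access and
would never be identical to the class attribute.

```python
class SetSketch:
    """Hypothetical stand-in for the ABC's concrete default implementation."""

    def __init__(self, items=()):
        self._items = frozenset(items)

    def __iter__(self):
        return iter(self._items)

    def union(self, other):
        # The named method just forwards to the operator.
        return self.__or__(other)

    def __or__(self, other):
        # If a subclass overrode union but not __or__, honor the override.
        if type(self).union is not SetSketch.union:
            return self.union(other)
        new = set(self)
        new.update(other)
        return frozenset(new)


class Tagged(SetSketch):
    """Overrides union only; the operator should follow along."""

    def union(self, other):
        return ("tagged", frozenset(self) | frozenset(other))


plain_result = SetSketch([1]) | [2]
named_result = SetSketch([1]).union([2])
tagged_result = Tagged([1]) | [2]
```

The check only helps if the overriding `union` does not itself call
`__or__`; an override that forwards back to the operator would recurse.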


> > In the Hashable section:

> > > Another constraint is that hashable objects, once created,
> > > should never change their value (as compared by ==) or their hash
> > > value. If a class cannot guarantee this, it should not derive from
> > > Hashable; if it cannot guarantee this for certain instances only,
> > > __hash__ for those instances should raise a TypeError exception.

> > Why not just return -1 (unless/until the value is stable)?  Is the -1
> > special case being phased out too?

> I don't recall that -1 was ever a special case at the Python level;
> that was only done at the C level to make error checking easier.
> Returning any hash value at all implies a guarantee that the object
> value won't ever change (not just the hash value!); if you can't
> guarantee that (e.g. for hash(([], []))) then you shouldn't return a
> hash value. I don't think that hashing all objects with an unstable
> value together on -1 supports a valid use case.

If you want to enforce that, it is worth documenting.  As nearly as I
can tell, hash is used only by dict and set, and they always treat a
hash of -1 as unhashable, so it won't get into the dictionary (or
set).

To me, this suggested it was OK to change value (if both values have
the same hash), and it was OK to (once per object) change hash from -1
to a real value.

I have used this for objects where:
    (a)  The most significant attributes are immutable and known at creation.
    (b)  Additional attributes are not known at creation, so they may
         change from "None" to a value.
    (c)  Equality depends on all attributes, and therefore could --
         theoretically -- change.
    (d)  In practice, several objects might pass through the same
         (partially completed) value, but only one would be in that state
         at a time.  (In other words, the result of a==b was stable, even
         though the result of a<b wasn't.)

I based the hash on only the known-at-creation-time attributes; if
they were unequal, then the objects could never be equal, which was
all hash needed to tell me.  If the differences were in the later
attributes -- oh well; I got some extra hash collisions.
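
A minimal sketch of that pattern, assuming Python 3.  The class name
`Record` and its attributes (`key`, `name`, `detail`) are hypothetical;
the point is only that the hash covers the creation-time attributes
while equality covers everything.

```python
class Record:
    """Hash uses creation-time attributes; equality uses all attributes."""

    def __init__(self, key, name):
        self._key = key        # known at creation, never changed
        self._name = name      # known at creation, never changed
        self.detail = None     # filled in later, may change from None

    def __eq__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        return ((self._key, self._name, self.detail)
                == (other._key, other._name, other.detail))

    def __hash__(self):
        # Only the stable attributes take part, so the hash never changes.
        return hash((self._key, self._name))


a = Record(1, "x")
b = Record(1, "x")
equal_before = (a == b)
hash_before = hash(b)

b.detail = "filled in later"
equal_after = (a == b)
hash_after = hash(b)
```

Objects that differ only in `detail` share a hash, which costs a
collision but never makes equal objects hash differently.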

> What is your specific proposal here? Adding
> Searchable(Container) with the implication that __contains__ takes a
> sequence as well as a single value?

Yes.

> How should the type of the argument be described?

Sequence would be fine if you want to leave yourself room to change
your mind later.  I wouldn't bother to describe it at all.

    def __contains__(self, item):
        atomic = super(Searchable, self).__contains__(item)  # or __this_class__
        if atomic:
            return atomic
        try:
            it = iter(item)
        except TypeError:
            return False # It isn't iterable, so give up
        # Note that this doesn't assume Sequence, so order and repeats are
        # ignored: it would report "bbbbaa" in "ab" as True.
        return all(super(Searchable, self).__contains__(elt) for elt in it)
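
For anyone who wants to try it, here is a self-contained sketch of the
same idea, assuming Python 3.  `Bag` and `SearchableBag` are
hypothetical names; `Bag` stands in for whatever Container the
Searchable would extend.

```python
class Bag:
    """Hypothetical plain container: membership for single items only."""

    def __init__(self, items):
        self._items = list(items)

    def __contains__(self, item):
        return item in self._items


class SearchableBag(Bag):
    """Also accepts an iterable and checks each of its elements."""

    def __contains__(self, item):
        # Bind the parent check once; zero-argument super() is not
        # usable inside the generator expression below.
        base_contains = super().__contains__
        if base_contains(item):
            return True
        try:
            it = iter(item)
        except TypeError:
            return False  # Not iterable, so give up.
        return all(base_contains(elt) for elt in it)


bag = SearchableBag("ab")
single = "a" in bag
repeated = "bbbbaa" in bag
missing = "abc" in bag
not_iterable = 5 in bag
empty = [] in bag
```

One consequence of using all(): an empty iterable counts as contained,
since there is no element that fails the check.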


> > Why must a HashableSet or MutableSet be composable?  If this is
> > because you figure any useful set has those properties (and I don't
> > think so, when doing uniquification), then Set and ComposableSet
> > should become BasicSet and Set?

> Hmm, good point! This is in fact listed as an open issue for
> HashableSet. If we can find a good use case for non-composable sets
> that should nevertheless be hashable, we may end up with four classes:
> set, composable set, hashable set, composable hashable set. (I think
> you're hinting at an example; can you work it out a bit more?)

Yes, but not immediately.  :D

> > Should Sequence.index take optional start and stop arguments?

> Good question. I'm tempted to say no, and consider that a
> list-specific extension.

It is also a pretty important string method, unless it has been
deprecated in favor of find.
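
For reference, a short sketch of how the optional arguments already
behave on the built-in str and list types in Python 3 (the variable
names are just for illustration):

```python
text = "abcabc"
first = text.index("b")          # search from the beginning
later = text.index("b", 2)       # search from offset 2 onward
bounded = text.find("b", 2, 4)   # find reports a miss instead of raising

items = [1, 2, 1, 2]
item_later = items.index(2, 2)   # list.index accepts start (and stop) too

try:
    text.index("b", 2, 4)        # index raises when nothing is found
    raised = False
except ValueError:
    raised = True
```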

-jJ

