Copying objects style questions

Wed Aug 6 03:56:26 EDT 2003

In dnspython I have a set class, SimpleSet.  (I don't use Python 2.3's
sets.Set class so I can keep supporting Python 2.2, and because the
objects in my sets are mutable).  The SimpleSet class has a single
attribute, "items", which is a list.  (I know a list is not going to
lead to fast set operations in general, but my typical set has only
one or two elements in it, so the potential performance issues don't
really matter for my needs.)

I then subclass SimpleSet to make other kinds of sets, e.g. RRset
subclasses Rdataset which subclasses SimpleSet.  RRsets and Rdatasets
each add additional attributes.

I want to have a copy operation which is an "almost shallow" copy.
Specifically, all of the attributes of the object may be shallow
copied except for one, the 'items' list of the SimpleSet, for which I
want a new list containing references to the same elements, so that
the user of the copy may add or remove elements subsequently without
affecting the original.

I can't use copy.copy()'s default behavior, because it is too shallow:
the copy would share the 'items' list with the original.  I don't want
to use copy.deepcopy() because it's too deep.  I contemplated __copy__,
__getinitargs__, __getstate__, and __setstate__, but they didn't seem
to fit the bill, or seemed more complicated than the solution I ended
up with (see below).
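
To make the "too shallow" problem concrete, here is a minimal sketch.
The SimpleSet below is a stripped-down stand-in written only for
illustration, not the dnspython class:

```python
import copy


class SimpleSet(object):
    # Hypothetical stand-in: just enough to show the aliasing problem.
    def __init__(self):
        self.items = []


original = SimpleSet()
original.items.append('a')

# The default copy.copy() creates a new instance but reuses the
# attribute values, so both objects refer to the same list.
shallow = copy.copy(original)
shallow.items.append('b')

print(shallow.items is original.items)  # True
print(original.items)                   # ['a', 'b']
```

Adding to the copy changes the original too, which is exactly what
the copy operation is supposed to prevent.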

I can, of course, write my own copy() method, but I don't want to
require each subclass of SimpleSet to write a copy() method which
implements the entire copying effort.  Rather, I'd like cooperating
superclasses: I'd like RRset to copy the name, and then let Rdataset
copy its attributes, and then let SimpleSet do the copy of the items
attribute.

My first solution was like clone() in Java:

        In SimpleSet:

                def copy(self):
                    """Make a (shallow) copy of the set.

                    There is a 'copy protocol' that subclasses of
                    this class should use.  To make a copy, first
                    call your super's copy() method, and use the
                    object returned as the new instance.  Then
                    make shallow copies of the attributes defined
                    in the subclass.

                    This protocol allows us to write the set
                    algorithms that return new instances
                    (e.g. union) once, and keep using them in
                    subclasses.
                    """

                    cls = self.__class__
                    # I cannot call self.__class__() because the
                    # __init__ method of the subclasses cannot be
                    # called meaningfully with no arguments
                    obj = cls.__new__(cls)
                    obj.items = list(self.items)
                    return obj

        In Rdataset, which subclasses SimpleSet:

                def copy(self):
                    obj = super(Rdataset, self).copy()
                    obj.rdclass = self.rdclass
                    obj.rdtype = self.rdtype
                    obj.covers = self.covers
                    obj.ttl = self.ttl
                    return obj
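
The pieces above can be put together into a self-contained sketch that
runs on its own.  The constructor signatures, the add() method, the
RRset.copy() method, and the sample values are all assumptions made
for illustration; they are not dnspython's actual API:

```python
class SimpleSet(object):
    def __init__(self):
        self.items = []

    def add(self, item):
        # Hypothetical helper: keep set semantics on top of a list.
        if item not in self.items:
            self.items.append(item)

    def copy(self):
        cls = self.__class__
        # Bypass __init__, which subclasses define with required arguments.
        obj = cls.__new__(cls)
        # New list, same element objects.
        obj.items = list(self.items)
        return obj


class Rdataset(SimpleSet):
    def __init__(self, rdclass, rdtype, covers=None, ttl=0):
        super(Rdataset, self).__init__()
        self.rdclass = rdclass
        self.rdtype = rdtype
        self.covers = covers
        self.ttl = ttl

    def copy(self):
        obj = super(Rdataset, self).copy()
        obj.rdclass = self.rdclass
        obj.rdtype = self.rdtype
        obj.covers = self.covers
        obj.ttl = self.ttl
        return obj


class RRset(Rdataset):
    def __init__(self, name, rdclass, rdtype, covers=None, ttl=0):
        super(RRset, self).__init__(rdclass, rdtype, covers, ttl)
        self.name = name

    def copy(self):
        # Each level copies only the attributes it defines.
        obj = super(RRset, self).copy()
        obj.name = self.name
        return obj


rrset = RRset('example.', 'IN', 'A', ttl=300)
rrset.add('10.0.0.1')

dup = rrset.copy()
dup.add('10.0.0.2')

print(rrset.items)                            # ['10.0.0.1']
print(dup.items)                              # ['10.0.0.1', '10.0.0.2']
print(type(dup).__name__, dup.name, dup.ttl)  # RRset example. 300
```

Because SimpleSet.copy() uses self.__class__, the object returned at
the bottom of the chain is already an RRset, and each subclass only
has to fill in its own attributes.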

I've also noticed that if I just make SimpleSet subclass list instead
of having an "items" attribute, then "the right thing" happens with
copy.copy().  I'm a little leery of subclassing the base types,
because although I've done it to good effect sometimes, I've also had
it cause odd problems because the built-in types behave a little
differently than new-style classes in some cases.  Also, at least in
this case, it fails the is-a test.  A set is *not* a list; the fact
that I'm using a list is an implementation detail that I might not
want to expose.
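
A small sketch of that observation, assuming a hypothetical ListSet
class with one extra attribute.  With a list subclass, copy.copy()
builds a new instance, shallow-copies the instance attributes, and
adds the same elements to the new list, so the copy can be modified
independently:

```python
import copy


class ListSet(list):
    # Hypothetical class: the set's elements live in the list itself.
    def __init__(self, rdclass):
        super(ListSet, self).__init__()
        self.rdclass = rdclass


original = ListSet('IN')
original.append('a')

dup = copy.copy(original)
dup.append('b')

print(list(original))                    # ['a']
print(list(dup))                         # ['a', 'b']
print(type(dup).__name__, dup.rdclass)   # ListSet IN
```

The cost is the one described above: every list method (sort, insert,
indexing, and so on) becomes part of the set's public interface.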

So, what advice do Python experts have for this kind of situation?
Should I keep the first solution?  Should I subclass list in spite of
my misgivings?  Is there some other, more elegant solution I've
missed?

Thanks in advance!

/Bob