polymorphism &c (was Re: I come to praise .join, not to bury it...)

Alex Martelli aleaxit at yahoo.com
Tue Mar 6 05:02:12 EST 2001


"Russell E. Owen" <owen at astrono.junkwashington.emu> wrote in message
news:9811a8$k5k$1 at nntp6.u.washington.edu...
    [snip]
> > (long discussion about polymorphism omitted).
>
> I'm afraid I don't understand where the polymorphism comes in, so I
> cannot comment on that.

If you don't understand polymorphism in general, then I don't
see how you can comment on any OO design.

If you're instead saying that you don't see where polymorphism
can help for this _specific_ case, what's so mysterious about
the specific example I gave?  Here it is again, more or less,
with a bit more fluff around it:

good_things = (
    'money', 'bridge', 'champagne',
    'wine', 'women', 'song'
    )
def joined_niceties(things, joiner=', '):
    nice_things = [thing for thing in things if thing in good_things]
    return joiner.join(nice_things)

>>> words = 'programming money bridge spam champagne'.split()
>>> print joined_niceties(words)
money, bridge, champagne
>>> print joined_niceties(words, ' - ')
money - bridge - champagne

class MixedJoiner:
    def __init__(self, joiner1, joiner2):
        self.joiner1 = joiner1
        self.joiner2 = joiner2
    def join(self, sequence):
        return self.joiner2.join((
            self.joiner1.join(sequence[:-1]),
            sequence[-1]))

>>> print joined_niceties(words, MixedJoiner(', ', ' and '))
money, bridge and champagne


We have some code (here, function joined_niceties) that somehow
prepares or obtains a sequence (here, nice_things, prepared by
selection from an argument to the function via membership in a
global-data tuple) and must return the string obtained by joining
said sequence with an arbitrary joiner (here, argument joiner,
defaulting to ", " -- comma-then-blank, often a decent way to
join words into a string).

Thanks to the fully general polymorphism that joiner.join affords,
we can then use a *special* joiner-object to do fancy joining.

This imposes NO technical cost whatsoever on the code that does
the joining (here, function joined_niceties) -- indeed, such code
can have been written months ago, before we even had an inkling
that the joiner-object would ever be other than a string.  This is
the common magic of polymorphism -- code can be closed to modification
yet open for extension (here, this applies to joined_niceties --
we need not change it to obtain fancy-joining, it all comes for
free thanks to the polymorphism on the joiner-object!).

So, we code a suitable joiner-object -- here, one that conjoins
two other joiners, using one for all but the last join, the other
for the last join only.  This use, of course, easily generalizes:

>>> commas_then_and = MixedJoiner(", ", " and ")
>>> commas_and_even = MixedJoiner(commas_then_and, " or even ")
>>> words = 'women money work bridge beans wine champagne spam'.split()
>>> print joined_niceties(words, commas_and_even)
women, money, bridge and wine or even champagne

without any further complexity, all through the magic of general
polymorphism.  Polymorphism is what's _truly_ great about object
oriented programming (and can also be obtained via some other
paradigms, such as generic-programming, with different costs and
advantages in various programming languages) -- getting the RIGHT
behaviour for a given specific case, from GENERAL client-code that
just appropriately asks one or more objects to "do the right
thing for purpose X" (by calling the appropriate methods with
the right arguments).
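
For anyone trying this on a current interpreter, here is a minimal
sketch in Python 3 syntax (the session above is Python 2).  The
function joined_niceties is the one from the example; the class name
GuardedMixedJoiner and its handling of sequences shorter than two
items are my own assumptions, added only so the sketch does not trip
on empty or one-item input.

```python
# Python 3 sketch. GuardedMixedJoiner is a hypothetical name, not from the post.
GOOD_THINGS = ('money', 'bridge', 'champagne', 'wine', 'women', 'song')


def joined_niceties(things, joiner=', '):
    # Client code: it only assumes that joiner has a .join(sequence) method.
    nice_things = [thing for thing in things if thing in GOOD_THINGS]
    return joiner.join(nice_things)


class GuardedMixedJoiner:
    """Join all but the last item with joiner1, the last one with joiner2."""

    def __init__(self, joiner1, joiner2):
        self.joiner1 = joiner1
        self.joiner2 = joiner2

    def join(self, sequence):
        items = list(sequence)
        if len(items) < 2:
            # Assumption: with zero or one item there is nothing to join.
            return ''.join(items)
        head = self.joiner1.join(items[:-1])
        return self.joiner2.join((head, items[-1]))


words = 'programming money bridge spam champagne'.split()
plain = joined_niceties(words)
fancy = joined_niceties(words, GuardedMixedJoiner(', ', ' and '))
single = joined_niceties(['spam', 'wine'], GuardedMixedJoiner(', ', ' and '))
print(plain)   # money, bridge, champagne
print(fancy)   # money, bridge and champagne
print(single)  # wine
```

Note that joined_niceties is byte-for-byte the same whether it is
handed a plain string or the joiner object; only the caller changes.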


> >Say that I'm implementing a sequence-like object.  Would
> >having .join as a part of the set of methods I must write
> >be an _advantage_ to me?  As things stand now, to "be a
    [snip]
> I'm afraid we disagree here. Collections can all inherit from a parent
> collection class, a perfectly reasonable requirement. You then define
> the few methods that are required to make your own kind of collection.
> The parent class should define things like join (which is, of course, a
> very simple method).

So, there is NO advantage in having .join as a part of my
set of methods -- there IS the cost that I have to inherit
from some base-class, which may be 'perfectly reasonable'
(or not) but still IS a cost I have to pay, compared with
the present situation where no inheritance is REQUIRED.

For example, if I want to implement a sequence-like object
in a C-coded extension to Python (hardly an unreasonable
need!), then .join, and any other convenience methods that
may be required to "be a sequence", are PURE OVERHEAD.  It
makes my programming task harder; and, *where are the
benefits* corresponding to this cost?!  I have seen no
technical benefits claimed yet for making .join a method
of the sequence-like object -- in what use cases would the
resulting polymorphism be a functional _benefit_, to have
the sequence behave differently when it is to be joined
rather than (say) when it's to be written to a file-like
object (via the .writelines method of the file-like object)
or otherwise iterated upon?
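
To illustrate "the present situation", a rough sketch in Python 3
syntax: a sequence-like object that supplies only __len__ and
__getitem__ can already be joined and written out, with no join
method of its own and no base class.  The class Countdown is a
hypothetical example, and io.StringIO stands in for a file-like
object.

```python
import io


class Countdown:
    """Hypothetical sequence-like object: only __len__ and __getitem__."""

    def __init__(self, start):
        self.start = start

    def __len__(self):
        return self.start

    def __getitem__(self, index):
        # Raising IndexError past the end is what lets iteration stop.
        if not 0 <= index < self.start:
            raise IndexError(index)
        return str(self.start - index)


seq = Countdown(3)

# The joiner (a string here) does the joining; seq needs no .join method.
joined = ', '.join(seq)

# The same object is consumed by a file-like object in the same way.
buffer = io.StringIO()
buffer.writelines(seq)
written = buffer.getvalue()

print(joined)   # 3, 2, 1
print(written)  # 321
```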

There's also some issue here with 'Collection' (a more
general concept than 'Sequence' -- why should we constrain
ALL collections to be sequenceable?!), but let's let that
pass for the moment.

Mixin base-classes are a good way to add convenience
methods to my own objects through the design pattern
'template method' (as named by the Gang of Four -- no
relation to such things as C++ templates, etc).  We have no
disagreement on THAT -- *IF* a method (that is generally
suitable for implementation via 'template method') is
appropriate, then mixin inheritance is a good implementation
approach.
We DO have disagreement that .join is at all appropriate
as a method of sequences -- let alone more general
collections!!!
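
As a sketch of the uncontroversial part (mixin inheritance for
methods that DO belong on the sequence), in Python 3 syntax: the
mixin below builds count and index on top of the two primitives the
concrete class supplies.  The names SequenceConveniences and Squares
are hypothetical, chosen only for this illustration.

```python
class SequenceConveniences:
    """Hypothetical mixin: conveniences written once, in terms of the
    __len__ and __getitem__ primitives that each subclass provides."""

    def count(self, value):
        return sum(1 for i in range(len(self)) if self[i] == value)

    def index(self, value):
        for i in range(len(self)):
            if self[i] == value:
                return i
        raise ValueError(value)


class Squares(SequenceConveniences):
    """Concrete sequence: supplies only the two primitives."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i * i


squares = Squares(5)          # items are 0, 1, 4, 9, 16
print(squares.index(9))       # 3
print(squares.count(4))       # 1
```

The cost noted earlier still applies: Squares gets these methods only
by inheriting, which a C-coded sequence could not do as cheaply.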

Furthermore: PLEASE show us how you would implement
the 'very simple method' that you claim the base
class .join would be.  Be sure, at the very least, to
handle both joiners that are single-byte strings, and
ones that are unicode-strings, assuming of course they
have no 'join' methods themselves.  It is also very
important that your .join's performance characteristics
be *reasonable* -- O(N) in time and space, where N
is the length of the resulting string; O(N*N) would
be a major disaster, as I hope you'll agree (a design
that forces an O(N) algorithm to become O(N*N) IS an
intrinsic disaster -- no aesthetic justification can
make up for that...:-).
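
To show the two cost profiles side by side, a rough sketch in
Python 3 syntax.  It is NOT the implementation asked for above (it
ignores the plain-versus-unicode issue entirely); the function names
are hypothetical.  The first version concatenates as it goes, so in
the general case each step copies everything accumulated so far,
which is where O(N*N) comes from (some interpreters optimise
particular forms of this, but that cannot be relied upon).  The
second appends the pieces to a buffer and builds the result once.

```python
import io


def join_by_concatenation(joiner, items):
    # Naive: each + builds a new string from the whole accumulated result.
    result = ''
    first = True
    for item in items:
        if first:
            result = item
            first = False
        else:
            result = result + joiner + item
    return result


def join_by_buffer(joiner, items):
    # Each piece is appended to a buffer; the result is built once at the end.
    buffer = io.StringIO()
    first = True
    for item in items:
        if not first:
            buffer.write(joiner)
        buffer.write(item)
        first = False
    return buffer.getvalue()


pieces = ['money', 'bridge', 'champagne']
slow = join_by_concatenation(', ', pieces)
fast = join_by_buffer(', ', pieces)
print(slow == fast == ', '.join(pieces))  # True
```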


Alex





