[Tutor] RE: [Tutor] Re: One other thing·... (and more)

Tim Peters tutor@python.org
Sun, 13 May 2001 04:18:39 -0400


[Stephen L Arnold]
> I'm just starting with Python, so I'd also like to know how this
> works too.  Sorry it's kinda long, but I inserted some example code
> from a binary search tree.  I'd like to know (in general) how this
> kind of thing works in Python.  The syntax of the code below is
> very readable (though a bit more verbose than Python).  Comments
> are preceded by two dashes.
>
> Whether you want to think in terms of objects and methods or
> generic packages and generic formal parameters (it least that's the
> way I understand it) there must be some way to define the set of
> valid operations, arguments, etc.  In Ada, this is defined in the
> package specification.  You don't need (or want) to know how the
> code in the package body is implemented, unless of course, you're
> the one doing the implementing (it just needs to conform to the
> spec).

In Python this is a matter of convention and documentation:  there are no
formal means for defining interfaces or protocols.  For example, if a name in
a module begins with an underscore, by convention it's private to the module
and you should leave it alone -- unless you want to live dangerously, in
which case Python isn't going to stop you.  Guido likens Python's notion of
protection to bicycle locks in Amsterdam:  they're advisory <wink>.

>    Overflow     :  exception;  -- Raised when tree space runs out.

Python always raises MemoryError when it runs out of memory, so nobody ever
documents that.  OverflowError is reserved for numeric operations whose
results "don't fit" in the target numeric type.

>    Key_Error    :  exception;  -- Raised for bogus key operations.
>    State_Error  :  exception;  -- Raised for more than one concurrent
>                                -- traversal, or state change during
>                                -- a traversal.

You can define all the exception classes you like.  Note that here you can
only explain them via comments; in Python you can also give classes
"docstrings" (ditto for functions and methods), and introspective tools can
obtain docstrings at runtime.

Without explanation, I'll just present a Python BST class; say this is in
file BST.py:

# BSTs use _Nodes under the covers.
class _Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

class BST:
    """Binary Search Tree."""

    def __init__(self):
        self.root = _Node(None)

    # private helper
    # Search for value.
    # If found, return (1, p) where p.value == value.
    # Else return (0, p), where p is the last non-null node examined,
    # i.e. the node under which value should be inserted.
    def __find(self, value):
        p = self.root
        while p is not None: a
            parent = p
            c = cmp(value, p.value)
            if c < 0:
                p = p.left
            elif c > 0:
                p = p.right
            else:
                return 1, p
        return 0, parent

    def contains(self, value):
        "Return true if value in the tree, else false."
        found, p = self.__find(value)
        return found

    def insert(self, value):
        """Insert value into the tree.

        Raise KeyError if value is already in the tree.
        """

        found, p = self.__find(value)
        if found:
            raise KeyError("value %r aleady in tree" % value)
        new = _Node(value)
        if value < p.value:
            p.left = new
        else:
            p.right = new

    def traverse(self, visitor):
        """Do an in-order traversal of the tree.

        visitor is a one-argument function, called with each value
        as it's reached.
        """

        self.__inorder(self.root.left, visitor)
        self.__inorder(self.root.right, visitor)

    def __inorder(self, p, visitor):
        if p is not None:
            self.__inorder(p.left, visitor)
            visitor(p.value)
            self.__inorder(p.right, visitor)

That's a complete implementation, covering membership testing, insertion, and
inorder traversal using a visitor pattern.  Now let's say this is file
test.py:

import random
from BST import BST

t = BST()
for i in range(50):
    element = random.randrange(100)
    try:
        t.insert(element)
    except KeyError, msg:
        print "Oops!", msg

def printone(value):
    print value,
t.traverse(printone)
print

Running that will produce different results each time due to the use of the
"random" module; here's what a typical run prints:

Oops! value 24 aleady in tree
Oops! value 15 aleady in tree
Oops! value 91 aleady in tree
Oops! value 34 aleady in tree
Oops! value 30 aleady in tree
Oops! value 48 aleady in tree
Oops! value 25 aleady in tree
Oops! value 75 aleady in tree
1 5 6 9 15 16 17 21 22 24 25 27 29 30 32 34 36 41 42 43 47 48
 49 52 55 56 57 61 62 65 72 73 74 75 79 82 83 86 89 91 92 99

> I hope that's enough to give a fairly clear picture of what's going
> on (and isn't too confusing).

Not at all!  Ada is quite readable indeed.

> ...
> How does this kind of thing work in Python?  I guess you would
> define a binary_search_tree class, with methods like insert,
> delete, etc,

Bingo.

> exceptions,

Yes.  Here's an exception class:

class YourException(Exception):
    pass

The "(Execption)" business says YourException is a subclass of the builtin
Exception class, so inherits all the latter's behaviors.  "pass" is a
do-nothing placeholder, and in this context just means I don't want to
modify, add to, or override any of the normal Exception behavior.

> what parameters can be passed to the methods,

That's usually left to docstrings.  Various tools can be used to
auto-generate class docs from those (for example, pydoc in 2.1 does a nice
job of generating HTML docs from a module's docstrings).

> any overloaded operators, etc.  You would also have to specify
> (perhaps when you instantiate a tree object) what types are
> contained in the tree.

This is going to be hard for you to get used to:  Python couldn't care less.
I happened to build a BST with integer values above, but exactly the same
code works fine for BSTs with string values, or floating-point values, or
tuples, or arbitrary mixtures of floats, ints and strings, plus anything else
whatsoever Python knows how to compare; e.g., as an extreme example, the code
above works fine for a BST with function or class values (although
comparisons of those kinds of values return more-than-less meaningless
results!).

Types aren't very important in Python.  Rather than ask what type an object
is, you're far more often interested in "which methods it responds to".  For
example, many places in the standard Python library are documented as
accepting "a file" argument.  But, in practice, you can usually pass them any
object whatsoever that has a method named "write" or "read".  Until you get
comfortable enough with that to delight in exploiting it, you may be using
Python syntax but you'll still be coding in Ada <wink>.  You can, if you
like, document that values passed to a BST must enjoy a consistent total
ordering -- but everyone knows that already anyway <wink>.

> As a user of the tree class library, how do I find out what
> methods/parameters are valid, what the exception names are, etc?

Read the docs and the docstrings.  Even without a fancy doc auto-generation
tool, you do this "by hand" from an interactive command prompt:

>>> from BST import BST
>>> dir(BST)
['_BST__find', '_BST__inorder', '__doc__', '__init__', '__module__',
'contains', 'insert', 'traverse']
>>> print BST.contains.__doc__
Return true if value in the tree, else false.
>>>

Now I spent no more than 10 minutes writing the whole BST module, and I
certainly could have done a better job on the docs.  But because Python was
*designed* to be introspective, you can still find out a lot just by using
the builtin dir() function and trying to print docstrings for the things
dir() reveals.  The doc generation tools (like pydoc) are much more
convenient when modules get large and complicated, though.

brevity-is-a-virtue-ly y'rs  - tim