Wild-eyed thinking aloud: Python System Management Infrastructure

Sat Aug 4 14:38:00 EDT 2001

William Annis wrote:
> 
> Now I regularly have to rewrite or at least try to
> fix the tools that worked with 15 machines but now fall down: a mass
> of shell scripts, C and perl glommed together with duct tape, caffeine
> and cheap, grad-student labor.
[...]
>         Recently I have found myself designing a database-driven
> system to keep track of our machines, what they do, where they sit,
> etc., and I keep thinking "William, you should write an
> *infrastructure* for all this system junk you do."  
[...]
> Of course there are free various tools that do these sorts of things
> out there, but they don't exactly play well together, if at all.  I
> have ideas on how communication should work, and though I have avoided
> the XML kool-aid up to now, I'm willing to concede XML-RPC might be
> useful here.

XML is not kool-aid, it's an elixir: "good for what ails ye" :-)

You are describing a system in which, I believe, XML is well 
suited to play a role.  Its primary advantage is as a standard for
inter-application communication, which is a problem at the heart
of your situation.  
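
Since you mentioned XML-RPC, here is a minimal sketch of what the
wire format buys you, using the marshalling functions in Python's
standard xmlrpc.client module.  The method name "inventory.register"
and the host record are hypothetical, made up purely for illustration;
no server or network is involved.

```python
import xmlrpc.client

# Hypothetical host record and method name, for illustration only.
record = {"host": "node01", "room": "B12", "role": "fileserver"}

# Marshal a method call into XML text; any XML-RPC-capable tool,
# in any language, could read this.
request = xmlrpc.client.dumps((record,), methodname="inventory.register")

# The receiving side recovers the same values from the XML text.
params, method = xmlrpc.client.loads(request)

print(method)
print(params[0]["role"])
```

The point is only that both ends agree on the representation, so
neither side needs a hand-written parser for the other's format.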

Since the interfaces between components in a complex system have 
far more effect on its characteristics than do the individual 
components, start by addressing the design of the interfaces.  
This also allows you to partition the problem, which will be 
a necessary step in implementing something under your own 
constraints (cost, relatively unclear requirements, 
and not disrupting current operations).

I would start by making the decision to standardize on XML 
as a representation for *all* your various data sources and
sinks, whether that means storing the data in raw XML, or 
just providing converters to change to or from XML so you
can continue to work with your existing formats until the
process is "complete" (whatever that means).  This will 
initially feel more cumbersome than, say,
simple text files, but once you are able to dispose of
a dozen custom parsing routines, all bug-ridden whether you
know it or not, you will start to see benefits.
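
To make the converter idea concrete, here is a rough sketch, assuming
an existing colon-delimited text file with one machine per line.  The
field names (name, room, role) and the element names are my own
invention, not anything from your setup; the XML handling uses the
standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical column layout of the legacy text file.
FIELDS = ("name", "room", "role")


def text_to_xml(text):
    """Convert colon-delimited machine records to an XML string."""
    root = ET.Element("machines")
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        machine = ET.SubElement(root, "machine")
        for field, value in zip(FIELDS, line.split(":")):
            ET.SubElement(machine, field).text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_text(xml_text):
    """Convert the XML back to the legacy colon-delimited format."""
    root = ET.fromstring(xml_text)
    lines = []
    for machine in root.findall("machine"):
        values = [machine.findtext(field, default="") for field in FIELDS]
        lines.append(":".join(values))
    return "\n".join(lines)


legacy = "# name:room:role\nnode01:B12:fileserver\nnode02:B14:compute\n"
as_xml = text_to_xml(legacy)
print(as_xml)
print(xml_to_text(as_xml))
```

With a pair of functions like this per legacy format, old tools keep
reading their text files while new tools read only the XML.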

Once this step is accomplished, you could focus on specific
components or subsystems, one at a time.  Take the worst
problems and rewrite them as desired.  Use Python and its
XML support to smooth over existing rough areas.  Use it
to glue in (cheap) tools which can serve your purpose in
other areas, Python-based or not.  Migrate to using a web
interface for all your front-end needs, whether accepting
data input from you or users, or generating reports.  
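
As a small illustration of the reporting side, the sketch below turns
an XML machine list into an HTML table.  The document layout
(<machines> containing <machine> elements) and the function name are
assumptions of mine, and a real report would sit behind whatever web
server you already run.

```python
import html
import xml.etree.ElementTree as ET


def machines_report(xml_text):
    """Render each <machine> element as one row of an HTML table."""
    root = ET.fromstring(xml_text)
    rows = []
    for machine in root.findall("machine"):
        # One cell per child element, escaped so the data cannot
        # break the markup.
        cells = "".join(
            "<td>%s</td>" % html.escape(child.text or "")
            for child in machine
        )
        rows.append("<tr>%s</tr>" % cells)
    return "<table>\n%s\n</table>" % "\n".join(rows)


sample = (
    "<machines>"
    "<machine><name>node01</name><role>file &amp; print</role></machine>"
    "</machines>"
)
print(machines_report(sample))
```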

Over time you will find you gradually eliminate the last
traces of entire technologies in your system.  You will
eventually rewrite or replace the last Perl script.  
You will no longer have shell scripts that aren't in
Python.  You will notice that your support costs have 
dropped dramatically as you cut out the overhead of
these additional technologies.  

You will then realize that you actually have
enough capacity to think about _adding_
functionality for the first time in years.
You will start to clean up your backlog of "minor"
support requests, and also see that your todo list
has items dropping off it weekly as you realize
you needed to do them only as a side effect of the 
incredible complexity of your previous situation.
Life will be good and you'll remember you have
kids and need to eat regularly to survive.

This exercise in Utopythian thinking brought to 
you by the letters X, M and L. :-)

I guess my point is that I believe you could start
to solve many of your problems by the application
of a few basic principles:

 1. Partition the problem: in this case the problem
    is the whole system, and only by partitioning with
    a standard interlanguage (XML) can you break
    the problem down enough to begin effectively
    attacking individual pieces of it.  Until you
    do this, the problem remains too large to handle.

 2. Reduce overhead: in this case you probably are
    bogged down by having many times more technologies
    in use than you could get by with.  After
    partitioning, begin reducing the number of 
    redundant technologies.  You can also do this 
    by adding new technologies selectively, where
    they replace several older technologies or
    have reduced support costs.  Using the web
    as your sole interface (except for text editors
    for some internal work) is an example.
    (I'll never understand why aggressive 
    technology management is considered a Level 5
    practice in the CMM: it should be the first!)

 3. Short iterations (from XP): since you have an
    almost intractable problem and limited resources,
    you should continually set and achieve very small
    goals and constantly adapt your game plan.  The
    traditional approach
    of building another monolithic or all-encompassing
    infrastructure is so often doomed to fail that
    we should concede It Doesn't Work and try some 
    other techniques tuned for our persistent lack 
    of a clear idea of what we are really trying 
    to accomplish.

I'm rambling.  I'll leave now. :)

> --
> William Annis - System Administrator - Biomedical Computing Group
> annis at biostat.wisc.edu                       PGP ID:1024/FBF64031
> Mi parolas Esperanton - La Internacian Lingvon  www.esperanto.org

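Do vi ankorau komprenas la avantaghojn de interlingvoj, 
kiel XML, chu ne?  Ne necesas, ke mi konvinku vin. :)
(So you still see the advantages of interlanguages
such as XML, don't you?  I shouldn't need to convince you.)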

-- 
----------------------
Peter Hansen, P.Eng.
peter at engcorp.com