Wild-eyed thinking aloud: Python System Management Infrastructure

William Annis annis at biostat.wisc.edu
Fri Aug 3 14:25:43 EDT 2001


        I've been a Unix system administrator for about 7 years now,
longer if you count my student admin jobs.  With a few, brief
exceptions, I have always worked places with small computing budgets.
We can't afford sexy system management packages which can sometimes
cost in the US$100,000s.  I mean things like Tivoli, CA Unicenter,
Bull's OpenMaster, etc.

        About once a year I run into some situation where I *want*
something like one of these tools.  Like many academic sites, we
started small with a handful of Unix machines and a few dozen users.
Now we have 100s of users and about 70 Unix machines all doing
different jobs.  Now I regularly have to rewrite or at least try to
fix the tools that worked with 15 machines but now fall down: a mass
of shell scripts, C and perl glommed together with duct tape, caffeine
and cheap, grad-student labor.

        It may all just make me crazy.  Our old user management system
(perl and some C) had a unique and beautiful problem.  It generated
/etc/passwd and /etc/group from a single data file.  An extra space at
the end of a line could cause some confusion in the parsing, and what
you sometimes got was a recreated /etc/passwd with the names and UIDs
slightly out of alignment.  It was a wonder to behold, a nightmare to
fix.  That system is gone now.
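
        To illustrate the kind of failure described above, here is a
minimal sketch of a stricter parser.  The original data file format is
not given, so the "login:uid:gid" layout and every name below are
assumptions made up for the example.  The idea is simply to strip
stray whitespace and refuse any line with the wrong number of fields,
rather than quietly writing out a misaligned /etc/passwd.

```python
# Hypothetical record format: "login:uid:gid", one user per line.
EXPECTED_FIELDS = 3


def parse_user_line(line, lineno):
    """Return (login, uid, gid), or raise ValueError naming the bad line."""
    # Strip the whole line and each field, so trailing spaces are harmless.
    fields = [field.strip() for field in line.strip().split(":")]
    if len(fields) != EXPECTED_FIELDS:
        raise ValueError(
            f"line {lineno}: expected {EXPECTED_FIELDS} fields, got {len(fields)}"
        )
    login, uid, gid = fields
    if not login:
        raise ValueError(f"line {lineno}: empty login name")
    # int() raises ValueError itself if a UID or GID is not a number.
    return login, int(uid), int(gid)


def parse_user_file(text):
    """Parse every non-blank, non-comment line; stop at the first bad one."""
    users = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        users.append(parse_user_line(line, lineno))
    return users
```

Failing on the first malformed line is a deliberate choice here: a
generator that aborts leaves the old /etc/passwd in place, which is
easier to recover from than a freshly written but scrambled one.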

        As the years go by I keep reimplementing these things (if
we're lucky, we buy something) and I can't help but wonder if there
isn't some Better Way.  I keep writing these tools, but they can't
chat with each other.

        Recently I have found myself designing a database-driven
system to keep track of our machines, what they do, where they sit,
etc., and I keep thinking "William, you should write an
*infrastructure* for all this system junk you do."  I've been working
on a system monitoring tool for several years now, so I have some
experience in writing distributed systems and already have some
ideas.  My thoughts right now lean toward a central communications
daemon/dispatcher attached to a database, with various other daemons
devoted to specific tasks:

    * keep track of machine data (my current worry)
    * user data (including fiscal data if necessary)
    * system monitoring, event and problem notification
    * for the adventurous, things like printer and job control, etc...
      (this should be an infrastructure, and it should be easy
      to plug in whatever you can think up)
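
        The dispatcher-plus-plug-ins idea in the list above might be
sketched roughly as follows.  This is only an in-process illustration
under stated assumptions: the Dispatcher class, the "machines" service
name, and the MACHINES dictionary standing in for the database are all
hypothetical, and a real version would have separate daemons talking
over the network.

```python
class Dispatcher:
    """Routes a request to whichever handler registered for a service name."""

    def __init__(self):
        self._handlers = {}

    def register(self, service, handler):
        """Plug in a handler; refuse to silently replace an existing one."""
        if service in self._handlers:
            raise ValueError(f"service already registered: {service}")
        self._handlers[service] = handler

    def dispatch(self, service, request):
        """Call the handler for `service`, or raise LookupError if none exists."""
        try:
            handler = self._handlers[service]
        except KeyError:
            raise LookupError(f"no handler for service: {service}") from None
        return handler(request)


# Stand-in for the database: machine name -> record (made-up data).
MACHINES = {"host-a": {"role": "mail", "room": "B12"}}


def machine_info(request):
    """Hypothetical 'machine data' service: look up one machine by name."""
    return MACHINES.get(request["name"])


dispatcher = Dispatcher()
dispatcher.register("machines", machine_info)
```

With this shape, adding user data, monitoring, or printer control
means writing one more handler and registering it under a new name,
without touching the dispatcher itself.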

Of course there are various free tools that do these sorts of things
out there, but they don't exactly play well together, if at all.  I
have ideas on how communication should work, and though I have avoided
the XML kool-aid up to now, I'm willing to concede XML-RPC might be
useful here.
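
        For a sense of what XML-RPC would put on the wire, here is a
small sketch using the xmlrpc.client module from the current Python
standard library.  It only encodes and decodes messages and opens no
network connection.  The method name "machines.info" and the record
contents are assumptions invented for the example, not part of any
existing protocol.

```python
import xmlrpc.client

# Hypothetical method name and argument; nothing here touches the network.
request_xml = xmlrpc.client.dumps(("host-a",), methodname="machines.info")

# The receiving daemon decodes the call into its arguments and method name.
params, method = xmlrpc.client.loads(request_xml)

# It would then look up `method`, call it with `params`, and encode the
# result.  The reply below is made-up data standing in for a database row.
reply = {"role": "mail", "room": "B12"}
response_xml = xmlrpc.client.dumps((reply,), methodresponse=True)

# The caller decodes the response; a response carries no method name.
(decoded_reply,), _ = xmlrpc.client.loads(response_xml)
```

Because both sides see only method names and simple values (strings,
numbers, lists, dictionaries), a daemon written in another language
could answer the same call.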

        I'd rather not do this alone. So, if there are other Unix
admins, or NT admins, for that matter, who want to see their favorite
language used to develop something like this and are willing to do
some actual work, please contact me.  If there's no interest, I'll
just crawl back into my office and hush up. :)

-- 
William Annis - System Administrator - Biomedical Computing Group
annis at biostat.wisc.edu                       PGP ID:1024/FBF64031
Mi parolas Esperanton - La Internacian Lingvon  www.esperanto.org


