-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,
I'm currently working on a PoC with Twisted (Python) to prove the technology as an alternative to more established enterprise choices (Java app servers, etc.).
The question is: if I have N processes running on M machines, given that there are no network restrictions and that at least HTTP and HTTPS are always available, how would these services be efficiently monitored?
I've been planning to blog about this (done [1]). What I do is have the service listen for HTTP connections on a URL that returns some monitoring data, something like the class below. Then I use a custom Nagios plugin to request this URL, and Nagios-PNP to make graphs of whatever datapoints the monitor produces.

The plugin setup is done by providing a URL to monitor (say http://myservice:893/monitor). The plugin will then both monitor the application and graph it.

I have also added an extra field for "errors" that I use to report odd exceptions or other types of failures that are non-fatal but should be investigated. If the error count exceeds a set level, they may also go into the reporting.

[1] http://www.kraken.no/blog/?q=node/9

Regards,
Tarjei

# Imports assumed here; the original post did not show them.
from StringIO import StringIO
import xml.etree.ElementTree as ET

from nevow import inevow, rend


class Monitor(rend.Page):
    """
    Basic monitoring interface.
    todo: make it even more general.

    Format:
    <itemname value="int" critical="int" warning="int" unit="UOM"
              label="itemname" min="0" max="300" />

    UOM (unit of measurement) is one of:
      no unit specified - assume a number (int or float) of things
                          (eg, users, processes, load averages)
      s - seconds (also us, ms)
      % - percentage
      B - bytes (also KB, MB, TB)
      c - a continuous counter (such as bytes transmitted on an interface)
    """

    def __init__(self, config):
        self.isLeaf = True
        self.config = config

    def renderHTTP(self, ctx):
        inevow.IRequest(ctx).setHeader('Content-Type', 'text/xml; charset=UTF-8')
        #que_length = str(len(getter.get_ids(0, -1)))
        #num_updated = str(get_nr_updated_last_day(self.config.get_db()))
        num_errors = str(len(self.config.getErrors()))
        _root = ET.Element('status', {'service': 'PDFIndexer'})
        if 'total' in self.config.stats:
            _doc = ET.SubElement(_root, 'total', {'value': str(self.config.stats['total'])})
        _doc = ET.SubElement(_root, 'runnerStatus', {'value': str(self.config.checksStatus())})
        #_doc = ET.SubElement(_root, 'itemsAddedlast24', {'value': num_updated})
        _errors = ET.SubElement(_root, 'errors', {'value': num_errors, 'critical': "3", 'warning': "2"})
        for error in self.config.getErrors():
            # Each error is attached as a 'text' attribute of an <error> element.
            t = ET.SubElement(_errors, 'error', text=error)
        _xmlcontainer = StringIO()
        ET.ElementTree(_root).write(_xmlcontainer, encoding="UTF-8")
        return _xmlcontainer.getvalue()
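
For the Nagios side, the check could look roughly like the following. This is only a minimal sketch, not the actual plugin from the blog post: the names evaluate() and main() are made up for illustration, it is written for Python 3 rather than the Python 2 of the Monitor class, and it assumes that warning/critical are plain upper bounds that are exceeded when the value is strictly greater. The exit codes and the "label=value[UOM];warn;crit;min;max" perfdata layout are the usual Nagios plugin conventions.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(xml_text):
    """Turn a <status> document into (exit_code, plugin_output_line)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return UNKNOWN, "UNKNOWN - unparsable status document: %s" % exc

    state = OK
    perfdata = []
    for item in root:
        raw = item.get("value")
        try:
            value = float(raw)
        except (TypeError, ValueError):
            continue  # non-numeric items (e.g. a status string) are skipped
        warn, crit = item.get("warning"), item.get("critical")
        # Assumption: a threshold is exceeded when value > threshold.
        if crit is not None and value > float(crit):
            state = max(state, CRITICAL)
        elif warn is not None and value > float(warn):
            state = max(state, WARNING)
        # Perfdata entry; trailing empty fields are dropped.
        entry = "%s=%s%s;%s;%s;%s;%s" % (
            item.get("label", item.tag), raw, item.get("unit", ""),
            warn or "", crit or "", item.get("min", ""), item.get("max", ""))
        perfdata.append(entry.rstrip(";"))

    service = root.get("service", "service")
    return state, "%s - %s | %s" % (STATE_NAMES[state], service, " ".join(perfdata))


def main(url):
    """Fetch the monitor URL, print the Nagios line, return the exit code."""
    try:
        xml_text = urlopen(url, timeout=10).read()
    except OSError as exc:  # URLError and HTTPError are OSError subclasses
        print("CRITICAL - cannot fetch %s: %s" % (url, exc))
        return CRITICAL
    code, line = evaluate(xml_text)
    print(line)
    return code
```

A wrapper script would then call sys.exit(main("http://myservice:893/monitor")), so that Nagios sees the state as the exit code and PNP picks up everything after the "|" for graphing.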
Is there a Twisted application/plugin/framework to do so? Do you just create 'polling' requests that an existing monitoring tool (e.g. Nagios) parses and interprets? How would you build a "console" to manage service status? In a nutshell, what I'm trying to find out is whether there is already a "container-like" Twisted-based application that can be used to manage/monitor Twisted applications, provided that they adhere to a given interface.
Thanks for your help
Micc
_______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFJom29YVRKCnSvzfIRAo0XAKCCOhwp1qIi0Qx2Dae0hn5gHkt5rwCfc8fn XYXuTUrB2c/eR8ex870rXAc= =pqNc -----END PGP SIGNATURE-----