[Twisted-Python] deciding to use twisted or not
Hi! I would like to write a small daemon that monitors (tails) a server log, parses the entries and sends HTTP requests based on some of those entries. I would like it if the reading of the log file and the sending of HTTP requests were asynchronous. Should I use Twisted for this? Or is Twisted overkill...
Martin
On Wed, Aug 26, 2009 at 4:56 PM, Martin-Louis Bright <mlbright@gmail.com> wrote:
I would like to write a small daemon that monitors (tails) a server log, parses the entries and sends HTTP requests based on some of those entries. I would like it if the reading of the log file and the sending of http requests were asynchronous. Should I use twisted for this? Or is twisted overkill...
Twisted would be perfectly appropriate for this! The HTTP requests will certainly be non-blocking, assuming you use twisted.web.client (or the new, better thing which I hope will be released before we all get old).
In fact, if anything, Twisted is under-kill; you'll need to go a bit further. The one minor issue is that Twisted has no explicit way to do asynchronous file I/O (because most operating systems provide a bewildering array of not-really-working systems for non-blocking file I/O, so it would be challenging for Twisted to do in a way that was useful). There are a number of ways to emulate it though; for log files (which are fairly low volume, and almost by definition will immediately be in your filesystem cache as the parts you want to read become available) you can just do a LoopingCall() which does .read() on a file that it leaves open to retrieve the next chunk of data, and that will probably be good enough. In practice you won't actually block doing this read() because your kernel is just going to pull the bytes directly out of filesystem cache memory and hand them back to you. It is of course possible to have to wait for the disk or even the network, depending on your underlying filesystem.
I just filed http://twistedmatrix.com/trac/ticket/3983 for a more thorough solution (mostly in the hopes that somebody will close it as a duplicate and point to some pre-existing issue I couldn't find through search) so you can monitor future discussion there if you like.
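A minimal sketch of the LoopingCall-plus-read() approach described above, assuming Python 3 and a log file that is only ever appended to. The names `LogFollower`, `poll` and `run`, and the 0.5 second interval, are illustrative choices and not part of Twisted or of this thread. Twisted is imported only inside `run()` so that the polling logic can be tried on its own.

```python
import os


class LogFollower:
    """Keeps a file open and hands newly appended lines to a callback."""

    def __init__(self, path, on_line, from_end=True):
        self.on_line = on_line
        self._file = open(path, "rb")
        if from_end:
            # Skip whatever is already in the file, like `tail -n 0`.
            self._file.seek(0, os.SEEK_END)
        self._partial = b""

    def poll(self):
        # For a local file this read() normally returns at once with
        # whatever has been appended since the previous call.
        chunk = self._file.read()
        if not chunk:
            return
        lines = (self._partial + chunk).split(b"\n")
        # The last element is an incomplete line (or b""); keep it for later.
        self._partial = lines.pop()
        for line in lines:
            self.on_line(line)

    def close(self):
        self._file.close()


def run(path, interval=0.5):
    # Imported here so the class above does not require Twisted.
    from twisted.internet import reactor
    from twisted.internet.task import LoopingCall

    follower = LogFollower(path, on_line=print)
    LoopingCall(follower.poll).start(interval)
    reactor.run()
```

Calling `run()` with the path of the log would start the reactor and print each new line as bytes; a real daemon would pass its own parsing callback instead of `print`. The polling interval bounds how quickly a new entry is noticed, so it is the knob to turn if responsiveness matters.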
On Wed, Aug 26, 2009 at 6:41 PM, Glyph Lefkowitz <glyph@twistedmatrix.com> wrote:
Twisted would be perfectly appropriate for this! The HTTP requests will certainly be non-blocking, assuming you use twisted.web.client (or the new, better thing which I hope will be released before we all get old).
In fact, if anything, Twisted is under-kill; you'll need to go a bit further. The one minor issue is that Twisted has no explicit way to do asynchronous file I/O (because most operating systems provide a bewildering array of not-really-working systems for non-blocking file I/O, so it would be challenging for Twisted to do in a way that was useful). There are a number of ways to emulate it though; for log files (which are fairly low volume, and almost by definition will immediately be in your filesystem cache as the parts you want to read become available) you can just do a LoopingCall() which does .read() on a file that it leaves open to retrieve the next chunk of data, and that will probably be good enough. In practice you won't actually block doing this read() because your kernel is just going to pull the bytes directly out of filesystem cache memory and hand them back to you. It is of course possible to have to wait for the disk or even the network, depending on your underlying filesystem.
I just filed http://twistedmatrix.com/trac/ticket/3983 for a more thorough solution (mostly in the hopes that somebody will close it as a duplicate and point to some pre-existing issue I couldn't find through search) so you can monitor future discussion there if you like.
It would certainly be nice if Twisted supported async file I/O, but in this case wouldn't a ProcessProtocol around 'tail -f' be a good solution as well?
-Cary
_______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
-- 01100011 01100001 01110010 01111001
On Wed, Aug 26, 2009 at 9:54 PM, Cary Hull <cary.hull@gmail.com> wrote:
It would certainly be nice if Twisted supported async file io, but in this case wouldn't a ProcessProtocol around 'tail -f' be a good solution as well?
That could work, but there are a few potential issues. 'tail' does slightly different stuff on different platforms. Maybe you're on Windows and it isn't available. Maybe it mangles your output (I know that some coreutils tools try to be encoding-aware, I don't know if 'tail' is one). Maybe you want to get blocks of bytes off the end rather than lines, etc. Then you also need to worry about housekeeping for a subprocess, which always turns out to be a little trickier than you first expect.
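For comparison, a rough sketch of the ProcessProtocol-around-'tail -f' idea, assuming a POSIX system where `tail` is on the PATH and accepts `-n 0 -f`. `LineBuffer`, `spawn_tail` and `TailProtocol` are hypothetical names introduced here; only `ProcessProtocol`, its `outReceived`/`errReceived`/`processEnded` hooks and `reactor.spawnProcess` come from Twisted. Twisted is imported inside `spawn_tail()` so the line-splitting helper can be exercised without it.

```python
import os
import shutil


class LineBuffer:
    """Reassembles complete lines from arbitrary chunks of bytes."""

    def __init__(self, on_line):
        self.on_line = on_line
        self._partial = b""

    def feed(self, data):
        lines = (self._partial + data).split(b"\n")
        # Keep the trailing incomplete line until the rest of it arrives.
        self._partial = lines.pop()
        for line in lines:
            self.on_line(line)


def spawn_tail(path, on_line):
    # Imported here so LineBuffer above does not require Twisted.
    from twisted.internet import protocol, reactor

    class TailProtocol(protocol.ProcessProtocol):
        def __init__(self):
            self.buffer = LineBuffer(on_line)

        def outReceived(self, data):
            # stdout of tail arrives in chunks, not necessarily whole lines.
            self.buffer.feed(data)

        def errReceived(self, data):
            print("tail stderr:", data)

        def processEnded(self, reason):
            # Part of the subprocess housekeeping: decide here whether
            # to restart tail or shut the daemon down.
            print("tail exited:", reason.value)

    tail = shutil.which("tail")
    if tail is None:
        raise RuntimeError("no 'tail' executable found on this platform")
    proto = TailProtocol()
    reactor.spawnProcess(
        proto, tail, [tail, "-n", "0", "-f", path], env=dict(os.environ)
    )
    return proto
```

The explicit `shutil.which` check and the `processEnded` hook correspond to two of the concerns raised above: `tail` may not exist on the platform, and the child process needs looking after once it has been started.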
Thanks! Your advice is much appreciated.
martin
On Wed, Aug 26, 2009 at 10:08 PM, Glyph Lefkowitz <glyph@twistedmatrix.com> wrote:
That could work, but there are a few potential issues. 'tail' does slightly different stuff on different platforms. Maybe you're on Windows and it isn't available. Maybe it mangles your output (I know that some coreutils tools try to be encoding-aware, I don't know if 'tail' is one). Maybe you want to get blocks of bytes off the end rather than lines, etc. Then you also need to worry about housekeeping for a subprocess, which always turns out to be a little trickier than you first expect.
If you're using a Linux-based system, you may have some luck setting up syslogger to forward logging packets to a remote IP address, running the Twisted daemon on the other box, and sending notifications if the heartbeat from the monitored machine stops.
I was recently working on a project to set up a monitor for an RFID-based access control system for a building, which runs on twisted/openwrt/linux; we used an approach like this.
The link below shows a sample Twisted Python file that runs on a monitoring machine.
http://github.com/derfred/doord/blob/cd300a1cde930c07cd13d98be3e45cb89df7980...
I am using linux, and I want the daemon to be as responsive as possible to log events, so I think I would rather have it sit on the same box as where the log is produced. (Perhaps I'm wrong about this?) So I'm going to try Cary's ProcessProtocol approach, and if that doesn't work, Glyph's LoopingCall with a read() approach. Thanks for the link.
-martin
On Thu, Aug 27, 2009 at 8:11 AM, Chris Adams <chris@stemcel.co.uk> wrote:
If you're using a linux based system, you may have some luck setting up syslogger to forward logging packets to the remote ip address, and running the twisted daemon on the other box, and sending notifications if the heartbeat from the monitored machine stops.
I was recently working on a project to set up a monitor for an RFID-based access control system for a building, which runs on twisted/openwrt/linux; we used an approach like this.
The link below shows a sample twisted python file that runs on a monitoring machine.
http://github.com/derfred/doord/blob/cd300a1cde930c07cd13d98be3e45cb89df7980...
Martin-Louis Bright <mlbright <at> gmail.com> writes:
I am using linux, and I want the daemon to be as responsive as possible to log events, so I think I would rather have it sit on the same box as where the log is produced. (Perhaps I'm wrong about this?) So I'm going to try Cary's ProcessProtocol approach, and if that doesn't work, Glyph's LoopingCall with a read() approach.
You can also use pyinotify to watch your log file changes. http://trac.dbzteam.org/pyinotify
Regards, Mikhail
PyInotify only allows you to detect file changes, leaving you with the task of asynchronously sending http requests.
-martin
On Thu, Aug 27, 2009 at 12:19 PM, Mikhail <termim@gmail.com> wrote:
You can also use pyinotify to watch your log file changes. http://trac.dbzteam.org/pyinotify
Regards, Mikhail
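Whichever mechanism detects new log data, the remaining half of the job is deciding which entries matter and sending the HTTP requests without blocking. The sketch below is one way that could look, assuming `Agent` from twisted.web.client is available in the installed Twisted; the thread itself only names twisted.web.client. The `ERROR` pattern, the `NOTIFY_URL` endpoint and the function names are all hypothetical placeholders.

```python
import re
from urllib.parse import urlencode

# Hypothetical rule: only lines containing the word ERROR trigger a request.
ERROR_RE = re.compile(rb"\bERROR\b")
# Hypothetical endpoint; replace with the real service to notify.
NOTIFY_URL = "http://localhost:8080/notify"


def request_url_for(line):
    """Return the URL to request for this log line, or None to ignore it."""
    if not ERROR_RE.search(line):
        return None
    text = line.decode("utf-8", "replace")
    return NOTIFY_URL + "?" + urlencode({"entry": text})


def notify(agent, line):
    """Fire a non-blocking GET for a matching line; returns a Deferred or None."""
    url = request_url_for(line)
    if url is None:
        return None
    d = agent.request(b"GET", url.encode("ascii"))
    # Log failures instead of letting them go unhandled.
    d.addErrback(
        lambda failure: print("notification failed:", failure.getErrorMessage())
    )
    return d


def make_agent():
    # Imported here so the functions above can be tested without Twisted.
    from twisted.internet import reactor
    from twisted.web.client import Agent

    return Agent(reactor)
```

`notify` returns immediately; the request completes later inside the reactor loop, so a slow HTTP server does not hold up the reading of the log.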
I've handled this problem two ways: 1) for almost real-time, using Twisted and .read() on the file, as Glyph mentioned, and 2) using Splunk and its functionality to send data matching a search to a program that in turn does the HTTP notification. This runs at 5-minute search intervals.
As previous posters have mentioned, tail's behavior is inconsistent across platforms. If your OS platform never changes, then you could use tail -f as a process protocol. I originally started out using the process protocol and tail -f, but needed the software to work on 3 versions of Linux and OS X. The read() approach was ultimately the most cross-platform way I could come up with. Good luck.
-rob
On Tue, Sep 1, 2009 at 8:20 AM, Martin-Louis Bright <mlbright@gmail.com> wrote:
PyInotify only allows you to detect file changes, leaving you with the task of asynchronously sending http requests.
-martin
Hi!
You might want to look at my implementation of a tail -F: https://svn.sat.qc.ca/trac/miville/browser/trunk/py/miville/utils/tail.py
Let me know if you have something better than that.
a
2009/9/1 Rob Hoadley <hoadley@gmail.com>:
I've handled this problem 2 ways: 1) for almost realtime... using twisted and .read() file as glyph mentioned and 2) used splunk and its functionality to send search "matching" data to a program that in turn does http notification. This is at 5 min search intervals.
As previous posters have mentioned, tail's behavior is inconsistent on different platforms. If your OS platform never changes then you could use tail -f as a process protocol. I originally started doing my work using the process protocol and tail -f but needed the software to work on 3 versions of linux and os x. The read() way of doing it was ultimately the most cross platform way I could come up with. Good luck.
-rob
-- Alexandre Quessy http://alexandre.quessy.net http://www.puredata.info/Members/aalex
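What distinguishes `tail -F` from `tail -f` is that it follows the file by name, so it keeps working when the log is rotated or truncated. The following is a small sketch of that idea for the read()-based approach, assuming a POSIX filesystem where a rotated log gets a new inode. It is not the implementation linked above; `RotationAwareFollower` and `read_new` are hypothetical names.

```python
import os


class RotationAwareFollower:
    """Follows a log file by name, reopening it after rotation."""

    def __init__(self, path):
        self.path = path
        self._open(seek_end=True)

    def _open(self, seek_end):
        self._file = open(self.path, "rb")
        st = os.fstat(self._file.fileno())
        # Device and inode together identify the file we have open.
        self._identity = (st.st_dev, st.st_ino)
        if seek_end:
            self._file.seek(0, os.SEEK_END)

    def _check_rotation(self):
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            # Rotated away and not recreated yet: keep the old handle.
            return
        if (st.st_dev, st.st_ino) != self._identity:
            # A different file now lives at this path: switch to it
            # and read it from the beginning.
            self._file.close()
            self._open(seek_end=False)
        elif st.st_size < self._file.tell():
            # Same file, but shorter than our offset: truncated in place.
            self._file.seek(0)

    def read_new(self):
        # Drain the current handle first so nothing written just before
        # a rotation is lost, then pick up the replacement file if any.
        data = self._file.read()
        self._check_rotation()
        data += self._file.read()
        return data
```

A truncation followed by enough new output to exceed the old offset would go unnoticed by the size check, so this is a heuristic and not a guarantee.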
Brilliant! Thank you.
-martin
On Sun, Sep 6, 2009 at 10:46 AM, Alexandre Quessy <listes@sourcelibre.com> wrote:
Hi ! You might want to look at my implementation of a tail -F : https://svn.sat.qc.ca/trac/miville/browser/trunk/py/miville/utils/tail.py Let me know if you have better than that. a
participants (7)
- Alexandre Quessy
- Cary Hull
- Chris Adams
- Glyph Lefkowitz
- Martin-Louis Bright
- Mikhail
- Rob Hoadley