[Twisted-Python] Where to start: log reader/analysis
Dear Twisted Experts (... meant in a nice way :) )

I'm not sure where to start.

I need to write a small server that:

- reads lines in a log file as they are appended
- reads input from a socket as it becomes available
- does an analysis of both (like, what time was input received in the log, and the output received via the socket)
- outputs a summary report

Socket I/O is easy - but I'm not sure how to include file reading ... it's bound to be easy.

Any tips?

Thanks :)

Andrew
Hi Andrew,

I wrote a class that follows a file (e.g. a log file) and provides an iterator to walk through it. Don't know if it may be of any use for you (or others).

import os
import time


class FileFollower(object):
    """Iterate through a file while it is updated.

    >>> file = FileFollower("/tmp/testfile")
    >>> file.interval = 5
    >>> for line in file:
    ...     print(line)
    """
    interval = 1

    def __init__(self, filename, interval=None):
        self.filename = filename
        self.interval = interval or self.interval
        self.stat = None
        self.offset = 0
        self.buffer = ""  # unterminated tail of the last read
        self.lines = []
        self.running = True

    #
    # File following

    def follow(self):
        # Blocks, sleeping between polls, until a complete line is buffered
        # or self.running is set to False.
        while self.running and not self.lines:
            if self.hasChanged():
                data = self.readChange()
                if data:
                    self.dataReceived(data)
            if not self.lines:
                time.sleep(self.interval)

    def hasChanged(self):
        stat = os.stat(self.filename)
        if stat != self.stat:
            self.stat = stat
            return True
        return False

    def readChange(self):
        # Read everything appended since the previous call.
        f = open(self.filename)
        try:
            f.seek(self.offset)
            data = f.read()
            self.offset = f.tell()
        finally:
            f.close()
        return data

    #
    # Data buffering

    def dataReceived(self, data):
        # Text mode already normalizes line endings, so split on "\n".
        # The last element is an incomplete line (or ""); keep it for later.
        lines = (self.buffer + data).split("\n")
        self.buffer = lines.pop()
        for line in lines:
            self.lineReceived(line)

    def lineReceived(self, line):
        self.lines.append(line)

    #
    # Iterator implementation

    def __iter__(self):
        return self

    def next(self):
        if not self.lines:
            self.follow()
        if not self.lines:
            # follow() only returns empty-handed once self.running is False.
            raise StopIteration
        return self.lines.pop(0)

    __next__ = next  # Python 3 spelling of the iterator protocol

2007/8/5, Andrew E <andrew@ellerton.net>:
_______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
On Mon, 6 Aug 2007 10:57:19 +0200, Yoann Aubineau <yoann.aubineau@wengo.com> wrote:
Hi Andrew,
I wrote a class that follows a file (eg. log file) and provides an iterator to walk through it. Don't know if it may be of any use for you (or others).
Hi Yoann, thanks for sharing.
In order to make this class more usable within a Twisted application, I'd make a few suggestions:

Separate the transport from the protocol. All of the methods in the area commented "file following" are basically transport methods: they know how to get the underlying bytes (by polling and eventually reading). The protocol implementation is basically the dataReceived and lineReceived methods. With separation between the transport and the protocol, you don't even need to implement these, since you can just use LineReceiver from twisted.protocols.basic.

Do the polling in a cooperative way. Using an infinite while loop and a time.sleep call has the consequence of tying up an entire thread. This means nothing else can happen unless you run the follow method of this class in a new, dedicated thread. If you use the reactor to schedule the checks instead, then this can be used alongside other Twisted code without having to deal with threading. twisted.internet.task.LoopingCall might be of particular interest.

Jean-Paul
Like this?

from twisted.internet import task, reactor
from twisted.protocols import basic
import os


class FileFollowerTransport(object):
    """Poll a file from the reactor and hand appended data to a protocol."""
    interval = 1
    disconnecting = False  # LineReceiver looks at this attribute on its transport

    def __init__(self, filename, interval=None):
        self.filename = filename
        self.interval = interval or self.interval
        self.stat = None
        self.offset = 0
        self.protocol = None
        self.lc = task.LoopingCall(self.follow)

    #
    # File following

    def run(self):
        # LoopingCall calls follow() once immediately, then every interval.
        self.lc.start(self.interval)

    def follow(self):
        if self.hasChanged():
            data = self.readChange()
            if data:
                self.protocol.dataReceived(data)

    def hasChanged(self):
        stat = os.stat(self.filename)
        if stat != self.stat:
            self.stat = stat
            return True
        return False

    def readChange(self):
        # Binary mode, since Twisted protocols work on bytes.
        f = open(self.filename, "rb")
        try:
            f.seek(self.offset)
            data = f.read()
            self.offset = f.tell()
        finally:
            f.close()
        return data


class stuby(basic.LineReceiver):
    delimiter = b"\n"  # log lines end in "\n", not LineReceiver's default "\r\n"

    # Override lineReceived, not dataReceived, so LineReceiver does the splitting.
    def lineReceived(self, line):
        print(line)


if __name__ == '__main__':
    l = stuby()
    f = FileFollowerTransport('test')
    f.protocol = l
    l.makeConnection(f)  # connect before the first poll delivers data
    f.run()
    reactor.run()

On 8/6/07, Jean-Paul Calderone <exarkun@divmod.com> wrote:
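For anyone who wants to try the polling half of this thread without a running reactor, here is a minimal standard-library sketch. The names FileTail, poll and demo are made up for illustration and do not come from Twisted or from the messages above. The idea is that poll() never sleeps, so something like the LoopingCall mentioned earlier could call it on a schedule and pass the result on to a protocol. It assumes the file is only ever appended to; truncation and log rotation are not handled.

```python
import os
import tempfile


class FileTail(object):
    """Hypothetical helper: return lines appended since the previous poll."""

    def __init__(self, filename):
        self.filename = filename
        self.offset = 0     # number of bytes already consumed
        self.partial = b""  # trailing fragment that has no newline yet

    def poll(self):
        """Read new data once and return the complete lines; never sleeps."""
        with open(self.filename, "rb") as f:
            f.seek(self.offset)
            data = f.read()
            self.offset = f.tell()
        if not data:
            return []
        chunks = (self.partial + data).split(b"\n")
        # The last chunk is b"" if data ended with a newline, otherwise it is
        # an incomplete line that must wait for the next poll.
        self.partial = chunks.pop()
        return chunks


def demo():
    """Append to a temporary file in two steps and poll after each one."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        tail = FileTail(path)
        seen = []
        with open(path, "ab") as log:
            log.write(b"first line\nsecond li")  # second line is incomplete
        seen.append(tail.poll())
        with open(path, "ab") as log:
            log.write(b"ne\n")                   # now it is complete
        seen.append(tail.poll())
        seen.append(tail.poll())                 # nothing new was written
        return seen
    finally:
        os.remove(path)


if __name__ == "__main__":
    for lines in demo():
        print(lines)
```

Keeping the unterminated fragment in self.partial matters because a writer may flush half a line between two polls. A LineReceiver does the same buffering on the protocol side, so when the data goes straight to protocol.dataReceived this extra bookkeeping is not needed.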
participants (4)
- Andrew E
- Jean-Paul Calderone
- Nathaniel Haggard
- Yoann Aubineau