[Tutor] Logfile multiplexing
Stephen Nelson-Smith
sanelson at gmail.com
Tue Nov 10 14:25:55 CET 2009
Hi Kent,
> One error is that the initial line will be the same as the first
> response from getline(). So you should call getline() before trying to
> access a line. Also you may need to filter all lines - what if there
> is jitter at midnight, or the log rolls over before the end.
Well ultimately I definitely have to filter two logfiles per day, as
logs rotate at 0400. Or do you mean something else?
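
A rough sketch of what filtering two rotated files for one day could look like. The function name `lines_for_date` and its parameters are hypothetical, the timestamp is assumed to sit in the fourth and fifth whitespace-separated fields (as in the `timestamp` method quoted below), and the code is written for Python 3 rather than the 2.4 used in this thread.

```python
import gzip


def lines_for_date(filenames, date_prefix):
    """Yield (stamp, line) pairs from several gzip logs, keeping one day only."""
    for filename in filenames:
        # "rt" opens the gzip file in text mode, so each line is a str
        with gzip.open(filename, "rt") as logfile:
            for line in logfile:
                # assumed layout: timestamp is fields 3 and 4 (0-based)
                stamp = " ".join(line.split()[3:5])
                if stamp.startswith(date_prefix):
                    yield stamp, line
```

Every line of both files is checked, so entries that land on either side of the 0400 rotation are picked up wherever they sit.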
> More important, though, you are pretty much writing your own iterator
> without using the iterator protocol. I would write this as:
> class LogFile:
>     def __init__(self, filename, date):
>         self.logfile = gzip.open(filename, 'r')
>         self.date = date
>
>     def __iter__(self)
>         for logline in self.logfile:
>             stamp = self.timestamp(logline)
>             if stamp.startswith(date):
>                 yield (stamp, logline)
>
>     def timestamp(self, line):
>         return " ".join(self.line.split()[3:5])
Right - I think I understand that.
From here I get:
import gzip

class LogFile:
    def __init__(self, filename, date):
        self.logfile = gzip.open(filename, 'r')
        self.date = date

    def __iter__(self):
        for logline in self.logfile:
            stamp = self.timestamp(logline)
            if stamp.startswith(date):
                yield (stamp, logline)

    def timestamp(self, line):
        return " ".join(self.line.split()[3:5])

l = LogFile("/home/stephen/access_log-20091105.gz", "[04/Nov/2009")
I get:
Python 2.4.3 (#1, Jan 21 2009, 01:11:33)
[GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import kent
>>> kent.l
<kent.LogFile instance at 0x2afb05142bd8>
>>> dir(kent.l)
['__doc__', '__init__', '__iter__', '__module__', 'date', 'logfile',
'timestamp']
>>> for line in kent.l:
...     print line
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "kent.py", line 10, in __iter__
    stamp = self.timestamp(logline)
  File "kent.py", line 15, in timestamp
    return " ".join(self.line.split()[3:5])
AttributeError: LogFile instance has no attribute 'line'
>>> for stamp,line in kent.l:
...     print stamp,line
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "kent.py", line 10, in __iter__
    stamp = self.timestamp(logline)
  File "kent.py", line 15, in timestamp
    return " ".join(self.line.split()[3:5])
AttributeError: LogFile instance has no attribute 'line'
>>> for stamp,logline in kent.l:
...     print stamp,logline
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "kent.py", line 10, in __iter__
    stamp = self.timestamp(logline)
  File "kent.py", line 15, in timestamp
    return " ".join(self.line.split()[3:5])
AttributeError: LogFile instance has no attribute 'line'
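
All three tracebacks point at the same spot: `timestamp` reads `self.line`, which is never set, instead of its `line` argument. The loop variable names in the `for` statement make no difference. A possible corrected version is sketched below. It also swaps `date` for `self.date` in `__iter__`, since a bare `date` would most likely raise a NameError once the first error is out of the way. The `"rt"` mode is an assumption made so the sketch runs on Python 3; the original used `'r'` on Python 2.4.

```python
import gzip


class LogFile:
    def __init__(self, filename, date):
        # "rt" yields text lines on Python 3 (assumption; original used 'r')
        self.logfile = gzip.open(filename, "rt")
        self.date = date

    def __iter__(self):
        for logline in self.logfile:
            stamp = self.timestamp(logline)
            # self.date, not the bare name date
            if stamp.startswith(self.date):
                yield (stamp, logline)

    def timestamp(self, line):
        # use the argument; self.line was never assigned
        return " ".join(line.split()[3:5])
```

Because `__iter__` is a generator, `for stamp, logline in LogFile(...)` unpacks each yielded pair directly.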
> You are reading through the entire file on load because your timestamp
> check is failing. You are filtering out the whole file and returning
> just the last line. Check the dates you are supplying vs the actual
> data - they don't match.
Yes, I found that out in the end! Thanks!
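
One way to compare the supplied date against the data is to count which date prefixes actually occur in the file. This is only a sketch: the helper name `date_prefixes` is hypothetical, and it assumes the fourth whitespace-separated field looks like `[04/Nov/2009:04:02:10`.

```python
from collections import Counter


def date_prefixes(lines):
    """Count the '[dd/Mon/yyyy' part of the fourth field of each line."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) > 3:
            # '[04/Nov/2009:04:02:10' -> '[04/Nov/2009'
            counts[fields[3].split(":", 1)[0]] += 1
    return counts
```

If the string passed to `LogFile` is not among the keys of the result, the `startswith` check can never match and every line is filtered out.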