[Tutor] Iterable Understanding
Stephen Nelson-Smith
sanelson at gmail.com
Fri Nov 13 18:58:30 CET 2009
I think I'm having a major understanding failure.
So, having discovered that my Unix sort breaks on the last day of the
month, I've gone ahead and implemented a per-log sort, using heapq.
I've tested it with various data, and it produces a sorted logfile per log.
So in essence this:
logs = [ LogFile( "/home/stephen/qa/ded1353/quick_log.gz", "04/Nov/2009" ),
         LogFile( "/home/stephen/qa/ded1408/quick_log.gz", "04/Nov/2009" ),
         LogFile( "/home/stephen/qa/ded1409/quick_log.gz", "04/Nov/2009" ) ]
Gives me a list of LogFiles - each of which has a getline() method,
which returns a tuple.
I thought I could merge iterables using Kent's recipe, or just with
heapq.merge().
But how do I get from a method that can produce a tuple to some
mergeable iterables?
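For reference, here is a minimal sketch of one way the gap might be bridged, written for Python 3. It assumes getline() can be made to return None once the log is exhausted, which the code further down does not do. FakeLog is a hypothetical stand-in for LogFile, and the short timestamps are made up for illustration. The two-argument form of iter() turns a "call me again" method into an iterable, which heapq.merge() can then consume lazily.

```python
import heapq

class FakeLog:
    """Hypothetical stand-in for LogFile; getline() returns (stamp, line)."""

    def __init__(self, records):
        self._records = iter(records)

    def getline(self):
        # Assumption: return None when there is nothing left to read.
        return next(self._records, None)

    def __iter__(self):
        # iter(callable, sentinel) calls getline() until it returns None.
        return iter(self.getline, None)

logs = [
    FakeLog([("10:00:01", "a1"), ("10:00:05", "a2")]),
    FakeLog([("10:00:02", "b1"), ("10:00:03", "b2")]),
]

# heapq.merge() pulls one item at a time from each already-sorted input.
merged = list(heapq.merge(*logs))
```

The tuples are compared element by element, so the merge orders on the stamp first. Each input has to be sorted already for the merged output to be sorted.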
for log in logs:
    l = log.getline()
    print l
This gives me three loglines. How do I get more, other than with a while True: loop?
Of course tuples are iterables, but that doesn't help, as I want to
sort on timestamp... so a list of tuples would be ok.... But how do I
construct that, bearing in mind I am trying not to use up too much
memory?
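On the memory point, a generator function is one possible way to produce (stamp, line) tuples without ever holding a whole list. The sketch below is for Python 3 and is only an illustration: stamped_lines and the sample log lines are hypothetical, and the regular expression is the one from the timestamp() method in the full code. It skips lines with no bracketed stamp, which is an assumption about how bad lines should be handled.

```python
import io
import re

STAMP_RE = re.compile(r'\[(.*?)\]')

def stamped_lines(fileobj):
    """Yield (stamp, line) tuples one at a time from an open file."""
    for line in fileobj:
        match = STAMP_RE.search(line)
        if match is None:
            continue  # assumption: silently skip lines without a stamp
        yield (match.group(1), line)

# Made-up sample data standing in for an opened log file.
sample = io.StringIO(
    'host - - [04/Nov/2009:00:00:01 +0000] "GET / HTTP/1.1"\n'
    'not a log line\n'
    'host - - [04/Nov/2009:00:00:02 +0000] "GET /x HTTP/1.1"\n'
)

records = stamped_lines(sample)
first = next(records)   # only one line has been read so far
rest = list(records)    # the remaining stamped lines
```

Nothing is read until a tuple is asked for, so only the current line is in memory at any moment. Note that the stamps here are compared as plain strings, which orders correctly within a single date but not across different days or months.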
I think there's a piece of the jigsaw I just don't get. Please help!
The code in full is here:
import gzip, heapq, re

class LogFile:
    def __init__(self, filename, date):
        self.logfile = gzip.open(filename, 'r')
        for logline in self.logfile:
            self.line = logline
            self.stamp = self.timestamp(self.line)
            if self.stamp.startswith(date):
                break
        self.initialise_heap()

    def timestamp(self, line):
        stamp = re.search(r'\[(.*?)\]', line).group(1)
        return stamp

    def initialise_heap(self):
        initlist=[]
        self.heap=[]
        for x in xrange(10):
            self.line=self.logfile.readline()
            self.stamp=self.timestamp(self.line)
            initlist.append((self.stamp,self.line))
        heapq.heapify(initlist)
        self.heap=initlist

    def getline(self):
        self.line=self.logfile.readline()
        stamp=self.timestamp(self.line)
        heapq.heappush(self.heap, (stamp, self.line))
        pop = heapq.heappop(self.heap)
        return pop

logs = [ LogFile( "/home/stephen/qa/ded1353/quick_log.gz", "04/Nov/2009" ),
         LogFile( "/home/stephen/qa/ded1408/quick_log.gz", "04/Nov/2009" ),
         LogFile( "/home/stephen/qa/ded1409/quick_log.gz", "04/Nov/2009" ) ]
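The push-one, pop-one pattern in getline() can also be written as a generator, which might be closer to what heapq.merge() wants. This is a sketch only, for Python 3, and window_sorted is a hypothetical name. It keeps the same window of 10 buffered records by default. Unlike getline() above, it stops cleanly when the input ends and then drains whatever is still in the heap.

```python
import heapq

def window_sorted(records, window=10):
    """Yield records in sorted order, buffering at most `window` of them.

    Only correct if no record is more than `window` places out of order.
    """
    heap = []
    for record in records:
        if len(heap) < window:
            # Still filling the initial buffer.
            heapq.heappush(heap, record)
        else:
            # Push the new record, then pop and yield the smallest.
            yield heapq.heappushpop(heap, record)
    # Input exhausted: drain the buffer in sorted order.
    while heap:
        yield heapq.heappop(heap)

result = list(window_sorted([3, 1, 2, 5, 4], window=2))
```

Because it is a generator, it could be fed the (stamp, line) tuples from one log and handed straight to heapq.merge() alongside the others.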