[Tutor] Introduction - log exercise
Antonio de la Fuente
toni at muybien.org
Tue Nov 17 23:38:26 CET 2009
* bob gailer <bgailer at gmail.com> [2009-11-17 15:26:20 -0500]:
> Date: Tue, 17 Nov 2009 15:26:20 -0500
> From: bob gailer <bgailer at gmail.com>
> To: Antonio de la Fuente <toni at muybien.org>
> CC: Python Tutor mailing list <tutor at python.org>
> Subject: Re: [Tutor] Introduction - log exercise
> User-Agent: Thunderbird (Windows/20090812)
> Message-ID: <4B0306EC.8000105 at gmail.com>
> Antonio de la Fuente wrote:
> >Hi everybody,
> >
> >This is my first post here. I have started learning Python and I am new to
> >programming, with just some bash scripting, not much. Thank you for the
> >kind support and help that you provide in this list.
> >
> >This is my problem: I've got a log file that is filling up very quickly. The
> >log file is made of blocks separated by a blank line, and some of these blocks
> >contain a line "foo". I want to discard the blocks with that line in them and
> >create a new log file without those blocks, which will drastically reduce the
> >size of the log file.
> >
> >The log file is gzipped, so I am going to use the gzip module, and I am going to
> >pass the log file as an argument, so the sys module is required as well.
> >
> >I will read lines from the file with a 'for loop', and then I will check them for
> >'foo' matches with a 'while loop'. If a line matches, I (somehow) re-initialise
> >the list, and if there is no match for foo, I append the line to the list. When I
> >get to a blank line (end of block), I write myList to an external file and start
> >with another line.
> >
> >I am stuck with defining 'blank line', and I don't manage to get through the
> >while loop. I would really appreciate any hint here.
> >I don't expect the solution, as I think this is a great exercise to get my feet
> >wet with Python, but if anyone thinks that this is the wrong way of solving the
> >problem, please let me know.
> >
> >
> >#!/usr/bin/python
> >
> >import sys
> >import gzip
> >
> >myList = []
> >
> ># At the moment not bother with argument part as I am testing it with a
> ># testing log file
> >#fileIn = gzip.open(sys.argv[1])
> >
> >fileIn = gzip.open('big_log_file.gz', 'r')
> >fileOut = open('outputFile', 'a')
> >
> >for line in fileIn:
> >    while line != 'blank_line':
> >        if line == 'foo':
> >            Somehow re-initialise myList
> >            break
> >        else:
> >            myList.append(line)
> >    fileOut.writelines(myList)
> Observations:
> 0 - The other responses did not understand your desire to drop any
> paragraph containing 'foo'.
Yes, paragraph == block, that's it
> 1 - The while loop will run forever, as it keeps processing the same line.
Because of the tabs in the line with foo?!
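
The tabs are a separate issue (observation 2). The loop in observation 1 never ends because nothing inside the `while` body assigns a new value to `line`, so its condition stays true. A minimal sketch, assuming Python 3 and a hypothetical in-memory list in place of the log file, of testing each line once with `if` while the `for` loop does the advancing:

```python
# Hypothetical sample lines standing in for the real log file.
lines = ["Tue Nov 17 16:11:47 GMT 2009\n", "\tfoo\n", "\n"]

def classify(lines):
    """Label each line once; the for loop itself moves on to the next line."""
    labels = []
    for line in lines:
        # A `while line != ...` here would never exit, because the loop
        # body does not change `line`. An `if` tests the line a single time.
        if line.isspace():
            labels.append("blank")
        else:
            labels.append("text")
    return labels

print(classify(lines))
```
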
> 2 - In your sample log file the line with 'foo' starts with a tab.
> line == 'foo' will always be false.
So I need first to get rid of those tabs, right? I can do that with
line.strip(), but then I need the same formatting for the fileOut.
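
One way to keep the formatting, sketched here under the assumption of Python 3: `str.strip()` returns a new string and leaves the original untouched, so the stripped copy can be used only for the comparison while the original line is what gets written out. The helper name `has_marker` is hypothetical.

```python
def has_marker(line, marker="foo"):
    """Return True if the line, ignoring surrounding whitespace, equals marker."""
    return line.strip() == marker

original = "\tfoo\n"
print(has_marker(original))  # the comparison uses a stripped copy
print(repr(original))        # the original still has its tab and newline
```
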
> 3 - Is the first line in the file Tue Nov 17 16:11:47 GMT 2009 or blank?
First line is Tue Nov 17 16:11:47 GMT 2009
> 4 - Is the last line blank?
Last line is blank.
> Better logic:
I would never have thought of solving the problem this way. Interesting.
> # open files
> paragraph = []
> keep = True
> for line in fileIn:
>     if line.isspace(): # end of paragraph
Aha! finding the blank line
>         if keep:
>             outFile.writelines(paragraph)
>         paragraph = []
This is what I called re-initialising the list.
>         keep = True
>     else:
>         if keep:
>             if line == '\tfoo':
>                 keep = False
>             else:
>                 paragraph.append(line)
> # anticipating last line not blank, write last paragraph
> if keep:
>     outFile.writelines(paragraph)
> # use shutil to rename
Thank you.
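
For reference, the logic above can be put together as a runnable sketch. It makes several assumptions: Python 3, the gzip file opened in text mode, and the hypothetical helper names `drop_paragraphs` and `filter_log`. It departs from the quoted code in two ways: it compares the stripped line, because lines read from a file keep their trailing newline and so would never equal '\tfoo' exactly, and it writes the blank separator line after each kept paragraph so the output keeps the block layout.

```python
import gzip

def drop_paragraphs(lines, marker="foo"):
    """Yield the lines of each blank-line-separated paragraph lacking marker."""
    paragraph = []
    keep = True
    for line in lines:
        if line.isspace():           # blank line: end of paragraph
            if keep:
                yield from paragraph
                yield line           # keep the separator as well
            paragraph = []           # re-initialise for the next paragraph
            keep = True
        elif keep:
            if line.strip() == marker:
                keep = False         # discard the rest of this paragraph
            else:
                paragraph.append(line)
    if keep:                         # last paragraph, if no trailing blank line
        yield from paragraph

def filter_log(src_path, dst_path, marker="foo"):
    """Read a gzipped log and write the filtered paragraphs as plain text."""
    with gzip.open(src_path, "rt") as src, open(dst_path, "w") as dst:
        dst.writelines(drop_paragraphs(src, marker))

# Hypothetical sample: the first paragraph contains the marker line.
sample = ["a\n", "\tfoo\n", "b\n", "\n", "c\n", "d\n", "\n"]
print(list(drop_paragraphs(sample)))  # ['c\n', 'd\n', '\n']
```

With the file names from the original script, a call would look like `filter_log('big_log_file.gz', 'outputFile')`.
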
> --
> Bob Gailer
> Chapel Hill NC
> 919-636-4239
Antonio de la Fuente Martínez
E-mail: toni at muybien.org
The problem with people who have no vices is that generally you can
be pretty sure they're going to have some pretty annoying virtues.
-- Elizabeth Taylor