[Tutor] Introduction - log exercise

Antonio de la Fuente toni at muybien.org
Tue Nov 17 17:58:08 CET 2009


Hi everybody,

This is my first post here. I have started learning Python and I am new to
programming; I have only done a little bash scripting.
Thank you for the kind support and help that you provide on this list.

This is my problem: I've got a log file that is filling up very quickly. The
log file is made of blocks separated by a blank line, and some of these blocks
contain a line "foo". I want to discard the blocks that contain that line and
create a new log file without them, which will drastically reduce the size of
the log file.

The log file is gzipped, so I am going to use the gzip module, and I am going to
pass the log file as an argument, so the sys module is required as well.
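
A minimal sketch of the opening step, assuming Python 3 (where gzip.open
accepts the text mode 'rt'). The helper name open_log is hypothetical and only
used for illustration:

```python
import gzip


def open_log(path):
    """Open a gzipped log for reading as text (hypothetical helper)."""
    # 'rt' yields str lines; plain 'r' would yield bytes on Python 3.
    return gzip.open(path, "rt")
```

With the file name passed on the command line, this would be called as
open_log(sys.argv[1]) after an import sys.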

I will read lines from the file with a 'for' loop, and then check them for
'foo' matches with a 'while' loop. If a line matches, I (somehow) re-initialise
the list; if there is no match for foo, I append the line to the list. When I
get to a blank line (end of block), I write myList to an external file and
start again with the next line.

I am stuck on defining 'blank line', and I can't manage to get through the
while loop. I would really appreciate any hint here.
I don't expect the solution, as I think this is a great exercise to get my feet
wet with Python, but if anyone thinks that this is the wrong way of solving the
problem, please let me know.
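
For reference, a minimal sketch of one way to test for a blank line and for
the "foo" line. It assumes the lines come from iterating over a text file, so
each one still carries its trailing newline (and, in the sample log, a leading
tab); that is why a comparison such as line == 'foo' would never be true. The
function names are hypothetical:

```python
def is_blank(line):
    """True when the line holds nothing but whitespace."""
    # strip() removes the trailing newline plus any spaces or tabs.
    return line.strip() == ""


def is_marker(line, marker="foo"):
    """True when the line, ignoring surrounding whitespace, equals marker."""
    return line.strip() == marker
```

If "foo" is only part of a longer line, a substring test such as
marker in line would be the alternative.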


#!/usr/bin/python

import sys
import gzip

myList = []

# For now I am not bothering with the argument part, as I am testing with a
# test log file
#fileIn = gzip.open(sys.argv[1])

fileIn = gzip.open('big_log_file.gz', 'r')
fileOut = open('outputFile', 'a')

for line in fileIn:
    while line != 'blank_line':
        if line == 'foo':
            # Somehow re-initialise myList
            break
        else:
            myList.append(line)
    fileOut.writelines(myList)


# Somehow rename outputFile to big_log_file.gz

fileIn.close()
fileOut.close()
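
One possible way to structure the whole task, given only as a hedged sketch
and not as the definitive answer. It assumes Python 3, that the marker line is
exactly "foo" once whitespace is stripped, and that the filtered file should
stay gzipped. The names filter_blocks and filter_log and the ".tmp" suffix are
hypothetical. Each block is collected first and written only when the blank
line that ends it is reached, so a block containing the marker can be dropped
as a whole:

```python
import gzip
import os


def filter_blocks(lines, marker="foo"):
    """Yield the lines of every block that has no line equal to marker."""
    block = []     # lines of the block currently being read
    skip = False   # becomes True once the marker is seen in this block
    for line in lines:
        if line.strip() == "":
            # A blank line ends the block: emit it unless it is marked.
            if block and not skip:
                for kept in block:
                    yield kept
                yield "\n"
            block = []
            skip = False
        else:
            block.append(line)
            if line.strip() == marker:
                skip = True
    # The last block may not be followed by a blank line.
    if block and not skip:
        for kept in block:
            yield kept


def filter_log(path, marker="foo"):
    """Rewrite the gzipped log at path without the marked blocks."""
    tmp_path = path + ".tmp"  # assumed temporary name
    with gzip.open(path, "rt") as file_in, \
            gzip.open(tmp_path, "wt") as file_out:
        file_out.writelines(filter_blocks(file_in, marker))
    # Replace the original only after the new file is completely written.
    os.replace(tmp_path, path)
```

Calling filter_log('big_log_file.gz') would then shrink the test file in
place. Runs of several blank lines are collapsed to a single one by this
sketch.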

-------------------------------------------------------------

The log file is filled with blocks like these:


Tue Nov 17 16:11:47 GMT 2009
	bladi bladi bla
	tarila ri la
	patatin pataton
	tatati tatata

Tue Nov 17 16:12:58 GMT 2009
	bladi bladi bla
	tarila ri la
	patatin pataton
	foo
	tatati tatata

Tue Nov 17 16:13:42 GMT 2009
	bladi bladi bla
	tarila ri la
	patatin pataton
	tatati tatata


etc., etc., etc.
..............................................................

Again, thank you.

-- 
-----------------------------
Antonio de la Fuente Martínez
E-mail: toni at muybien.org
-----------------------------


