[FEEDBACK] Is this script efficient...is there a better way?
Steve Holden
sholden at holdenweb.com
Wed Sep 11 20:23:38 EDT 2002
"Bob X" <bobx at linuxmail.org> wrote ...
> I am a newbie at this and this is a script I have come up with. I am
> looking for pointers on any "newbie" gotchas or ways to do it better.
>
> I use this to parse a log file (30MB+) for keywords and write the lines
> that are found into another file.
>
> Any thoughts would be appreciated... :-)
>
> Bob
>
> #!/usr/local/bin/python -w
>
> # import the needed libs
> import sys, string
>
> # make sure the command line arguments are there
> if len(sys.argv) < 3:
>     print "usage: fread.py [log file] [hit file]"
>     sys.exit(1)
>
Unless the particular return code is important, you'll find
    sys.exit("usage: fread.py [log file] [hit file]")
more convenient.
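As a minimal sketch of that suggestion (check_args and USAGE are names I made up for illustration, they are not in your script): when sys.exit() is given a string, the string is printed to stderr and the exit status is 1, so the separate print goes away.

```python
import sys

USAGE = "usage: fread.py [log file] [hit file]"

def check_args(argv):
    # check_args is a hypothetical helper, used only for illustration.
    # A string passed to sys.exit() is printed to stderr on exit,
    # and the process exit status is 1.
    if len(argv) < 3:
        sys.exit(USAGE)
    return argv[1], argv[2]
```

In the script itself you would call something like check_args(sys.argv) before opening the files.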
> # open the files with some error checking
> try:
>     inFile = open(sys.argv[1],"r")
> except IOError:
>     print "Cannot open log file!\n"
>     sys.exit(1)
>
> try:
>     outFile = open(sys.argv[2],"w")
> except IOError:
>     print "Cannot open hits file!\n"
>     sys.exit(1)
>
Your error checking is exemplary. Normally I'm happy to have a Python
exception explain the error condition, but of course your explicit messages
are great.
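If you wanted both, one possible sketch is to keep your explicit message and append the text of the exception itself. The helper name open_or_exit and its label parameter are my own invention here, not anything from your script.

```python
import sys

def open_or_exit(path, mode, label):
    # open_or_exit and 'label' are hypothetical names for this sketch.
    try:
        return open(path, mode)
    except IOError as err:
        # Keep the friendly message and add Python's own explanation
        # (for example "No such file or directory").
        sys.exit("Cannot open %s: %s" % (label, err))
```

Usage would look like inFile = open_or_exit(sys.argv[1], "r", "log file").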
> # build list of keywords
> kw = [ "some", "words" ]
>
> # loop through the list and print the lines to a file
> for line in inFile.readlines():
>     for badword in kw:
>         if line.find(badword) > -1:
>             result = '%s %s' % (badword, line)
>             print result # Print the result
>             outFile.write(result) # Write the result
>
If you only want to know which lines match at least one of the words,
you could follow the write() with a "break", since there's no need to
check the remaining keywords once a line has matched.
Also, the "print" looks like a debugging statement.
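A sketch of the loop with the "break" added and the debugging print removed. The function name write_matching_lines is hypothetical; I have wrapped the loop in a function only so it can be tried on a small list of lines.

```python
def write_matching_lines(lines, keywords, out):
    # write_matching_lines is a hypothetical name for this sketch.
    # 'lines' is any iterable of strings; 'out' is anything with write().
    for line in lines:
        for word in keywords:
            if line.find(word) > -1:
                out.write('%s %s' % (word, line))
                break  # one hit is enough; skip the remaining keywords
```

With the break, each matching line is written once, prefixed by the first keyword that matched, rather than once per matching keyword.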
> # close the files
> inFile.close()
> outFile.close()
>
> # let me know when it's done
> print "Finished processing file..."
>
Highly commendable. Another approach would be to build a regular expression
from the keywords list and search for the r.e. in each line. The following
code is untested...
    import re

    kw = ["some", "words"]
    kwpat = "|".join(kw)       # builds the pattern "some|words"
    pat = re.compile(kwpat)
    for line in inFile.readlines():
        if pat.search(line):
            outFile.write(line)
for example.
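One thing to watch with this approach: if a keyword contains characters that are special in regular expressions (a "." for instance), it should be escaped first. A self-contained sketch, assuming the keywords are meant as literal text; filter_lines is a name I made up for it.

```python
import re

def filter_lines(lines, keywords, out):
    # filter_lines is a hypothetical name for this sketch.
    # re.escape() makes each keyword match literally, so a keyword such
    # as "a.b" does not treat "." as "any character".
    pat = re.compile("|".join([re.escape(k) for k in keywords]))
    for line in lines:
        if pat.search(line):
            out.write(line)
```

It would be called as filter_lines(inFile, kw, outFile), since a file object can be looped over line by line.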
regards
-----------------------------------------------------------------------
Steve Holden http://www.holdenweb.com/
Python Web Programming http://pydish.holdenweb.com/pwp/
Previous .sig file retired to www.homeforoldsigs.com
-----------------------------------------------------------------------