[FEEDBACK] Is this script efficient...is there a better way?

Steve Holden sholden at holdenweb.com
Wed Sep 11 20:23:38 EDT 2002


"Bob X" <bobx at linuxmail.org> wrote ...
> I am a newbie at this and this is a script I have come up with. I am
> looking for pointers on any "newbie" gotchas or ways to do it better.
>
> I use this to parse a log file (30MB+) for keywords and write the lines
> that are found into another file.
>
> Any thoughts would be appreciated...  :-)
>
> Bob
>
> #!/usr/local/bin/python -w
>
> # import the needed libs
> import sys, string
>
> # make sure the command line arguments are there
> if len(sys.argv) < 3:
>      print "usage: fread.py [log file] [hit file]"
>      sys.exit(1)
>
Unless the exact return code is important, you'll find

    sys.exit("usage: fread.py [log file] [hit file]")

more convenient.
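
A minimal sketch of what that call does, written for a current Python 3
interpreter rather than the Python 2 of the quoted script. Given a string,
sys.exit() raises SystemExit carrying that string; if nothing catches it, the
interpreter prints the string to stderr and exits with status 1. The helper
name check_args is made up for illustration.

```python
# Python 3 syntax (an assumption; the quoted script is Python 2).
import sys

USAGE = "usage: fread.py [log file] [hit file]"

def check_args(argv):
    """Return (log_path, hits_path), or exit with the usage message."""
    if len(argv) < 3:
        # A string argument is printed to stderr and the exit status is 1.
        sys.exit(USAGE)
    return argv[1], argv[2]
```

Called as check_args(sys.argv) at the top of the script, this replaces the
separate print and sys.exit(1) pair.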

> # open the files with some error checking
> try:
>      inFile = open(sys.argv[1],"r")
> except IOError:
>      print "Cannot open log file!\n"
>      sys.exit(1)
>
> try:
>      outFile = open(sys.argv[2],"w")
> except IOError:
>      print "Cannot open hits file!\n"
>      sys.exit(1)
>
Your error checking is exemplary. Normally I'm happy to have a Python
exception explain the error condition, but of course your explicit messages
are great.
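
The two styles can be combined: keep the explicit message, and append the
reason the exception itself reports. The following is only a sketch in
Python 3 syntax (where IOError is an alias of OSError); open_or_exit and its
label parameter are hypothetical names, not part of the original script.

```python
# Python 3 syntax (an assumption; the quoted script is Python 2).
import sys

def open_or_exit(path, mode, label):
    """Open path, or exit with a message that includes the OS-level reason."""
    try:
        return open(path, mode)
    except OSError as exc:
        # exc.strerror is the system's own explanation of the failure.
        sys.exit("Cannot open %s %r: %s" % (label, path, exc.strerror))
```

For example, inFile = open_or_exit(sys.argv[1], "r", "log file") would cover
the first try/except block.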

> # build list of keywords
> kw = [ "some", "words" ]
>
> # loop through the list and print the lines to a file
> for line in inFile.readlines():
>      for badword in kw:
>          if line.find(badword) > -1:
>              result = '%s %s' % (badword, line)
>              print result            # Print the result
>              outFile.write(result)   # Write the result
>
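If you only want to know which lines match at least one of the words, you
could follow the write() with a "break", since there's no need to test the
remaining keywords once one has matched. As written, a line containing
several keywords is written once for each of them.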
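
The effect of the "break" can be sketched as follows, in Python 3 syntax and
with a hypothetical helper name, first_hits. It takes any iterable of lines,
so an open file works as well as a list.

```python
# Python 3 syntax (an assumption; the quoted script is Python 2).
def first_hits(lines, keywords):
    """Yield (keyword, line) once per line, for the first keyword found in it."""
    for line in lines:
        for word in keywords:
            if word in line:
                yield word, line
                break  # one match is enough; skip the remaining keywords
```

A line containing both "some" and "words" is reported once, under "some",
because that keyword comes first in the list.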

Also, the "print" looks like a debugging statement.

> # close the files
> inFile.close()
> outFile.close()
>
> # let me know when it's done
> print "Finished processing file..."
>
Highly commendable. Another approach would be to build a regular expression
from the keywords list and search for the r.e. in each line. The following
code is untested...

import re

kw = ["some", "words"]
# escape each keyword so characters like "." or "*" match literally
kwpat = "|".join([re.escape(word) for word in kw])
pat = re.compile(kwpat)
for line in inFile.readlines():
    if pat.search(line):
        outFile.write(line)

for example.
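
A self-contained variant of the same idea, again only a sketch and in
Python 3 syntax. The function name copy_matching_lines is invented here, and
iterating over the file object instead of calling readlines() is a swapped-in
detail that keeps a 30MB+ log from being loaded into memory all at once.

```python
# Python 3 syntax (an assumption; the quoted script is Python 2).
import re

def copy_matching_lines(in_file, out_file, keywords):
    """Write every line containing any keyword to out_file; return the count."""
    # re.escape keeps characters such as "." or "*" in a keyword literal.
    pattern = re.compile("|".join(re.escape(word) for word in keywords))
    hits = 0
    # Iterating over the file object reads one line at a time.
    for line in in_file:
        if pattern.search(line):
            out_file.write(line)
            hits += 1
    return hits
```

Note that an empty keyword list produces an empty pattern, which matches
every line, so the caller should make sure the list is not empty.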

regards
-----------------------------------------------------------------------
Steve Holden                                  http://www.holdenweb.com/
Python Web Programming                 http://pydish.holdenweb.com/pwp/
Previous .sig file retired to                    www.homeforoldsigs.com
-----------------------------------------------------------------------





