[FEEDBACK] Is this script efficient...is there a better way?
Sean 'Shaleh' Perry
shalehperry at attbi.com
Wed Sep 11 17:53:23 EDT 2002
On Wednesday 11 September 2002 14:01, Bob X wrote:
>
> # build list of keywords
> kw = [ "some", "words" ]
>
> # loop through the list and print the lines to a file
> for line in inFile.readlines():
>     for badword in kw:
>         if line.find(badword) > -1:
>             result = '%s %s' % (badword, line)
>             print result # Print the result
>             outFile.write(result) # Write the result
>
> # close the files
> inFile.close()
> outFile.close()
>
> # let me know when it's done
> print "Finished processing file..."
1) readlines() loads the entire file into a list, so if you have a 30+ MB file
you just ate 30+ MB of memory. Try using xreadlines() instead; it reads the
file line by line and is much more memory friendly.
2) Do you expect to find more than one keyword in a particular line? If not,
you could save some iterations by stopping the inner line.find() loop as soon
as one keyword is found.
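
Here is a rough sketch of both suggestions together, in case it helps. It is
written for current Python, where xreadlines() no longer exists and looping
over the file object itself gives the same line-at-a-time behavior. The
function name matching_lines and the io.StringIO stand-ins for inFile and
outFile are my own inventions for illustration, not part of your script. Note
that with the break, a line containing several keywords is reported only once,
under the first keyword that matches.

```python
import io


def matching_lines(lines, keywords):
    """Yield 'keyword line' for the first keyword found in each line."""
    for line in lines:              # a file object hands back one line at a time
        for badword in keywords:
            if badword in line:     # same test as line.find(badword) > -1
                yield '%s %s' % (badword, line)
                break               # stop checking keywords after the first hit


# Hypothetical sample data standing in for the real input and output files.
in_file = io.StringIO("some text here\nnothing to see\nwords and some more\n")
out_file = io.StringIO()

for result in matching_lines(in_file, ["some", "words"]):
    out_file.write(result)

output = out_file.getvalue()
```

With real files you would pass the opened input file straight to
matching_lines() and write to the opened output file; only one line is held
in memory at a time.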
As a final comment, be aware that the more keywords you look for, the slower
this will be. With this approach there is no way around that; it is just
something to keep in mind.