parsing a long text file for specific text
Tim Roberts
timr at probo.com
Wed Jan 30 02:37:27 EST 2002
"Jim Ragsdale" <overlord at netdoor.com> wrote:
>
>Before if I have done anything like this, I used a loop to check to see if
>it matched a piece of text, but this takes a while. Is there a better way?
>Any thoughts? Any comments would be appreciated, thanks!
Here's one comment. There are several ways to scan through all the lines
of a file. The most obvious is this:
    f = open('filexxx','r')
    for ln in f.readlines():
        ...
The problem with this is that readlines() reads the ENTIRE file into a list
in memory before the loop sees the first line, and only then starts feeding
the lines into the loop one at a time. This tends to be the slowest method
for long files.
One alternative is to read the file in smaller-sized chunks:
    while 1:
        chunk = f.readlines(100000)
        if not chunk: break
        for ln in chunk:
            ...
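The argument to readlines() is a size hint: each call returns whole lines
totalling roughly that many bytes, and an empty list at end of file. This
performs much better because you've reduced memory thrashing, although
it's not as "pretty". Recently, another option was added: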
    for ln in f.xreadlines():
        ...
xreadlines() hands the loop one line at a time as it goes, instead of
building the whole list first. In my benchmarks, xreadlines usually comes
out as the winner.
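
To tie this back to the original question (pulling specific text out of a
long file), here is a rough sketch of the chunked and line-at-a-time
approaches wrapped up as functions. The function names, the `needle`
argument and the default size hint are made up for illustration. It also
assumes a newer Python than the one this post was written against: it uses
`with`, substring tests with `in`, and iterates over the file object
directly, which as far as I know is what replaced xreadlines() in later
versions.

```python
def find_lines_chunked(path, needle, sizehint=100000):
    """Return lines containing needle, reading roughly sizehint bytes
    worth of whole lines per readlines() call (hypothetical helper)."""
    matches = []
    with open(path, 'r') as f:
        while True:
            chunk = f.readlines(sizehint)
            if not chunk:
                # An empty list means end of file.
                break
            for ln in chunk:
                if needle in ln:
                    matches.append(ln.rstrip('\n'))
    return matches


def find_lines_lazy(path, needle):
    """Return lines containing needle, iterating over the file object
    so lines are produced one at a time (hypothetical helper)."""
    matches = []
    with open(path, 'r') as f:
        for ln in f:
            if needle in ln:
                matches.append(ln.rstrip('\n'))
    return matches
```

Both return the same list of matching lines; the only difference is how the
file is pulled into memory. If the pattern is more than a fixed string, the
`needle in ln` test is the place to swap in a compiled regular expression.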
--
- Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.