[Tutor] Finding a specific line in a body of text

Robert Sjoblom robert.sjoblom at gmail.com
Mon Mar 12 02:56:36 CET 2012


I'm sorry if the subject is vague, but I can't really explain it very
well. I've been away from programming for a while now (I got a
daughter and a year after that a son, so I've been busy with family
matters). As such, my skills are definitely rusty.

In the file I'm parsing, I'm looking for specific lines. I don't know
the content of these lines but I do know the content that appears two
lines before. As such I thought that maybe I'd flag for a found line
and then flag the next two lines as well, like so:

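if keyword in line:
  flag = 1  # keyword seen; the wanted line is two lines further down
  continue
if flag == 1:
  flag = 2  # this is the line between the keyword and the data
  continue
if flag == 2:
  found.append(line)  # found is a list created before the loop
  flag = 0  # reset, otherwise every later line is appended too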

This, however, turned out to be unacceptably slow; this file is 1.1M
lines, and it takes roughly a minute to go through. I have 450 of
these files; I don't have the luxury to let it run for 8 hours.

So I thought that maybe I could use enumerate() somehow, get the index
when I hit keyword and just append the line at index+2; but I realize
I don't know how to do that. File objects don't have an index
function.
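
One way the "index+2" idea could be sketched without an index: a file
object is its own iterator, so next() can pull the following lines as
soon as the keyword is seen. This is only a minimal sketch; the function
name lines_after_keyword, the offset parameter and the in-memory sample
are assumptions made for illustration, not part of the original script.

```python
import io


def lines_after_keyword(lines, keyword, offset=2):
    """Collect the line found `offset` lines after each keyword line."""
    it = iter(lines)  # works for an open file as well as a list
    found = []
    for line in it:
        if keyword in line:
            target = None
            # Advance the same iterator; the last line read is the target.
            for _ in range(offset):
                target = next(it, None)  # None if the input ends early
            if target is not None:
                found.append(target.strip())
    return found


# Hypothetical sample that mimics the structure described in this post.
sample = io.StringIO("\t\tkeyword=\n\t\t{\n5 72 88 77 90 92 \t\t}\nother\n")
print(lines_after_keyword(sample, "keyword="))
```

Because the skipped lines are consumed from the same iterator, they are
never tested against the keyword, so each line of the file is still
read only once.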

For those curious, the data I'm looking for looks like this:
5 72 88 77 90 92
18 80 75 98 84 90
81
12 58 76 77 94 96

There are other parts of the file that contain similar strings of
digits, so I can't just grab any digits I come across either; the only
thing I have to go on is the keyword. It's obvious that my initial
idea was horribly bad (and I knew that as well, but I wanted to first
make sure that I could find what I was after properly). The structure
looks like this (I opted to use \t instead of relying on the tabs
being formatted properly in the email):

\t\tkeyword=
\t\t{
5 72 88 77 90 92 \t\t}
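
Given that structure, the data line still carries the closing brace. A
small sketch of how such a line might be turned into numbers follows;
parse_values is a hypothetical helper, and it assumes the line holds
only whitespace-separated integers followed by the brace.

```python
def parse_values(line):
    """Turn a data line such as '5 72 88 77 90 92 \\t\\t}' into integers."""
    # Replace the closing brace with a space, then split on any whitespace
    # (spaces, tabs, newline) and convert each token.
    return [int(token) for token in line.replace("}", " ").split()]


print(parse_values("5 72 88 77 90 92 \t\t}\n"))
```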

-- 
best regards,
Robert S.

