[Tutor] Searching in a file

Hugo Arts hugo.yoshi at gmail.com
Wed Jan 13 22:20:01 CET 2010


On Wed, Jan 13, 2010 at 6:49 PM, Paul Melvin
<paul at assured-networks.co.uk> wrote:
> Hi,
>
> I have a file generated from a webpage.
>
> I want to search that file for a specific keyword, in my case 'NEW'.
>
> Once I have found that keyword I want to retrieve information below it, e.g.
> web link, size of file etc.
>
> When I have this information I move off until I find another 'NEW' and the
> process starts all over.
>
> Please can someone give me some pointers on how to do this.
>
> I can find the line containing 'NEW' which I get using a simple for loop on
> every line in the file, but how do I then traverse the next x amount of
> lines, taking what I want until either the next 'NEW' or eof.
>
> e.g.
>
> for line in file:
>        if re.findall('NEW', line):     # or search
>                list.append(line)               # to do something to later
>
> I cannot 'get' to the following lines because I would need to get out of the
> loop.
>

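The most obvious answer would be to take a look at the 'next()'
function. Calling it on the file object inside your loop fetches the
following line without leaving the loop, which should solve this
immediate problem.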
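
A minimal sketch of the next() idea, assuming each 'NEW' is followed by
a fixed number of interesting lines. The function name, the count
parameter and the sample data are made up for illustration, and
io.StringIO stands in for the real file:

```python
import io

def collect_after_new(lines, count=2):
    """Return (keyword_line, following_lines) pairs for each 'NEW' line."""
    it = iter(lines)                 # the for loop and next() share this iterator
    results = []
    for line in it:
        if 'NEW' in line:
            following = []
            for _ in range(count):
                nxt = next(it, None)     # None instead of StopIteration at eof
                if nxt is None:
                    break
                following.append(nxt.rstrip('\n'))
            results.append((line.rstrip('\n'), following))
    return results

# Hypothetical sample data standing in for the generated file.
sample = io.StringIO(
    "header\nNEW\nhttp://example.com/a\n10 KB\nother\nNEW\nhttp://example.com/b\n"
)
records = collect_after_new(sample)
```

Note that lines pulled with next() are not seen by the for loop again, so
a 'NEW' that appears within the count lines after another 'NEW' would be
swallowed as data in this sketch.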

Another approach is to set a variable, foundnew = True, after you have
found the word 'NEW', and continue. Then, in your loop, if foundnew is
True, you could process the line for your additional data. Then, after
processing, foundnew could be set back to False.
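
A rough sketch of the flag approach. It assumes, purely for
illustration, that a blank line marks the end of the data belonging to
a 'NEW'; your file may need a different test for when to reset the
flag. The function name and sample data are invented:

```python
import io

def collect_with_flag(lines):
    """Group the lines that follow each 'NEW' into one list per 'NEW'."""
    records = []
    foundnew = False
    for line in lines:
        line = line.strip()
        if 'NEW' in line:
            foundnew = True
            records.append([])           # start a fresh record
        elif foundnew:
            if line == '':
                foundnew = False         # assumed end-of-record marker
            else:
                records[-1].append(line)
    return records

# Hypothetical sample data standing in for the generated file.
sample = io.StringIO(
    "header\nNEW\nhttp://example.com/a\n10 KB\n\nignored\nNEW\nhttp://example.com/b\n"
)
flag_records = collect_with_flag(sample)
```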

Yet another approach is to abandon the for loop entirely, and use a
while loop combined with the readline method, which allows you to read
lines wherever you like. This is a similar solution to the one using
next() above.
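
One way the while/readline version might look, reading until the next
'NEW' or eof as described in the question. The function name and sample
data are hypothetical, and io.StringIO again stands in for the file:

```python
import io

def read_records(f):
    """Return one list of detail lines per 'NEW' found in f."""
    records = []
    line = f.readline()
    while line:                          # readline() returns '' at eof
        if 'NEW' in line:
            details = []
            line = f.readline()
            # keep reading until the next 'NEW' or the end of the file
            while line and 'NEW' not in line:
                details.append(line.rstrip('\n'))
                line = f.readline()
            records.append(details)
            # line now holds the next 'NEW' (or ''), so do not read again here
        else:
            line = f.readline()
    return records

# Hypothetical sample data standing in for the generated file.
sample = io.StringIO(
    "header\nNEW\nhttp://example.com/a\n10 KB\nNEW\nhttp://example.com/b\n"
)
readline_records = read_records(sample)
```

Because the inner loop stops on the line containing the next 'NEW'
without consuming it, the outer loop picks that line up on its next
pass, so back-to-back records are handled.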

I would personally prefer solutions one or three. Number two seems
like it could get complicated when you need to process multiple lines
of information after every NEW.

Hugo

