[Tutor] Searching in a file

Kent Johnson kent37 at tds.net
Fri Jan 15 13:56:06 CET 2010


On Fri, Jan 15, 2010 at 4:24 AM, Paul Melvin
<paul at assured-networks.co.uk> wrote:
> Hi,
>
> Thanks very much for all your suggestions; I am looking into those from
> Hugo and Alan.
>
> The file is not very big, only 700KB (~20000 lines), which I think should be
> fine to load into memory?
>
> I have two further questions, though. The lines look like this:
>
>                                <img width="13" height="15" alt="NEW"
> src="/m/I/I/star.png" />
>                        <strong><a href="/browse/post/5354361/">Revenge
> (2011)</a></strong>
>
> </td>
> <td class="final">
>                        <span title="Exact date/time: 05-01-2011 23:08"
> class="ageVeryNew">5 days </span>
> </td>
> <td class="final">
>                        <span title="Exact date/time: 18-01-2011 16:06"
> class="ageVeryNew">65 minutes </span>
>
> Etc., with each chunk (between one NEW and the next) being about 60 lines. I
> need to extract info from these lines, e.g. /browse/post/5354361/ and
> Revenge (2011), to pass back to the output. Is re the best option to get all
> these various bits, maybe with a generic function that I pass the search
> strings to?

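Since the file is HTML, you might be better off using an HTML parser such as
BeautifulSoup or lxml rather than re. A parser copes with tags and link text
that wrap across lines, as they do in your sample.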
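
As a rough illustration of the parser approach, here is a minimal sketch that
uses the standard library's html.parser module as a stand-in, so that it runs
without installing anything. The names PostLinkParser and extract_posts are
made up for this example, and the "/browse/post/" prefix is an assumption
taken from the sample lines quoted above; the real file may need a different
filter.

```python
from html.parser import HTMLParser


class PostLinkParser(HTMLParser):
    """Collect (href, text) pairs for <a> tags whose href starts with a prefix.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, prefix="/browse/post/"):
        super().__init__()
        self.prefix = prefix   # assumed from the sample HTML in the question
        self.links = []        # finished (href, text) pairs
        self._href = None      # href of the <a> we are currently inside, if any
        self._text = []        # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.startswith(self.prefix):
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a matching <a> tag.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse the line break and indentation inside the link text.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def extract_posts(html):
    """Return a list of (href, title) tuples found in the HTML string."""
    parser = PostLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


sample = '''
    <img width="13" height="15" alt="NEW"
src="/m/I/I/star.png" />
    <strong><a href="/browse/post/5354361/">Revenge
(2011)</a></strong>
'''
print(extract_posts(sample))
# prints [('/browse/post/5354361/', 'Revenge (2011)')]
```

For the real file, the whole 700KB can be read into a string and passed to
extract_posts in one go. BeautifulSoup or lxml should do the same job with
less code and are generally more forgiving of malformed markup; the class
above only shows the idea.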

Kent

