[Tutor] find data in html file

lmac lopoff at gmx.net
Fri Sep 30 14:44:21 CEST 2005


Date: Wed, 28 Sep 2005 09:25:53 +0100
From: Ed Singleton <singletoned at gmail.com>
Subject: Re: [Tutor] find data in html file
To: tutor at python.org
Message-ID: <34bb7f5b0509280125208d435e at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On 27/09/05, lmac <lopoff at gmx.net> wrote:

>> Hi there,
>> I have a basic question. If I want to read some data out of a line in
>> an HTML file, and I know the start tag and the end tag, how do I
>> recognize whether the data spans more than one line?
>>
>> Example:
>>
>> <td>Some text<a href>link</a>text ..... DATA ....</tr></td> etc.
>>
>> I would use >text as the starting tag to locate the beginning of the DATA,
>> and then </tr> as the ending tag of the DATA. But if there is a \n, then
>> the data spans more than one line.
>  
>

Hopefully it's just a typo or something, but you appear to have your
ending </tr> and </td> tags the wrong way round.

You should be closing the cell before you close the row.

How do you want to get the data out?  This case is simple enough that
you could use a lazy (non-greedy) regex for it.  Something
like "<td>([\s\S]+?)</td>" would do it, since [\s\S] also matches newlines.
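
To show how such a non-greedy pattern copes with data that spans more than
one line, here is a minimal Python sketch. The sample string and the name
extract_cells are made up for illustration; real pages will need a pattern
adapted to their own markup.

```python
import re

# Hypothetical sample: the cell content contains a newline.
html = "<tr><td>Some text<a href='#'>link</a>text\nDATA on a second line</td></tr>"

# [\s\S] matches any character, newlines included; +? is non-greedy,
# so each match stops at the first closing </td> it finds.
cell_pattern = re.compile(r"<td>([\s\S]+?)</td>")

def extract_cells(text):
    """Return the raw contents of every <td>...</td> pair in text."""
    return cell_pattern.findall(text)

cells = extract_cells(html)
print(cells)
```

Because the match is lazy, two cells in a row come back as two separate
results instead of one long match running from the first <td> to the last
</td>.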

Ed

It's not that simple. What I am trying to do is fetch pages from ebay.de
for a given article number. Downloading the page for a specific article is no problem,
but getting data such as price, bidders, and shipment without the official eBay API is hard.
Has anyone already written a solution for this?

Thanks anyway. I tried htmllib. It is a very good library, but I can't get it to work because
the data I want is not enclosed in a <tag> of its own, and the library is meant for HTML tags.
I also want to store the data in my own XML files (that is what I am going to do once I have the data).
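
One possible way around the missing tag, offered as a sketch under
assumptions rather than a tested solution for eBay pages: ignore the tags,
collect the text chunks in order, and take the chunk that follows a known
label. The example uses html.parser from the Python 3 standard library (the
successor to htmllib) and xml.etree.ElementTree for the XML output. The
labels "Price:" and "Bids:", the sample markup, and the element names are
invented for illustration and are not taken from a real eBay page.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

class TextCollector(HTMLParser):
    """Collect the non-empty text chunks of a page, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)

def extract_fields(html, labels):
    """Map each known label to the text chunk that directly follows it."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    chunks = parser.chunks
    fields = {}
    for i, chunk in enumerate(chunks[:-1]):
        if chunk in labels:
            fields[labels[chunk]] = chunks[i + 1]
    return fields

def to_xml(fields):
    """Serialize the extracted fields as a small XML document."""
    root = ET.Element("article")
    for name, value in fields.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")

# Hypothetical markup; the value of "Bids:" starts on a new line.
sample = ("<table><tr><td><b>Price:</b></td><td>EUR 12,50</td></tr>\n"
          "<tr><td>Bids:</td><td>\n3</td></tr></table>")
labels = {"Price:": "price", "Bids:": "bids"}  # page label -> XML element name

fields = extract_fields(sample, labels)
print(to_xml(fields))
```

Line breaks inside the page do not matter here, because the parser hands
over the text between tags regardless of how it is wrapped. The weak point
is the assumption that each value directly follows its label, which has to
be checked against the actual page.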






