Extract information from HTML table

Ulysse maxime.p at gmail.com
Sun Apr 1 10:56:04 EDT 2007


On Apr 1, 2:52 pm, irs... at gmail.com wrote:
> On Apr 1, 3:13 pm, "Ulysse" <maxim... at gmail.com> wrote:
>
> > Hello,
>
> > I'm trying to extract the data from an HTML table. Here is part of
> > the HTML source:
>
> > ....
>
> > Do you know a way to do it?
>
> Beautiful Soup is an easy way to parse HTML (that may be broken).
> http://www.crummy.com/software/BeautifulSoup/
>
> Here's a start of a parser for your HTML:
>
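> from BeautifulSoup import BeautifulSoup  # Beautiful Soup 3, Python 2
>
> soup = BeautifulSoup(txt)
> for tr in soup('tr'):
>     # skip the first cell; assumes the remaining two are date and text
>     dateTd, textTd = tr('td')[1:]
>     print 'Date :', dateTd.contents[0].strip()
>     print textTd  # element still needs parsing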
>
> where txt is the string in your message.

I have read the Beautiful Soup online documentation and tried to apply
it to my problem, but it seems a bit hard. I would rather try to do
this with regular expressions...
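
For reference, a minimal sketch of what the regular-expression route
might look like. The HTML source was elided from this thread, so the
sample table below (an icon cell, a date cell, a text cell) is purely an
assumption, and the names SAMPLE_HTML, ROW_RE, CELL_RE, TAG_RE and
extract_rows are made up for illustration. It is written for Python 3
and uses only the standard re module.

```python
import re

# Hypothetical sample rows: the real HTML was elided from the thread,
# so this three-cell layout (icon, date, text) is only an assumption.
SAMPLE_HTML = """
<table>
  <tr><td><img src="a.gif"></td><td> 01/04/2007 </td><td>First <b>entry</b></td></tr>
  <tr><td><img src="b.gif"></td><td> 02/04/2007 </td><td>Second entry</td></tr>
</table>
"""

# Non-greedy matches so each pattern stops at the first closing tag.
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.IGNORECASE | re.DOTALL)
CELL_RE = re.compile(r"<td\b[^>]*>(.*?)</td>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def extract_rows(html):
    """Return one list of cell texts per <tr>, with inner tags stripped."""
    rows = []
    for row_match in ROW_RE.finditer(html):
        cells = CELL_RE.findall(row_match.group(1))
        rows.append([TAG_RE.sub("", cell).strip() for cell in cells])
    return rows


if __name__ == "__main__":
    # Unpacking into three names relies on the assumed three-cell layout.
    for icon, date, text in extract_rows(SAMPLE_HTML):
        print("Date :", date)
        print("Text :", text)
```

This only holds up while the markup is regular: a nested table, a
missing </td>, or a ">" inside an attribute value would confuse these
patterns, which is the kind of breakage a real parser such as Beautiful
Soup tolerates.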




More information about the Python-list mailing list