extracting html table rows into a list

Walter Dörwald walter at livinglogic.de
Thu Nov 22 12:16:41 EST 2001


damien Wetzel wrote:

> Hi,
> does anybody have a script which parses a big table from an HTML file
> and creates a list of rows?


You could give XIST a try (http://www.livinglogic.de/Python/xist/).

Code might look like this:

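from xist import parsers
from xist.ns import html

# Parse the page at the given URL into an XIST tree
doc = parsers.parseTidyURL("http://www.freshmeat.net/",
                           defaultEncoding="latin-1")

# First <table> found anywhere in the document
firsttable = doc.find(type=html.table, searchchildren=1)[0]

# The <tr> elements found in that table
rows = firsttable.find(type=html.tr)

# Print the text content of each row
for row in rows:
    print row.asPlainString()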
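
If installing XIST is not an option, roughly the same idea can be sketched with
only the standard library. This is not XIST and not the code above: it is a
minimal Python 3 sketch built on html.parser, and the names RowCollector,
table_rows and the sample HTML are made up for illustration. It assumes you
want each row as a list of cell strings, and it flattens nested tables rather
than treating them separately.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text chunks of the cell being read, or None

    def _close_cell(self):
        # Finish the open cell (if any) and add it to the current row.
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def _close_row(self):
        # Finish the open row (if any), including a still-open cell.
        self._close_cell()
        if self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()       # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()      # tolerate a missing </td> or </th>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()


def table_rows(html_text):
    """Return all table rows in html_text as a list of lists of strings."""
    parser = RowCollector()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Hypothetical sample input, just to show the shape of the result
sample = ("<table><tr><th>Name</th><th>Size</th></tr>"
          "<tr><td>foo</td><td>12</td></tr></table>")
for row in table_rows(sample):
    print(row)
```

To run it against a real page, the text could be fetched with
urllib.request.urlopen(...).read().decode(...) and passed to table_rows(); the
right encoding depends on the page.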

HTH,
   Walter Dörwald





