Help with parsing web page

Thomas Guettler guettli at thomas-guettler.de
Tue Jun 15 09:56:43 EDT 2004


On Mon, 14 Jun 2004 17:48:33 +0100, RiGGa wrote:

> Hi,
> 
> I want to parse a web page in Python and have it write certain values out to
> a MySQL database.  I really don't know where to start with parsing the HTML
> code (I can work out the database part).  I have had a look at htmllib
> but I need more info. Can anyone point me in the right direction? A
> tutorial or something would be great.

Hi,

Since HTML can be broken in several ways, I would
pipe the HTML through tidy first. You can use the "-asxml"
option, which converts it to well-formed XHTML, and then
parse the result with an XML parser.

http://tidy.sourceforge.net/
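
A rough sketch of that pipeline, in case it helps. It assumes the
tidy executable is installed and on your PATH. The function names
(tidy_to_xhtml, table_cells) and the choice of table cells as the
values to extract are only illustrations, not anything tidy or
Python prescribes. The "-numeric" option is there so that entities
such as &nbsp; come out as numeric references the XML parser accepts.

```python
import subprocess
import xml.etree.ElementTree as ET

# tidy -asxml emits XHTML, so every element lives in this namespace.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def tidy_to_xhtml(html):
    """Pipe an HTML string through the external tidy program.

    Assumes a `tidy` executable is available on PATH.
    """
    result = subprocess.run(
        ["tidy", "-q", "-asxml", "-numeric", "-utf8"],
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # tidy exits with 1 for warnings (output is still usable)
    # and 2 for errors it could not repair.
    if result.returncode >= 2:
        raise RuntimeError(result.stderr.decode("utf-8", "replace"))
    return result.stdout.decode("utf-8")


def table_cells(xhtml):
    """Return the stripped text of every <td> in an XHTML document."""
    root = ET.fromstring(xhtml)
    return [
        "".join(td.itertext()).strip()
        for td in root.iter(XHTML_NS + "td")
    ]
```

With the page source in a string called html, something like
table_cells(tidy_to_xhtml(html)) would give you a list of strings
that you can then write to the database.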

 Thomas



