[Tutor] html table parse

Alan Gauld alan.gauld at btinternet.com
Wed Nov 7 23:07:06 CET 2012


On 07/11/12 20:50, Christopher Conner wrote:

> Python has a native HTML parser.
>
> http://docs.python.org/2/library/htmlparser.html

It does, but frankly BS (BeautifulSoup) is much easier to use and more
forgiving. I wouldn't recommend that the OP drop BS to use htmlparser.

To the OP, do you understand HTML? Parsing a table is no different from
parsing a heading or any other tag. You need to understand the structure
of the page you are parsing, but the principle is the same.
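
To illustrate that principle, here is a minimal sketch that pulls the
cell text out of a table. It assumes Python 3 and uses the standard
library's html.parser module rather than BS, purely so that it runs
without installing anything. The TableParser class, the parse_table
helper and the SAMPLE markup are made up for the illustration, and real
pages (nested tables, unclosed tags) will need more care.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # row being built, or None when outside <tr>
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        # A table is handled like any other tag: react when it opens.
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return the table rows found in html as a list of lists of strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample markup, just for the demonstration.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>Ann</td><td>31</td></tr>
  <tr><td>Bob</td><td>27</td></tr>
</table>
"""

rows = parse_table(SAMPLE)
for row in rows:
    print(row)
```

The same three handlers would work for headings or links; only the tag
names being watched change, which is why knowing the page structure
matters more than the choice of tag.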

There is one other parsing library that looks promising but I haven't 
had a chance to use it in anger yet.

It's called pyQuery and is similar in principle to jQuery. It allows you
to search by CSS selectors as well as HTML tags and combinations thereof.
It looks very promising, but I don't know what its performance or
real-world usability is like. If you already know jQuery, though, it
looks like a useful tool.

API here:
http://packages.python.org/pyquery/api.html



-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/


