html parsing? Or just simple regex'ing?
Michael J. Fromberger
Michael.J.Fromberger at Clothing.Dartmouth.EDU
Thu Nov 11 15:05:27 EST 2004
In article <pan.2004.11.10.01.37.41.879705 at dcs.nac.uci.edu>,
Dan Stromberg <strombrg at dcs.nac.uci.edu> wrote:
> I'm working on writing a program that will synchronize one database with
> another. For the source database, we can just use the python sybase API;
> that's nice and normal.
>
> [...]
>
> 1) Would I be better off just regex'ing the html I'm getting back? (I
> suppose this depends on the complexity of the html received, eh?)
>
> 2) Would I be better off feeding the HTML into an HTML parser, and then
> traversing that datastructure (is that really how it works?)?
I recommend you look at BeautifulSoup:
http://www.crummy.com/software/BeautifulSoup/
It is very forgiving of the typical affronts HTML writers put into their
code.
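
To make option (2) from the question concrete, here is a minimal sketch of the "feed the HTML to a parser, then collect what you need" approach. It uses Python's standard-library html.parser as a stand-in, not BeautifulSoup's own API. The sample markup, the CellCollector class, and the extract_rows helper are all hypothetical names made up for illustration, and the sketch assumes the data of interest sits in table cells.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row.

    Hypothetical helper for illustration; tolerates missing </td> tags.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished and in-progress rows
        self._row = None    # row currently being filled, if any
        self._cell = None   # text fragments of the open cell, if any

    def _flush_cell(self):
        # Close the open cell (if any) and store its stripped text.
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, so <TR> and <tr> look the same here.
        if tag in ("tr", "td"):
            self._flush_cell()  # a new cell or row implicitly ends the old cell
        if tag == "tr":
            self._row = []
            self.rows.append(self._row)
        elif tag == "td":
            if self._row is None:  # <td> with no enclosing <tr>
                self._row = []
                self.rows.append(self._row)
            self._cell = []

    def handle_data(self, data):
        # Text may arrive in several chunks; keep only what is inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "tr"):
            self._flush_cell()
        if tag == "tr":
            self._row = None


def extract_rows(html):
    """Return a list of rows, each a list of cell strings."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample with mixed-case tags and unclosed <td> elements.
SAMPLE = """
<table>
<TR><td>alice</td><td> 42 </td></tr>
<tr><td>bob<td>7</tr>
</table>
"""

if __name__ == "__main__":
    for row in extract_rows(SAMPLE):
        print(row)
```

A regex such as `<td>(.*?)</td>` would miss the second row's cells in the sample above, because they are never closed. That kind of sloppiness is where a tolerant parser tends to pay off.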
-M
--
Michael J. Fromberger | Lecturer, Dept. of Computer Science
http://www.dartmouth.edu/~sting/ | Dartmouth College, Hanover, NH, USA