HTML Parsing
Dan Stromberg
dstromberglists at gmail.com
Sat Jun 28 22:38:22 EDT 2008
On Sat, 28 Jun 2008 19:03:39 -0700, disappearedng wrote:
> Hi everyone
> I am trying to build my own web crawler for an experiment and I don't
> know how to access the HTTP protocol with Python.
>
> Also, is there an open-source parsing engine for HTML documents
> available in Python too? That would be great.
Check out BeautifulSoup. I don't recall what license it uses, but the
source is available, and it deals well with
not-necessarily-beautiful-inside HTML.
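
For the crawler side of the question, here is a minimal sketch of fetching a page and pulling out its links. It assumes Python 3 and uses only the standard library (urllib.request and html.parser) as a dependency-free stand-in; it is not BeautifulSoup, which would likely do the same job with less code. The names LinkCollector, extract_links and fetch are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, tolerating sloppy markup."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return absolute URLs for every <a href=...> found in html."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def fetch(url, timeout=10):
    """Download a page and decode it as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # The unclosed <p> and <b> tags are deliberate: real pages are rarely tidy.
    sample = '<p>See <a href="/docs">docs<p><b>and <a href="http://example.org/x">x</a>'
    print(extract_links(sample, "http://example.com/index.html"))
```

A crawler would call fetch(url) on each page, pass the result to extract_links, and queue whatever comes back. A real one should also honour robots.txt and rate-limit its requests, which this sketch does not attempt.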