[Tutor] HTML Parsing

Mitesh H. Budhabhatti mitesh.budhabhatti at gmail.com
Fri May 30 07:39:55 CEST 2014


Alan Gauld, thanks for the reply.  I'll try that out.

Warm Regards,
*Mitesh H. Budhabhatti*
Cell# +91 99040 83855


On Wed, May 28, 2014 at 11:19 PM, Danny Yoo <dyoo at hashcollision.org> wrote:

> > I am using Python 3.3.3 on Windows 7.  I would like to know what is the
> > best method to do HTML parsing?  For example, I want to connect to
> > www.yahoo.com and get all the tags and their values.
>
>
> For this purpose, you may want to look at the APIs that the search
> engines provide, rather than try to web-scrape the human-focused web
> pages.  Otherwise, your program will probably be fragile to changes in
> the structure of the web site.
>
>
> A search for search APIs comes up with hits like this:
>
>     https://developer.yahoo.com/boss/search/
>
>     https://developers.google.com/web-search/docs/#fonje_snippets
>
>     http://datamarket.azure.com/dataset/bing/search
>
>     https://pypi.python.org/pypi/duckduckgo2
>
>
> If you can say more about what you're planning to do, perhaps someone
> has already provided a programmatic interface to it.
>
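
For reference, the original question (getting all the tags and their
values from a page) can be sketched with the standard library's
html.parser module, which is available in Python 3.3.  This is only a
minimal sketch and not something proposed in the thread: the TagCollector
class, the collect_tags helper, and the SAMPLE string are hypothetical
names made up for illustration, and it parses a literal string rather
than a live page.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record every start tag with its attributes, plus non-blank text."""

    def __init__(self):
        super().__init__()
        self.tags = []   # list of (tag_name, {attribute: value}) pairs
        self.text = []   # list of stripped, non-empty text fragments

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        self.tags.append((tag, dict(attrs)))

    def handle_data(self, data):
        # skip whitespace-only runs between tags
        if data.strip():
            self.text.append(data.strip())


def collect_tags(html):
    """Hypothetical helper: parse an HTML string, return (tags, text)."""
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return parser.tags, parser.text


# Illustrative input only; a real page would be much larger and messier.
SAMPLE = ('<html><body><h1 class="title">Hello</h1>'
          '<a href="http://example.com">link</a></body></html>')

tags, text = collect_tags(SAMPLE)
for name, attributes in tags:
    print(name, attributes)
print(text)
```

The same collect_tags call could be fed the decoded body of a page
fetched with urllib.request.urlopen, but as noted above, code that
depends on a site's markup is likely to break whenever the site changes
its layout.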