[Tutor] python module to search a website
Bill Allen
wallenpb at gmail.com
Sun Feb 27 06:04:27 CET 2011
On Sat, Feb 26, 2011 at 21:11, vineeth <vineethrakesh at gmail.com> wrote:
> Hello all,
>
> I am looking for a Python module to search a website and extract
> the URL.
>
> For example, I found a module for Amazon named "amazonproduct". The
> API extracts the data based on the query and even parses the URL data.
> I am looking for more query-search Python modules like it for other
> websites like Amazon.
>
> Any help is appreciated.
>
> Thank You
> Vin
>
I am not sure what url you are trying to extract, or from where, but I can
give you an example of basic web scraping if that is your aim.
The following works for Python 2.x.

# This module gives you the methods needed to read the HTML from a web page
import urllib

# Set a variable to the needed website
mypath = "http://some_website.com"

# Read all the HTML from the page into a list of lines, then go through
# the lines looking for URLs
mylines = urllib.urlopen(mypath).readlines()
for item in mylines:
    if "http://" in item:
        # ...do something with the url that was found in the page html...
        # ...etc...
        pass
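
If you are on Python 3, urllib.urlopen has moved to urllib.request.urlopen, and the line-by-line "http://" test above can miss relative links. Below is a rough sketch of the same idea using only the standard library. The names LinkCollector, extract_links and fetch_links are made up for this illustration, and it assumes the URLs you want are the href values of <a> tags, which may not match what you are actually after.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the address of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs of all <a href=...> links in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def fetch_links(url):
    """Download a page and return its links. Needs network access."""
    with urlopen(url) as response:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_links(html, url)
```

Calling fetch_links(mypath) would then give you a list of URLs to loop over, while extract_links lets you try the parsing on a plain string without touching the network.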
--Bill