Generic web parser

S.Selvam s.selvamsiva at gmail.com
Sat May 16 15:18:13 CEST 2009


Hi all,

I have to design a web parser that will visit a given list of websites and
fetch a particular set of details from each.
It has to be generic enough that when we add new websites, it still fetches
those details wherever they appear on the page.
So it needs to be something like a framework.
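
A rough sketch of what such a framework could look like, assuming each
"detail" is handled by a small pluggable function that receives the raw HTML
and returns a value or None. All names here (run_parser, fetch_url,
find_email, find_title) are hypothetical and chosen only for illustration;
the regular expressions are deliberately simplistic.

```python
import re
import urllib.request
from typing import Callable, Dict, Iterable, Optional

# Hypothetical type: an extractor takes raw HTML and returns a value or None.
Extractor = Callable[[str], Optional[str]]


def fetch_url(url: str) -> str:
    """Download a page and decode it leniently (assumes UTF-8)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def find_email(html: str) -> Optional[str]:
    """Return the first thing that looks like an e-mail address."""
    match = re.search(r"[\w.+-]+@[\w-]+\.[\w.-]+", html)
    return match.group(0) if match else None


def find_title(html: str) -> Optional[str]:
    """Return the contents of the <title> tag, if there is one."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.I | re.S)
    return match.group(1).strip() if match else None


def run_parser(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    extractors: Dict[str, Extractor],
) -> Dict[str, Dict[str, str]]:
    """Apply every extractor to every site; keep only the details found."""
    results: Dict[str, Dict[str, str]] = {}
    for url in urls:
        html = fetch(url)
        found: Dict[str, str] = {}
        for field, extract in extractors.items():
            value = extract(html)
            if value is not None:
                found[field] = value
        results[url] = found
    return results
```

With this shape, adding a new website only means adding its URL to the list,
and adding a new detail only means registering one more extractor. Passing
the fetch function in as a parameter (fetch_url for real use) also lets the
extractors be tried against saved pages without touching the network.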

I have written some parsers before, but they only parse a known format (for
example, they get the data from the <title> tag). Here, each website may have a
different format, and the information may be available within any tag.
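
One possible way to cope with the "any tag" problem is to ignore the page
layout and search every piece of visible text for a pattern, recording which
tag it was found in. The sketch below is only an illustration under that
assumption: it uses the standard library's html.parser so that it runs
without extra installs, and the names TextCollector and find_anywhere are
made up for this example.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects visible text along with the innermost tag containing it."""

    SKIP = {"script", "style"}                       # not visible to a reader
    VOID = {"br", "img", "hr", "input", "meta", "link"}  # never have end tags

    def __init__(self):
        super().__init__()
        self._stack = []   # currently open tags, outermost first
        self.chunks = []   # list of (tag, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags.
        if tag in self._stack:
            while self._stack and self._stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text or any(t in self.SKIP for t in self._stack):
            return
        tag = self._stack[-1] if self._stack else ""
        self.chunks.append((tag, text))


def find_anywhere(html, pattern):
    """Return (tag, matched_text) for every match in any visible text."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    regex = re.compile(pattern)
    hits = []
    for tag, text in collector.chunks:
        for match in regex.finditer(text):
            hits.append((tag, match.group(0)))
    return hits
```

For instance, calling find_anywhere(page, r"\b\d{3}-\d{4}\b") would report a
phone-like number whether the site puts it in a <div>, a <td>, or a <b>,
while skipping anything inside <script> or <style>. The same idea should
carry over to other HTML libraries, since it only needs the text nodes and
their parent tags.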

I know it's a tough task for me, but I feel it should be possible with Python.
My request is: if such a thing is already available, please let me know. Your
suggestions are also welcome.

Note: I planned to use BeautifulSoup for parsing.

-- 
Yours,
S.Selvam