[BangPypers] Website change tracker

Bhavya bhavya.mayur at gmail.com
Fri Jun 8 19:51:40 CEST 2012


Thanks, everyone :) Much appreciated.
I will work on it and let the group know how it goes.

Thanks,
Bhavya

On Fri, Jun 8, 2012 at 1:06 PM, vid <vid at svaksha.com> wrote:

> On Fri, Jun 8, 2012 at 4:09 PM, kracethekingmaker
> <kracethekingmaker at gmail.com> wrote:
> >
> >> Hello,
> >>
> >> I am a newbie to Python coding, and I have a question. I want to write a
> >> script which will check for content changes in websites and send an
> >> e-mail to an admin whenever there are changes.
> >
> > How often will this check be performed, i.e. how many times a day?
> >
> > You should look into md5 and diff utilities for detecting the changes;
> > for web scraping, the Scrapy library is advised.
> >
> >> Ideally this script/program should be scalable to, say, about 1000
> >> websites at a time.
>
> 1000 sites at a time? Wow, that's huge. Scraping that many sites is
> resource intensive and would need a nice big, stable server that can
> handle the huge data dumps. Fwiw, out of the box Scrapy just dumps the
> scraped data to files such as JSON, so read up a little on the database
> you want to use, the frontend to serve it, a queueing system to scale to
> 1000 sites, etc. Also, some sites instantly ban scrapers. Watch out for
> that, and good luck :)
>
> --
> Regards,
> Vid
> http://svaksha.com
> _______________________________________________
> BangPypers mailing list
> BangPypers at python.org
> http://mail.python.org/mailman/listinfo/bangpypers
>
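For anyone finding this thread later, the md5-plus-diff idea suggested above could be sketched roughly as follows. This is only a minimal illustration using the standard library, not anything posted in the thread: the function names (fingerprint, detect_change, build_alert) and the in-memory `seen` dictionary are assumptions made up for the example, and a real setup for 1000 sites would want a proper database and a queue, as Vid points out.

```python
import difflib
import hashlib
from email.message import EmailMessage


def fingerprint(content: bytes) -> str:
    """Return the MD5 hex digest of a page body (change detection only, not security)."""
    return hashlib.md5(content).hexdigest()


def detect_change(url, content, seen):
    """Compare `content` with the last copy stored for `url`.

    `seen` is a hypothetical in-memory store mapping url -> (digest, text).
    Returns a unified diff string if the page changed, or None if the page
    is new or unchanged. The store is updated on every call.
    """
    text = content.decode("utf-8", errors="replace")
    digest = fingerprint(content)
    previous = seen.get(url)
    seen[url] = (digest, text)
    if previous is None or previous[0] == digest:
        return None
    diff = difflib.unified_diff(
        previous[1].splitlines(),
        text.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
    )
    return "\n".join(diff)


def build_alert(url, diff, sender, recipient):
    """Build (but do not send) a notification e-mail carrying the diff."""
    msg = EmailMessage()
    msg["Subject"] = f"Content changed: {url}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(diff)
    return msg
```

The page body could be fetched with urllib.request.urlopen(url).read() and the message handed to smtplib.SMTP(host).send_message(msg); both are left out here so the sketch runs without network access. Note that pages with rotating ads or timestamps will change their hash on every fetch, so in practice one would probably hash only the part of the page that matters.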

