[Baypiggies] web scraping best practice question
hyperneato at gmail.com
Mon Nov 2 20:22:11 CET 2009
I wrote a Python script to send a query to a single website. I am
curious: what is the best practice for the rate of sending requests
when scraping a single site? I'll have about 4000 requests.
I thought about _politely_ writing:
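    import random, time
    for x in large_query_list:
        t = random.randint(1, 5)  # whole seconds, 1 to 5 inclusive
        time.sleep(t)             # wait before sending the next request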
to pause for a pseudo-random duration between each request, so I don't
put strain on anyone's network. Does anyone have recommendations for
best practices regarding the rate of sending a set of queries? I missed
the talk about web scraping from the beginning of the year.
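
For concreteness, here is a minimal sketch of the loop described above. The names send_politely, estimated_hours, send, min_delay and max_delay are hypothetical and only for illustration; send stands in for whatever function actually issues one query. The 1 to 5 second range is the one proposed above, not an established recommendation.

```python
import random
import time


def send_politely(queries, send, min_delay=1.0, max_delay=5.0, sleep=time.sleep):
    """Call send(query) for each query, pausing a random time between calls.

    `send` is a placeholder for the function that issues one request.
    `sleep` can be swapped out so the loop can be tried without waiting.
    """
    results = []
    for i, query in enumerate(queries):
        if i > 0:
            # Pause between requests only, so nothing waits before the first one.
            sleep(random.uniform(min_delay, max_delay))
        results.append(send(query))
    return results


def estimated_hours(n_requests, min_delay=1.0, max_delay=5.0):
    """Expected total pause time in hours, ignoring time spent in the requests."""
    mean_delay = (min_delay + max_delay) / 2
    return (n_requests - 1) * mean_delay / 3600


if __name__ == "__main__":
    # Dry run: record the pauses instead of actually sleeping.
    pauses = []
    replies = send_politely(["q1", "q2", "q3"], lambda q: q.upper(), sleep=pauses.append)
    print(replies, len(pauses))
    print(round(estimated_hours(4000), 2), "hours of pauses for 4000 requests")
```

With an average pause of 3 seconds, 4000 requests would spend a little over 3.3 hours just waiting, so the chosen range mostly determines how long the whole run takes.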