Concurrent threads to pull web pages?

Fri Oct 2 03:46:33 CEST 2009

Gilles Ganault wrote:
> Hello
> 	I recently asked how to pull companies' ID from an SQLite database,
> have multiple instances of a Python script download each company's web
> page from a remote server, eg., and use
> regexes to extract some information from each page.
> I need to run multiple instances to save time, since each page takes
> about 10 seconds to be returned to the script/browser.
> Since I've never written a multi-threaded Python script before, to
> save time investigating, I was wondering if someone already had a
> script that downloads web pages and saves some information into a
> database.
> Thank you for any tip.

You could put the URLs into a queue and have multiple worker threads
repeatedly get a URL from the queue, download the page, and then put the
page into another queue for processing by another extraction thread.
This post might help:
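
Independently of that post, here is a minimal sketch of the two-queue
approach described above, written for Python 3. The names (crawl,
downloader, extractor, NUM_WORKERS), the worker count, and the <title>
regex are placeholders I made up for illustration; substitute your own
URLs and extraction pattern.

```python
import queue
import re
import threading
import urllib.request

NUM_WORKERS = 4  # assumed value; tune it to what the remote server tolerates
# Placeholder pattern: replace with the regex for the data you actually need.
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.I | re.S)


def fetch(url):
    """Download one page and return its text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def downloader(url_queue, page_queue, fetch_page):
    """Worker: take URLs until a None sentinel arrives, queue each page."""
    while True:
        url = url_queue.get()
        if url is None:
            break
        try:
            page_queue.put((url, fetch_page(url)))
        except Exception:
            # A failed download is recorded as None so one bad URL
            # does not kill the worker thread.
            page_queue.put((url, None))


def extractor(page_queue, results):
    """Single consumer: run the regex over each downloaded page."""
    while True:
        item = page_queue.get()
        if item is None:
            break
        url, html = item
        match = TITLE_RE.search(html) if html is not None else None
        results[url] = match.group(1).strip() if match else None


def crawl(urls, fetch_page=fetch, num_workers=NUM_WORKERS):
    url_queue = queue.Queue()
    page_queue = queue.Queue()
    results = {}
    workers = [
        threading.Thread(target=downloader,
                         args=(url_queue, page_queue, fetch_page))
        for _ in range(num_workers)
    ]
    consumer = threading.Thread(target=extractor, args=(page_queue, results))
    for thread in workers + [consumer]:
        thread.start()
    for url in urls:
        url_queue.put(url)
    for _ in workers:
        url_queue.put(None)   # one sentinel per download worker
    for thread in workers:
        thread.join()
    page_queue.put(None)      # all downloads done: stop the extractor
    consumer.join()
    return results
```

Calling crawl(list_of_urls) returns a dict mapping each URL to the
extracted string, or None if the download failed or nothing matched.
Keeping the extraction in a single thread also makes it a reasonable
place to write to SQLite, since by default a sqlite3 connection may
only be used from the thread that created it.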

