Concurrent threads to pull web pages?
python at mrabarnett.plus.com
Fri Oct 2 03:46:33 CEST 2009
Gilles Ganault wrote:
> I recently asked how to pull companies' IDs from an SQLite database,
> have multiple instances of a Python script download each company's web
> page from a remote server, e.g. www.acme.com/company.php?id=1, and use
> regexes to extract some information from each page.
> I need to run multiple instances to save time, since each page takes
> about 10 seconds to be returned to the script/browser.
> Since I've never written a multi-threaded Python script before, to
> save time investigating, I was wondering if someone already had a
> script that downloads web pages and saves some information into a
> Thank you for any tip.
You could put the URLs into a queue and have multiple worker threads
repeatedly get a URL from the queue, download the page, and then put the
page into a second queue for processing by a separate extraction thread.
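
A minimal sketch of that two-queue layout, assuming Python 3 and only the
standard library (queue, threading, urllib.request). The names crawl,
fetch_page, extract and num_workers are made up for illustration, and
reading the IDs from SQLite is left out. A None sentinel tells each thread
when to stop, and a failed download or extraction is stored as the
exception object instead of killing the thread.

```python
import queue
import threading
import urllib.request


def fetch_page(url):
    """Download one page and return it as text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(urls, fetch, extract, num_workers=4):
    """Download urls in worker threads; run extract() in one other thread.

    Returns a dict mapping each URL to the extracted value, or to the
    exception raised while downloading or extracting that page.
    """
    url_queue = queue.Queue()    # URLs waiting to be downloaded
    page_queue = queue.Queue()   # (url, page or exception) waiting for extraction
    results = {}

    def downloader():
        while True:
            url = url_queue.get()
            if url is None:              # sentinel: no more URLs
                break
            try:
                page_queue.put((url, fetch(url)))
            except Exception as exc:     # keep the worker alive on a bad URL
                page_queue.put((url, exc))

    def extractor():
        while True:
            item = page_queue.get()
            if item is None:             # sentinel: all downloads finished
                break
            url, page = item
            if isinstance(page, Exception):
                results[url] = page
                continue
            try:
                results[url] = extract(page)
            except Exception as exc:
                results[url] = exc

    workers = [threading.Thread(target=downloader) for _ in range(num_workers)]
    consumer = threading.Thread(target=extractor)
    for thread in workers + [consumer]:
        thread.start()

    for url in urls:
        url_queue.put(url)
    for _ in workers:                    # one sentinel per download thread
        url_queue.put(None)
    for thread in workers:
        thread.join()

    page_queue.put(None)                 # downloads are done; stop the extractor
    consumer.join()
    return results
```

It could be called along these lines, where the URL list and the regex are
placeholders: crawl(urls, fetch_page, lambda page: re.findall(pattern, page)).
Since each page takes about 10 seconds and the threads spend that time
waiting on the network, the downloads should overlap well despite the GIL.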
This post might help: