Concurrent threads to pull web pages?
exarkun at twistedmatrix.com
Thu Oct 1 21:33:18 EDT 2009
On 1 Oct, 09:28 am, nospam at nospam.com wrote:
>Hello
>
>I recently asked how to pull companies' IDs from an SQLite database,
>have multiple instances of a Python script download each company's web
>page from a remote server, e.g. www.acme.com/company.php?id=1, and use
>regexes to extract some information from each page.
>
>I need to run multiple instances to save time, since each page takes
>about 10 seconds to be returned to the script/browser.
>
>Since I've never written a multi-threaded Python script before, to
>save time investigating, I was wondering if someone already had a
>script that downloads web pages and saves some information into a
>database.
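There's no need to use threads for this. The script spends nearly all
of its time waiting on the network, so a single-threaded event loop can
keep many requests in flight at once. Have a look at Twisted: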
http://twistedmatrix.com/trac/
Here's an example of how to use the Twisted HTTP client:
http://twistedmatrix.com/projects/web/documentation/examples/getpage.py
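To make the no-threads idea concrete, here is a minimal sketch. It does
not use Twisted: it uses the standard library's asyncio as a stand-in so
that it runs without extra installs, but the shape is the same (one
thread, one event loop, many requests waiting at once). Everything named
here is an assumption for illustration: fake_fetch simulates the slow
server instead of making a real HTTP request, NAME_RE is a made-up
pattern since the real page markup isn't shown, and scrape/run are
hypothetical helpers.

```python
import asyncio
import re

# Hypothetical pattern: the real pages' markup is not shown in the thread.
NAME_RE = re.compile(r"<h1>(.*?)</h1>")


async def fake_fetch(company_id):
    """Stand-in for a non-blocking HTTP request (hypothetical page body)."""
    await asyncio.sleep(0.1)  # simulates a slow server without blocking
    return "<h1>Company %d</h1>" % company_id


async def scrape(company_ids, fetch=fake_fetch, limit=10):
    """Fetch all pages on one thread, with at most `limit` in flight."""
    gate = asyncio.Semaphore(limit)

    async def one(company_id):
        async with gate:  # caps concurrent requests to the server
            page = await fetch(company_id)
        match = NAME_RE.search(page)
        return company_id, (match.group(1) if match else None)

    # gather() starts every task and waits for all of them to finish.
    return await asyncio.gather(*(one(cid) for cid in company_ids))


def run(company_ids, fetch=fake_fetch, limit=10):
    """Synchronous entry point returning {company_id: extracted_text}."""
    return dict(asyncio.run(scrape(company_ids, fetch, limit)))
```

To try it against real pages, you would pass your own coroutine as
fetch (one that performs the HTTP request without blocking) and write
the returned dict into SQLite once run() returns. The limit argument is
there so the remote server isn't hit with every request at once.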
Jean-Paul