parallel programming in Python

Devin Jeanpierre jeanpierreda at gmail.com
Thu May 10 08:46:01 EDT 2012


On Thu, May 10, 2012 at 8:14 AM, Jabba Laci <jabba.laci at gmail.com> wrote:
> What's the best way?

From what I've heard, http://scrapy.org/ . It is a single-thread
single-process web crawler that nonetheless can download things
concurrently.
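Scrapy gets that concurrency from Twisted's event loop rather than from
threads. As an illustrative sketch only (Scrapy does not use asyncio;
the stdlib's asyncio just demonstrates the same single-thread,
single-process concurrency model, with asyncio.sleep standing in for
network latency):

```python
import asyncio
import time

# Hypothetical stand-in for a non-blocking HTTP request: the event
# loop runs other fetches while this one "waits" on the network.
async def fetch(url, delay=0.1):
    await asyncio.sleep(delay)
    return "<html for %s>" % url

async def crawl(urls):
    # All fetches run concurrently on one thread in one process;
    # the loop switches between them whenever one is waiting on I/O.
    return await asyncio.gather(*(fetch(u) for u in urls))

urls = ["http://example.com/%d" % i for i in range(5)]
start = time.monotonic()
pages = asyncio.run(crawl(urls))
elapsed = time.monotonic() - start
# The five 0.1 s "downloads" overlap, so the wall time stays near
# 0.1 s instead of the 0.5 s a sequential loop would take.
```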

Doing what you want in Scrapy would probably involve learning about
Twisted, the library Scrapy is built on top of. That is somewhat more
involved than just throwing threads, urllib, and lxml.html together,
although most of the Twisted developers are really helpful. It might
not be worth it to you, depending on the size of the task.
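For comparison, the threads-plus-urllib approach could look something
like this minimal sketch (Python 3 names; the lxml.html parsing step
is omitted, and the injectable `fetch` parameter is just a convenience
I've added for testing):

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request

def fetch(url):
    # Blocking download; each call ties up one worker thread until
    # the response arrives.
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def download_all(urls, fetch=fetch, max_workers=8):
    # Threads overlap while blocked on network I/O, so this gets
    # concurrent downloads out of plain blocking code.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))

# Example with a stubbed fetch, so no network is needed:
pages = download_all(["a", "b", "c"], fetch=lambda u: "page:" + u)
```

Each result comes back in the same order as the input URLs, which
keeps the calling code simple compared with callback-style APIs.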



Dave's answer is pretty general and good though.

-- Devin



More information about the Python-list mailing list