Fetching a bunch of RDF files in an asynchronous, resource-friendly way?
2002 at weholt.org
Tue Feb 18 16:08:33 CET 2003
I have a bunch of RDF files on several different sites that I need to download
several times a day. What is the fastest, most efficient way of doing this
while keeping bandwidth use to a minimum? The fetched data will be kept in
memory between fetches, and if possible I'd like to check the HTTP headers
for modification times so I can skip files that have not been updated. Is
there some asynchronous or threaded way this can be done? I have a feeling
that my approach of having a bunch of threads or doing them in serial is not
the best one. Using something like Twisted is OK, since the code will be used
in that framework. A pure Python way is preferred, but speed and low resource
cost win if there is more than one solution.
Any clues or hints are appreciated.
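
To make the header check concrete, here is a rough sketch of the kind of thing
being asked about, not a recommended or final design. It assumes a current
Python 3 with only the standard library, and that the servers honour the
standard If-Modified-Since / If-None-Match request headers by answering
304 Not Modified. The names conditional_headers, fetch_if_modified and
refresh_all, the cache layout, and the worker count are all made up for
illustration.

```python
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def conditional_headers(entry):
    """Build validator headers from a cached entry (entry may be None)."""
    headers = {}
    if entry:
        if entry.get("last_modified"):
            headers["If-Modified-Since"] = entry["last_modified"]
        if entry.get("etag"):
            headers["If-None-Match"] = entry["etag"]
    return headers


def fetch_if_modified(url, cache, opener=urllib.request.urlopen, timeout=30):
    """Fetch url unless the server reports it unchanged.

    Returns (url, changed). The in-memory cache (a plain dict keyed by URL)
    is only updated when a new body is actually received.
    """
    request = urllib.request.Request(
        url, headers=conditional_headers(cache.get(url))
    )
    try:
        with opener(request, timeout=timeout) as response:
            cache[url] = {
                "body": response.read(),
                "last_modified": response.headers.get("Last-Modified"),
                "etag": response.headers.get("ETag"),
            }
            return url, True
    except urllib.error.HTTPError as err:
        # urllib reports 304 Not Modified as an HTTPError; keep the old copy.
        if err.code == 304:
            return url, False
        raise


def refresh_all(urls, cache, max_workers=8, opener=urllib.request.urlopen):
    """Refresh all URLs with a small thread pool; returns {url: changed}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda u: fetch_if_modified(u, cache, opener), urls)
        return dict(results)
```

Calling refresh_all(list_of_urls, cache) a few times a day with the same cache
dict would then only transfer bodies for files the server reports as changed.
The thread pool is just one pure-Python option; the same conditional request
logic could sit behind an asynchronous client such as Twisted instead.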