Bind threads to addresses -- Windows & urllib?

Steve Holden sholden at
Sat Sep 7 21:00:39 CEST 2002

"Nick Arnett" <narnett at> wrote ...
> > That's not gonna help you at all, because all threads will
> > be capped by the interface's top throughput, or by your
> > machine's processing power.
> You're making an incorrect assumption about the purpose of using multiple
> addresses.  It has nothing to do with my end of the connection; it is to
> cope with servers that regard a reasonably well-behaved spider (in my
> opinion, at least) as a denial-of-service attack.  If the server operators
> would reveal what they regard as well-behaved, I wouldn't have to resort to
> this, but as one might expect, nobody wants to disclose the parameters of
> their DoS defenses.  I can't get the relevant sites even to respond to
> inquiries... and they don't even have a robots.txt file.
> On the other hand, the more I think about this, the less interested I am in
> bothering, since it would surely be easy for a server to block a range of
> addresses.
> I'm also working on the other obvious solution: heuristics for the spider
> to set its own speed so that it won't trigger defenses -- but it appears
> there's something more than simple rules at the other end.  The robot wars
> expected years ago have arrived...
> No lectures on what constitutes good robot behavior, please -- I've been
> operating *the* list on that subject for years.
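On the multiple-address idea itself: urllib gives you no hook for choosing
the source address, but the standard socket module does -- bind the outgoing
socket to a local IP before connecting.  A minimal sketch (local_ip is
assumed to be one of the addresses already configured on the machine, and
HTTP/1.0 keeps the exchange simple):

    import socket

    def fetch(local_ip, host, path="/"):
        # Bind the outgoing socket to one specific local IP before
        # connecting, so each thread's requests leave from a different
        # source address.  Port 0 lets the OS pick an ephemeral port.
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.bind((local_ip, 0))
        s.connect((host, 80))
        s.sendall("GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host))
        chunks = []
        while 1:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
        s.close()
        return "".join(chunks)

Each crawler thread can then be handed a different local_ip to work from.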
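As for self-throttling, one simple heuristic of the kind described is
multiplicative backoff -- a sketch only, not a claim about what the spider
actually does:

    import time

    class Throttle:
        # Double the delay whenever the server pushes back (403/503,
        # connection reset); trim it slowly again after successes.
        def __init__(self, delay=2.0, floor=0.5, ceiling=120.0):
            self.delay, self.floor, self.ceiling = delay, floor, ceiling
        def wait(self):
            time.sleep(self.delay)
        def succeeded(self):
            self.delay = max(self.floor, self.delay * 0.9)
        def rebuffed(self):
            self.delay = min(self.ceiling, self.delay * 2.0)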


Have you considered that it may simply be the lack of the usual headers
produced by a browser that causes the remote servers to reject your traffic?
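
If so, it's cheap to test.  A minimal sketch using urllib2's Request
interface (the header values are illustrative guesses at what a browser
would send, not required ones, and the URL is a placeholder):

    import urllib2

    # Present headers resembling an ordinary browser's rather than
    # the bare defaults a urllib client sends.
    headers = {
        "User-Agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
        "Accept": "text/html, image/gif, image/jpeg, */*",
        "Accept-Language": "en-us",
    }
    req = urllib2.Request("http://www.example.com/", headers=headers)
    body = urllib2.urlopen(req).read()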

Steve Holden                        
Python Web Programming              
Previous .sig file retired to          
