Bind threads to addresses -- Windows & urllib?

Steve Holden sholden at holdenweb.com
Sat Sep 7 21:00:39 CEST 2002


"Nick Arnett" <narnett at mccmedia.com> wrote ...
>
>
> --
> Nick Arnett
> Phone/fax: (408) 904-7198
> narnett at mccmedia.com
>
> > That's not gonna help you at all, because all threads will
> > be capped by the interface's top throughput, or by your
> > machine's processing power.
>
> You're making an incorrect assumption about the purpose of using multiple
> addresses.  It has nothing to do with my end of the connection; it is to
> cope with servers that regard a reasonably well-behaved spider (in my
> opinion, at least) as a denial-of-service attack.  If the server operators
> would reveal what they regard as well-behaved, I wouldn't have to resort to
> this, but as one might expect, nobody wants to disclose the parameters of
> their DOS defenses.  I can't get the relevant sites to even respond to
> inquiries... and they don't even have a robots.txt file.
>
> On the other hand, the more I think about this, the less interested I am in
> bothering, since it would surely be easy for a server to block a range of
> addresses.
>
> I'm also working the other obvious solution, heuristics for the spider to
> set its own speed so that it won't trigger defenses -- but it appears
> there's something more than simple rules at the other end.  The robot wars I
> expected years ago have arrived...
>
> No lectures on what constitutes good robot behavior, please -- I've been
> operating *the* list on that subject for years
> (http://www.mccmedia.com/mailman/listinfo/robots)
>

Nick:

Have you considered that it may simply be the lack of the usual headers
produced by a browser that causes the remote servers to reject your traffic?
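
One way to test that theory is to send browser-like headers with each
request. The following is only a rough sketch using urllib.request from
current Python (urllib2 played this role in the Python of this thread's
era). The header values and the build_request helper are illustrative
assumptions, not copied from any real browser or from Nick's spider.

```python
import urllib.request

# Illustrative header values (assumptions); substitute whatever a real
# browser sends to the sites in question.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (compatible; ExampleSpider/0.1)",
    "Accept": "text/html,application/xhtml+xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en",
}

def build_request(url, extra_headers=None):
    """Return a Request for url carrying browser-like headers.

    Hypothetical helper: it only constructs the request object and does
    not touch the network.
    """
    headers = dict(BROWSER_HEADERS)
    if extra_headers:
        headers.update(extra_headers)
    return urllib.request.Request(url, headers=headers)

if __name__ == "__main__":
    req = build_request("http://www.example.com/")
    # Show the headers that would go out with the request.
    for name, value in req.header_items():
        print(f"{name}: {value}")
```

Passing the resulting object to urllib.request.urlopen() sends the request
with those headers. If the rejections stop, the missing headers were the
likely cause rather than the request rate.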

regards
-----------------------------------------------------------------------
Steve Holden                                  http://www.holdenweb.com/
Python Web Programming                        pydish.holdenweb.com/pwp/
Previous .sig file retired to                    www.homeforoldsigs.com
-----------------------------------------------------------------------





