[Catalog-sig] PyPI outage

René Dudfield renesd at gmail.com
Sat Aug 4 04:22:15 CEST 2007


Hello,

I have had good luck with various throttling solutions in the past,
as well as with Apache mod_cache and a ulimit for each app.

In summary:
- throttling, with mod_cband
- caching, with mod_cache
- limiting resources of each app, with ulimit.
- protecting from bots, with mod_security


The idea with throttling is that you limit the amount of bandwidth,
and the number of connections, that each IP address (or group of IP
addresses) can use.
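
To make the per-IP idea concrete, here is a minimal Python sketch of
a token bucket keyed by client IP.  It is only an illustration of the
principle, not how mod_cband works internally: it limits request rate
rather than bandwidth or concurrent connections, and the class name
PerIPThrottle and its rate/burst parameters are made up for this
example.

```python
import time


class PerIPThrottle:
    """Hypothetical per-IP token bucket (illustration only)."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate      # tokens (requests) refilled per second
        self.burst = burst    # maximum tokens one IP can accumulate
        self.clock = clock    # injectable clock, handy for testing
        self.buckets = {}     # ip -> (tokens_left, time_of_last_request)

    def allow(self, ip):
        """Return True if this request from `ip` is within its budget."""
        now = self.clock()
        # An IP we have not seen before starts with a full bucket.
        tokens, last = self.buckets.get(ip, (self.burst, now))
        # Refill for the time elapsed, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[ip] = (tokens - 1, now)
            return True
        self.buckets[ip] = (tokens, now)
        return False
```

With rate=1.0 and burst=2, an IP can make two requests back to back
and then roughly one per second; other IPs are unaffected.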

However, there are problems with this.  The main one is that a single
IP address can have many people behind it.  Think of proxies for AOL
etc.

Also, some clients have legitimate uses for many connections: for
example, build processes at biggish companies (the zope people etc),
or conferences where 300+ people connect from the same IP address.

The other problem is that some robots use many separate IP addresses,
but that isn't the common case.

I think enabling mod_cband on the wiki, as well as enabling caching
with mod_cache for moinmoin, would help quite a lot.  Even
implementing just one of caching or bandwidth limiting would help.
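
As a rough picture of why caching helps, here is a small Python
sketch of a time-limited page cache.  It is not mod_cache and not
anything moinmoin ships; PageCache, ttl and the render callback are
hypothetical names for this example.  The point is only that repeated
hits on the same URL stop reaching the expensive rendering code until
the entry expires.

```python
import time


class PageCache:
    """Hypothetical TTL cache for rendered pages (illustration only)."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl        # seconds a rendered page stays fresh
        self.clock = clock    # injectable clock, handy for testing
        self.entries = {}     # url -> (expires_at, body)

    def get(self, url, render):
        """Return the cached body for `url`, calling render(url) on a miss."""
        now = self.clock()
        hit = self.entries.get(url)
        if hit is not None and hit[0] > now:
            return hit[1]                 # still fresh: skip rendering
        body = render(url)                # expensive call into the wiki
        self.entries[url] = (now + self.ttl, body)
        return body
```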

I can't think of many legitimate uses where people would want to
download heaps of wiki pages the way the spamming robots do.  Also,
as you say, it appears to be the wiki causing all the load at the
moment, probably from generic moinmoin spamming robots.  So it might
be best to enable mod_cband on the wiki first, rather than on pypi.
mod_cband can be enabled separately on each vhost.

Here are some good URLs to start researching bandwidth limiting
(there are many links off these pages to tutorials, howtos, articles
etc).
http://mod-cband.com/
http://gentoo-wiki.com/HOWTO_Apache_2_bandwidth_limiting



mod_security http://www.modsecurity.org/  is another option that can
help with many types of attacks.  However it can be more complex to
configure.


Another thing to do is to use ulimit to limit the resources that each
application can use.  This way, if the wiki is being abused, it can
cause less damage to the rest of the machine.  Type ulimit -a to see
what you can limit, e.g. the amount of memory used, the number of open
files etc.  Then just put some ulimit lines in the application's
startup script.  Using ulimit will not fix the problem, it will only
limit the possible damage.
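
Since moinmoin is a Python application, the same kind of limit can
also be set from inside the process with the standard library's
resource module (Unix only), instead of ulimit lines in a shell
script.  This is just a sketch: the helper name cap_soft_limit and
the value 256 are arbitrary choices for the example, not recommended
settings.

```python
import resource


def cap_soft_limit(kind, wanted):
    """Set the soft limit for `kind` to `wanted`, never above the hard limit.

    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)
    # Only the soft limit is changed; the hard limit is left alone.
    resource.setrlimit(kind, (wanted, hard))
    return resource.getrlimit(kind)


if __name__ == "__main__":
    # Roughly what "ulimit -n 256" does: cap the number of open files.
    print(cap_soft_limit(resource.RLIMIT_NOFILE, 256))
```

Memory can be capped the same way with resource.RLIMIT_AS, which
corresponds to ulimit -v.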


For moinmoin, you could probably ask on the moinmoin mailing list for
solutions to this problem, since it is probably quite common.

Cheers,

