John W. Baxter wrote:
> On 3/28/2005 12:40, "Mark Sapiro" <msapiro at value.net> wrote:
> The environment was earlier in the thread said to be unchanged.  I would ask
> Stephanie whether that environment extends to the local caching name server
> still working?  Changes to firewalling (specifically whether Ident has been
> foolishly dropped (instead of rejected) in a router or firewall, if Exim is
> set up to use it)?

The firewall I use is APF, and other than adding some ROKSO spammer IP
addresses and the Spamhaus DROP list to it about once a week, there
hadn't been any changes to it.  I'm not using ident in Exim either.
Nothing else on the server had changed or been updated except for my
normal maintenance of spam filters for Exim.  Mailman hadn't had any
changes at all for months; I installed the February 10th security
patch a few days after it came out and that's it.  I use cPanel/WHM for
managing my hosting customers, but that hadn't had any updates in the
week to ten days before the mail delivery slowdown started.

As for DNS, Bind seems to be running fine.  Nameserver issues were the
first thing I thought of, and after doing some checking I found that my
server was using external name servers hosted by my server host.  I
changed that to use my local nameserver, but there was no change to the
mail delivery performance for that list.
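
One rough way to confirm that name resolution is not the bottleneck is
to time lookups for a sample of recipient domains through the system
resolver.  The following is a minimal Python sketch, not something from
this thread: the function name time_lookups is hypothetical, and it
only times address lookups via socket.getaddrinfo, not the MX lookups
Exim itself performs.

```python
import socket
import time


def time_lookups(hosts, resolve=socket.getaddrinfo):
    """Return a list of (host, seconds, ok) tuples, one lookup per host.

    `resolve` is injectable so the timing logic can be exercised
    without touching the network.
    """
    results = []
    for host in hosts:
        start = time.perf_counter()
        try:
            # Port 25 is SMTP; only the name lookup matters here.
            resolve(host, 25)
            ok = True
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            ok = False
        results.append((host, time.perf_counter() - start, ok))
    return results
```

Calling time_lookups() with a handful of real recipient domains and
printing the results should show whether any lookup takes seconds
rather than milliseconds.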

I did uncover one bit of info today: this list is on a domain that
only hosts one other list, and that other list is also having the same
slow mail delivery.  (The domain has practically nothing else on it,
just one tiny website with very little activity, and neither list
keeps Mailman archives.)  The other list hadn't had any posts in about
a month.  A message was posted today, processed by Mailman and sent to
Exim at 16:38 EST, and it took 50 minutes for the message to go out
from Exim to the 226 members on individual mail delivery.  That's
pretty much the same delivery stats as on the other, larger list.
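
To put that figure in perspective, 50 minutes for 226 recipients can be
turned into an average time per recipient.  This is only a
back-of-the-envelope sketch; the function name is hypothetical, and it
assumes deliveries happen one after another, which may not match how
Exim is actually configured to run its queue.

```python
def seconds_per_recipient(recipients, minutes):
    """Average wall-clock seconds spent per recipient."""
    return minutes * 60 / recipients


# 226 members, 50 minutes from Exim receiving the post to the last delivery.
print(round(seconds_per_recipient(226, 50), 1))  # prints 13.3
```

Roughly 13 seconds per recipient is far slower than a healthy local
queue would normally manage, which is consistent with something
stalling on each delivery rather than a single one-off delay.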

So now I suspect it's related to that domain.  I'm going to do some
digging in its hosting account settings to see if something got set
oddly for it.  It's one of my own domains, and for those I usually
enable all options and features with very few restrictions, but I may
have mucked up something without realizing it.

Jeremy wrote:
> What's your system activity like during this 30 second pause?  If it's
> stuck in kernel or doing lots of disk I/O then I'd suspect the
> filesystem directory structure.....  Can you shut down for long enough
> to copy the exim spool to a new tree and then rename it back into place?

I thought about the spool directory.  Exim was set to
"split_spool_directory = yes" and had been ever since the server was set
up in fall of 2003.  I changed it to "no", but that made things worse
(far too many files in one folder), so I changed it back to "yes".  I'll
try your suggestion this weekend when traffic is lower, thanks much!
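
Before and after rebuilding the spool, it may help to see how many
files each spool directory actually holds.  The following is a minimal
Python sketch under stated assumptions: the function name is
hypothetical, and the path in the usage note is a guess at a common
location that should be replaced with the server's real spool path.

```python
import os


def count_spool_files(input_dir):
    """Map input_dir and every directory below it to its file count."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(input_dir):
        counts[dirpath] = len(filenames)
    return counts
```

For example, count_spool_files("/var/spool/exim/input") (path assumed)
returns a dictionary whose largest values point at the directories
most likely to be slow to scan.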


