[Mailman-Developers] listmembers being "(ignored)", lockfile problems, 2.0b5

Wed, 30 Aug 2000 12:17:16 -0400

Greetings.

I realize this is the developer list, so I'll explain why I'm posting 
here and not there.

I already posted there.

I also searched through my own archives.  I couldn't search the 
list archives because the search engine on python.org appears to be 
broken (and yes, I followed the instructions, choosing only SIG 
archives).

I found a bunch of posts related to my problems, but not much in the 
way of resolutions.  So, I emailed the list with my particular flavor 
of the problem, and I have not received an answer back yet.

My general impression is that the mailman-users list is -very- 
helpful for well-known, solved issues.  In one case, I received a 
helpful reply back within minutes when I sent a message out at 4am 
EST.

However, for anything new, the users list is not much help, 
so I'm coming here.  My hope is that this information will also help 
with development of b6.

The setup we're running is Solaris, with Python 1.5.2, which I 
compiled myself.  The list was working great until people started 
losing mail delivery.  I thought the auto-bounce handler was removing 
people, but they're still listed as subscribed, and I never received a 
message from the bounce handler saying they would be removed.

Now, what I'm seeing in the logs:

in logs/smtp-failure:

smtp-failure:Aug 30 11:16:15 2000 (122) -1 <email addr> (ignore)
(many, many times)

This user is in fact not getting email from either of the two lists 
he belongs to.  The number in parentheses changes, though not with 
every entry in the logs.
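
To see which addresses these entries pile up on, a tally along the 
following lines might help.  This is only a sketch: the line shape is 
assumed from the single entry quoted above, and the sample lines and 
example.com addresses are made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape, based only on the one entry quoted above:
#   <timestamp> (<number>) <code> <address> (<note>)
LINE_RE = re.compile(r"\((\d+)\)\s+(-?\d+)\s+(\S+)\s+\((\w+)\)\s*$")


def tally_failures(lines):
    """Count smtp-failure entries per recipient address."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[match.group(3)] += 1
    return counts


# Hypothetical sample data; the addresses are placeholders.
sample = [
    "Aug 30 11:16:15 2000 (122) -1 alice@example.com (ignore)",
    "Aug 30 11:16:15 2000 (122) -1 bob@example.com (ignore)",
    "Aug 30 11:17:20 2000 (131) -1 alice@example.com (ignore)",
]

counts = tally_failures(sample)
for address, count in counts.most_common():
    print(address, count)
```

Lines that do not match the assumed shape are skipped rather than 
counted, so the totals are a lower bound.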

and, in logs/post, I see constant entries about:
Aug 30 04:53:08 2000 (23571) post to <listname> from 
<listname>-request@<hostname>, size=28501, 177 failures

(always the same number of failures and the same size; this looks 
like an unrelated mail loop of some kind.)

and there are posts that go through:

Aug 30 04:44:42 2000 (23411) post to <listname> from <some 
subscriber's email address>, size=528, success

Qrunner seems to be working:
Aug 30 02:55:04 2000 (22304) qrunner begining
Aug 30 02:55:09 2000 (22304) qrunner ended

(this is repeated every time qrunner is launched, which is wasteful 
in terms of space since it runs once a minute; I've got 56k of text 
for less than 12 hours of running)

Now, this mystery "177 failures" message shows up again:
Aug 30 06:50:10 2000 (25068) All recipients refused: please run connect() first
Aug 30 06:50:10 2000 (25068) smtp for 177 recips, completed in 2.620 seconds
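
The "please run connect() first" wording matches the text that 
Python's smtplib attaches to SMTPServerDisconnected when a command is 
sent on an object that has no open connection.  The sketch below 
reproduces that with a current Python; whether Mailman 2.0b5 on Python 
1.5.2 reaches the same code path is an assumption, not something the 
logs confirm.

```python
import smtplib


def probe_unconnected():
    """Return the error text smtplib gives when no connection exists."""
    client = smtplib.SMTP()  # no host given, so no connection is opened
    try:
        client.noop()  # any command needs a socket to write to
    except smtplib.SMTPServerDisconnected as exc:
        return str(exc)
    return None


message = probe_unconnected()
print(message)
```

If that is the cause, the failure happens before any recipient is 
offered to the mail server, which would fit every one of the 177 
recipients being refused at once.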

In logs/bounce, I'm not sure I understand what these entries mean:

Aug 29 23:13:31 2000 (18077) <listname>: <someuser> - 27 more allowed 
over 426977 secs

then there will be:

Aug 29 23:13:33 2000 (18077) <listname>: <differentuser> - 0 more 
allowed over 395006 secs
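
For scale, the "secs" figures convert to a little under five days 
each.  A small sketch of the arithmetic (the helper name is just for 
illustration, and this says nothing about what the bounce handler 
does with the numbers):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def secs_to_days(seconds):
    """Convert a count of seconds to fractional days."""
    return seconds / SECONDS_PER_DAY


# The two values quoted from logs/bounce above.
for secs in (426977, 395006):
    print(secs, "secs is about", round(secs_to_days(secs), 2), "days")
```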

The non-delivery problem affects digest and non-digest users. 
Archives are working fine.  One list is 1,500 people; 500 normal, 
1000 digest users; the other is about 100 people, mostly non-digest. 
We're using sendmail, and I have threaded delivery turned on because 
of the large number of slow or bad mail servers (experience has shown 
that splitting the recipients into chunks results in a dramatic 
improvement in delivery times).
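
By "splitting them up into chunks" I mean something like the 
following.  This is a minimal sketch of the idea only, not Mailman's 
delivery code; the function name, the chunk size of 50, and the 
placeholder addresses are all assumptions for illustration.

```python
def chunk_recipients(recipients, chunk_size):
    """Split a recipient list into consecutive chunks of at most chunk_size."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        recipients[start:start + chunk_size]
        for start in range(0, len(recipients), chunk_size)
    ]


# Hypothetical recipient list; 177 matches the count seen in the logs.
recipients = ["user%d@example.com" % n for n in range(177)]
chunks = chunk_recipients(recipients, 50)
print(len(chunks), [len(chunk) for chunk in chunks])
```

Each chunk can then be handed to the mail server separately, so one 
slow destination holds up only its own chunk.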

Thanks all for any help/tips.  I wouldn't mind applying some patches 
that have been introduced since 2.0b5 that will fix this stuff, if 
they're available.

Brett