[Mailman-Users] qrunner locks timestamped in the future?

Jo Brooks jobrooks at us.dhl.com
Thu Aug 30 02:15:30 CEST 2001


I'm going to reply to my own message, because I figured out part of
the problem, but not all of it.

The messages weren't being sent out because there appears to have
been some sort of conflict between message archiving and message
delivery.

I'd thought that I'd turned off archiving on the high traffic lists.
Turns out I was mistaken  :)  The only reason I found this was because
I searched for *lock* in all of /home/mailman, and found some in the
archives directory for this list.
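
A minimal sketch of that kind of search in Python, for anyone who wants
to repeat it. The function name find_lock_files is made up for
illustration and is not part of Mailman; /home/mailman is simply the
install prefix on this server.

```python
import fnmatch
import os


def find_lock_files(root, pattern="*lock*"):
    """Return sorted paths under root whose file name matches pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Match on the file name only, like a shell glob would.
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Assumed install prefix; adjust for your own site.
    for path in find_lock_files("/home/mailman"):
        print(path)
```

Anything it prints outside the locks directory (for example under the
archives directory) is worth a closer look.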

The August archive for this particular list was at 23 MB and growing;
each new message was being posted to the archive, but not getting
mailed out to the list members; they were staying in qfiles.

Turns out this was also the cause of the admin webpages for this list
either being slow or timing out...once I repeatedly refused to take
"server not responding" for an answer and finally was able to turn
archiving off, the web page response began to improve, and all the
backlogged messages zipped on their merry way.

I still don't know what was/is causing the qrunner.lock file to be
timestamped with tomorrow's date.
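
To watch for this without eyeballing ls output, here is a rough sketch
that reports files whose modification time is later than the system
clock. The function name future_timestamps is hypothetical, and the
locks path is the one on this server. It only detects the condition; it
does not explain or fix it. As far as I know, ls -l prints the year
instead of the time of day for future-dated files, which is why the
sketch reports the offset in seconds.

```python
import os
import time


def future_timestamps(directory, now=None):
    """Return (name, seconds_ahead) for files whose mtime is later than now."""
    if now is None:
        now = time.time()
    ahead = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        # Positive delta means the file is dated ahead of the clock.
        delta = os.stat(path).st_mtime - now
        if delta > 0:
            ahead.append((name, delta))
    return ahead


if __name__ == "__main__":
    # Assumed lock directory; adjust for your own site.
    for name, delta in future_timestamps("/home/mailman/locks"):
        print("%s is %.0f seconds ahead of the clock" % (name, delta))
```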

Are large archives a problem for Mailman?  I'd thought there were
many high-traffic lists out there that would generate archives of
this size...or not?


jojo


> First, the particulars: Solaris 7, Python 2.1, mailman 2.0.6
> 
> I have recently encountered a very strange problem.  It started today.
> 
> First symptom: this one list wasn't sending out messages.
> Second symptom: I was no longer able to access the admin webpages.
> Sometimes I could log in and check the configs, most of the time I
> could not.  The pages would hang.
> 
> So I thought it was a permission problem, but check_perms said all
> was well.
> 
> As far as I can tell, this started sometime this morning, based on
> all the messages in the qfiles directory for this list.
> 
> After I went on to check the qrunner and the locks, I discovered
> something strange.  The timestamps in the lock directory are dated
> ahead of current time.
> 
> root@lists:/home/mailman/locks #> date
> Wed Aug 29 14:44:01 MST 2001
> 
> root@lists:/home/mailman/locks #> ls -l
> total 20
> -rw-rw-r--   2 mailman  mailman       44 Aug 30  2001 qrunner.lock
> -rw-rw-r--   2 mailman  mailman       44 Aug 30  2001 qrunner.lock.lists.19707
> -rw-rw-r--   2 mailman  mailman       48 Aug 29  2001 nnnnn_staff.lock
> -rw-rw-r--   2 mailman  mailman       48 Aug 29  2001 nnnnn_staff.lock.lists.19707
> -rw-rw-r--   1 nobody   mailman       48 Aug 29  2001 nnnnn_staff.lock.lists.19757
> -rw-rw-r--   1 nobody   mailman       48 Aug 29  2001 nnnnn_staff.lock.lists.19771
> -rw-rw-r--   1 nobody   mailman       48 Aug 29  2001 nnnnn_staff.lock.lists.19785
> -rw-rw-r--   1 nobody   mailman       48 Aug 29  2001 nnnnn_staff.lock.lists.19798
> -rw-rw-r--   1 nobody   mailman       48 Aug 29  2001 nnnnn_staff.lock.lists.19801
> -rw-rw-r--   1 nobody   mailman       48 Aug 29  2001 nnnnn_staff.lock.lists.19871
> 
> I modified the timestamps manually, but....
> 
> root@lists:/home/mailman/locks #> touch -am *
> root@lists:/home/mailman/locks #> ls -l
> total 20
> -rw-rw-r--   2 mailman  mailman       44 Aug 29 14:44 qrunner.lock
> -rw-rw-r--   2 mailman  mailman       44 Aug 29 14:44 qrunner.lock.lists.19707
> -rw-rw-r--   2 mailman  mailman       48 Aug 29 14:44 nnnnn_staff.lock
> -rw-rw-r--   2 mailman  mailman       48 Aug 29 14:44 nnnnn_staff.lock.lists.19707
> -rw-rw-r--   1 nobody   mailman       48 Aug 29 14:44 nnnnn_staff.lock.lists.19757
> -rw-rw-r--   1 nobody   mailman       48 Aug 29 14:44 nnnnn_staff.lock.lists.19771
> -rw-rw-r--   1 nobody   mailman       48 Aug 29 14:44 nnnnn_staff.lock.lists.19785
> -rw-rw-r--   1 nobody   mailman       48 Aug 29 14:44 nnnnn_staff.lock.lists.19798
> -rw-rw-r--   1 nobody   mailman       48 Aug 29 14:44 nnnnn_staff.lock.lists.19801
> -rw-rw-r--   1 nobody   mailman       48 Aug 29 14:44 nnnnn_staff.lock.lists.19871
> 
> but within a few minutes, the timestamps go out of whack again.
> 
> This list is the only one that's not working...all the other lists
> on this server (about 25) are behaving just fine.  And the qrunner
> does manage to continue to timestamp itself oddly.  And every now
> and then, I'll see a qrunner process that's several minutes old.
> I saw one earlier today that was several hours old.  Killing the
> hung qrunner doesn't seem to help much.
> 
> I'm tempted to remove the list and recreate it, but I don't know
> what that would do to the messages they've missed...if this causes
> the messages to be lost, I don't want to do that.
> 
> Any ideas?  This is quickly becoming urgent.
> 
> ----
> JoJo Brooks
> DHL Worldwide Express



