[Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)
Scott Lambert
lambert at lambertfam.org
Fri Oct 31 21:52:34 CET 2003
On Fri, Oct 31, 2003 at 09:40:11AM -0500, Jon Carnes wrote:
> On Fri, 2003-10-31 at 09:26, Jay West wrote:
> > I'm using Mailman 2.1.2 on FreeBSD v4.8-Release, built using the port. MTA
> > is sendmail 8.12.8p1
> >
> > Very frequently I will see the ArchRunner process using 99+ % of cpu. I have
> > searched the archives and found lots of messages about qrunners using large
> > percentages of cpu, but they all seem to talk about the fixes being related
> > to actual mail processing (sendmail), not archRunner. I am assuming that if
> > the problem was mail delivery or reception I would be seeing the large cpu
> > use on a different qrunner process. My issue is specific to the archrunner
> > process which I don't find much on in the archives/faq.
> >
> Well you've pegged it. That was a bug in version 2.1.2 which is fixed
> in 2.1.3. The patch for 2.1.2 should still be available - you could
> probably patch your running system and just leave it at that (an upgrade
> will bring the patch in anyway).
I still see this problem with Mailman 2.1.3 for a high-volume list.
  PID USERNAME PRI NICE  SIZE   RES STATE  C   TIME    WCPU    CPU COMMAND
66428 mailman   64    0  168M  147M CPU1   0 376.7H  99.02% 99.02% python2.3
That's the archiver process. There are 1318 messages in the archive
queue...
12:00:28 Fri Oct 31 # truss -p 66428
break(0x114f6000) = 0 (0x0)
break(0x1302c000) = 0 (0x0)
break(0x114f8000) = 0 (0x0)
break(0x13030000) = 0 (0x0)
break(0x114fa000) = 0 (0x0)
break(0x13034000) = 0 (0x0)
break(0x114fc000) = 0 (0x0)
break(0x13038000) = 0 (0x0)
break(0x114fe000) = 0 (0x0)
break(0x1303c000) = 0 (0x0)
break(0x11500000) = 0 (0x0)
break(0x13040000) = 0 (0x0)
break(0x11502000) = 0 (0x0)
break(0x13044000) = 0 (0x0)
break(0x11504000) = 0 (0x0)
break(0x13048000) = 0 (0x0)
break(0x11506000) = 0 (0x0)
break(0x1304c000) = 0 (0x0)
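For anyone wanting to summarize a trace like the one above, here is a minimal Python sketch. It only describes the pattern of break() arguments and says nothing about the cause. The helper names (break_addresses, series_steps) are made up for illustration, and the sample is just the first few lines of the trace.

```python
import re

# Matches truss lines of the form "break(0x114f6000) = 0 (0x0)".
BREAK_RE = re.compile(r"break\((0x[0-9a-fA-F]+)\)")

def break_addresses(trace_text):
    """Return the break() arguments in the order they appear."""
    return [int(m.group(1), 16) for m in BREAK_RE.finditer(trace_text)]

def series_steps(addresses):
    """Split alternating calls into two series and return the
    distinct step sizes seen within each series."""
    def steps(series):
        return sorted({b - a for a, b in zip(series, series[1:])})
    return steps(addresses[0::2]), steps(addresses[1::2])

# First six lines of the trace, copied from the message.
SAMPLE = """\
break(0x114f6000) = 0 (0x0)
break(0x1302c000) = 0 (0x0)
break(0x114f8000) = 0 (0x0)
break(0x13030000) = 0 (0x0)
break(0x114fa000) = 0 (0x0)
break(0x13034000) = 0 (0x0)
"""

low_steps, high_steps = series_steps(break_addresses(SAMPLE))
print([hex(s) for s in low_steps], [hex(s) for s in high_steps])
```

On this sample the two interleaved series advance by 0x2000 and 0x4000 bytes per call respectively.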
Once I kill off the mailman queue runners and clean up the several lock
files for this mailing list, it runs just fine and manages to empty the
archive queue.
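A rough Python sketch of the "find the lock files for this list" step, for review before removing anything by hand. It assumes the lock files are named "<listname>.lock" plus variants with further dot-separated suffixes; that naming and the function name list_lock_files are assumptions on my part, so check them against your own locks directory. The demo below runs against a throwaway temporary directory, not a real Mailman install, and it deletes nothing.

```python
import os
import tempfile

def list_lock_files(lock_dir, list_name):
    """Return paths in lock_dir that look like lock files for list_name.

    Assumed naming: "<list_name>.lock" or "<list_name>.lock.<suffix>".
    """
    prefix = list_name + ".lock"
    matches = []
    for name in sorted(os.listdir(lock_dir)):
        if name == prefix or name.startswith(prefix + "."):
            matches.append(os.path.join(lock_dir, name))
    return matches

# Demo against a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as demo_dir:
    for fname in ("mylist.lock", "mylist.lock.host.123.0",
                  "other.lock", "mylist-announce.lock"):
        open(os.path.join(demo_dir, fname), "w").close()
    found = [os.path.basename(p) for p in list_lock_files(demo_dir, "mylist")]

print(found)
```

Matching on the exact prefix plus a dot keeps a similarly named list (such as the made-up "mylist-announce") out of the results. Stop the queue runners before touching any of the files it reports.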
Two days' worth of mailman cron jobs were still stuck in the process list.
Supposition: Maybe they were blocked by the list's lockfile?
So, it seems that the archRunner process went off the deep end somewhere
between two and three days ago.
I have the htdig patches for 2.1.3 installed, which might be germane...
--
Scott Lambert KC5MLE Unix SysAdmin
lambert at lambertfam.org