[Mailman-Users] Optimizations

Brad Knowles brad at stop.mail-abuse.org
Fri Nov 12 11:28:33 CET 2004


At 2:42 AM -0700 2004-11-12, Tierra wrote:

>  Yes, that is 10% of mem on the first listed process, and yes, I'm
>  tight on ram right now.

	Mailman 2.1.x will have eight or nine qrunner processes 
constantly in memory, but they shouldn't be running unless they have 
actual work to do.


	Depending on what OS you're using, the system may try to keep 
everything in memory that it can, so as to make maximum use of what 
is available.  Anything that is not currently running is liable to be 
paged out in favour of other processes, filesystem/disk caching, 
etc.

	On many systems I'm familiar with, it is not at all uncommon to 
see what appears to be just a few KB "free", but on closer inspection 
you discover that most of the memory that is "used" is actually 
"cache" or "inactive", and therefore available for immediate page-out 
and re-use by other processes.


	Here's a FreeBSD 5.2.1 system I help administer:

USER       PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
mailman  53524  0.0  0.0  7928   12  ??  Ss   Wed02PM   0:00.46 mailmanctl
mailman  54142  0.0  1.5  8544 3828  ??  S    Wed02PM   0:57.26 VirginRunner
mailman  54143  0.0  0.7  7892 1844  ??  S    Wed02PM   0:55.32 CommandRunner
mailman  54144  0.0  1.7  8592 4252  ??  S    Wed02PM   1:03.93 IncomingRunner
mailman  54145  0.0  0.4  7892 1064  ??  S    Wed02PM   0:00.69 RetryRunner
mailman  54146  0.0  0.7  8328 1888  ??  S    Wed02PM   0:57.06 NewsRunner
mailman  54147  0.0  0.8  8512 2036  ??  S    Wed02PM   0:59.44 BounceRunner
mailman  54148  0.0  1.1 10180 2784  ??  S    Wed02PM   1:18.16 ArchRunner
mailman  54149  0.0  1.7  8940 4332  ??  S    Wed02PM   1:47.38 OutgoingRunner
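
	To put a number on what the qrunners hold resident, the RSS 
column above can be totalled.  This is only a minimal sketch, not 
anything Mailman ships: the function name total_rss_kb is made up for 
illustration, and it assumes the "ps" output has a header row with an 
RSS column reported in KB.  RSS counts pages shared between processes 
(the Python interpreter, shared libraries) once per process, so the 
sum is an upper bound on what the runners really cost.

```python
# The ps output from the FreeBSD box, pasted verbatim.
PS_OUTPUT = """\
USER       PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME COMMAND
mailman  53524  0.0  0.0  7928   12  ??  Ss   Wed02PM   0:00.46 mailmanctl
mailman  54142  0.0  1.5  8544 3828  ??  S    Wed02PM   0:57.26 VirginRunner
mailman  54143  0.0  0.7  7892 1844  ??  S    Wed02PM   0:55.32 CommandRunner
mailman  54144  0.0  1.7  8592 4252  ??  S    Wed02PM   1:03.93 IncomingRunner
mailman  54145  0.0  0.4  7892 1064  ??  S    Wed02PM   0:00.69 RetryRunner
mailman  54146  0.0  0.7  8328 1888  ??  S    Wed02PM   0:57.06 NewsRunner
mailman  54147  0.0  0.8  8512 2036  ??  S    Wed02PM   0:59.44 BounceRunner
mailman  54148  0.0  1.1 10180 2784  ??  S    Wed02PM   1:18.16 ArchRunner
mailman  54149  0.0  1.7  8940 4332  ??  S    Wed02PM   1:47.38 OutgoingRunner
"""


def total_rss_kb(ps_text):
    """Sum the RSS column (assumed to be in KB) over all process rows."""
    lines = [ln for ln in ps_text.splitlines() if ln.strip()]
    header = lines[0].split()
    rss_index = header.index("RSS")  # locate the column by name, not position
    return sum(int(ln.split()[rss_index]) for ln in lines[1:])


if __name__ == "__main__":
    total = total_rss_kb(PS_OUTPUT)
    print("%d KB resident (about %.1f MB)" % (total, total / 1024.0))
```

	For this listing the total comes to 22040 KB, or about 21.5 MB 
across all nine processes.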

	On this machine, the first few lines of "top" show:

last pid: 75984;  load averages:  0.00,  0.00,  0.00   up 10+08:30:45  02:13:03
79 processes:  3 running, 76 sleeping
CPU states: 14.3% user,  0.0% nice, 23.8% system,  0.0% interrupt, 61.9% idle
Mem: 132M Active, 20M Inact, 60M Wired, 6280K Cache, 34M Buf, 25M Free
Swap: 513M Total, 73M Used, 440M Free, 14% Inuse


	Here's a Debian Linux (kernel 2.4.26) machine I help administer:

USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
mailman   5130  0.0  0.1  5828  2088 ?       S    Jul06   0:00 mailmanctl
mailman   5131  2.0  1.6 54028 34896 ?       S    Jul06 3807:00 ArchRunner
mailman   5132  0.3  0.7 25740 15252 ?       S    Jul06 606:43 BounceRunner
mailman   5133  0.0  0.7 19328 15608 ?       S    Jul06  73:38 CommandRunner
mailman   5134  0.1  0.7 18696 16040 ?       S    Jul06 305:05 IncomingRunner
mailman   5135  0.0  0.3  9212  6840 ?       S    Jul06  43:38 NewsRunner
mailman   5136  2.4  1.1 25316 22816 ?       S    Jul06 4528:36 OutgoingRunner
mailman   5137  0.1  0.7 16828 14500 ?       S    Jul06 307:12 VirginRunner
mailman   5138  0.0  0.0  9624  1848 ?       S    Jul06   0:03 RetryRunner
mailman  19970  0.0  0.0 10184  1592 ?       S    Aug21   0:00 gate_news

	Top shows:

  11:16:51 up 129 days, 14:07,  1 user,  load average: 0.08, 0.18, 0.22
145 processes: 142 sleeping, 2 running, 1 zombie, 0 stopped
CPU states:  71.4% user,  33.0% system,   0.9% nice,  2333.9% idle
Mem:   2069316K total,  2020500K used,    48816K free,    53956K buffers
Swap:  1951888K total,   191316K used,  1760572K free,   935268K cached
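
	To make the "free" versus "reusable" distinction concrete, here 
is a rough sketch that adds up the categories from the two "top" 
outputs above.  The helper names are hypothetical, and which 
categories count as quickly reusable is my assumption, not something 
either OS promises: Inact + Cache + Free on FreeBSD, and free + 
buffers + cached on Linux.  Not every cached page can actually be 
dropped, so treat the results as estimates.

```python
import re

# Sizes as top prints them: a number, a K/M/G suffix, then a label.
SIZE_RE = re.compile(r"(\d+)([KMG])\s+(\w+)")
KB_PER_UNIT = {"K": 1, "M": 1024, "G": 1024 * 1024}

# Lines pasted from the two "top" outputs.
FREEBSD_MEM = "Mem: 132M Active, 20M Inact, 60M Wired, 6280K Cache, 34M Buf, 25M Free"
LINUX_MEM = "Mem:   2069316K total,  2020500K used,    48816K free,    53956K buffers"
LINUX_SWAP = "Swap:  1951888K total,   191316K used,  1760572K free,   935268K cached"


def parse_sizes(line):
    """Map each label on one line of top output to its size in KB."""
    return {label: int(num) * KB_PER_UNIT[unit]
            for num, unit, label in SIZE_RE.findall(line)}


def freebsd_reusable_kb(mem_line):
    # Assumption: Inact + Cache + Free approximates quickly reusable memory.
    m = parse_sizes(mem_line)
    return m["Inact"] + m["Cache"] + m["Free"]


def linux_reusable_kb(mem_line, swap_line):
    # Assumption: free + buffers + cached.  In this top layout "cached" is
    # printed on the Swap line even though it describes the page cache.
    m = parse_sizes(mem_line)
    s = parse_sizes(swap_line)
    return m["free"] + m["buffers"] + s["cached"]


if __name__ == "__main__":
    print("FreeBSD: %d KB" % freebsd_reusable_kb(FREEBSD_MEM))
    linux = linux_reusable_kb(LINUX_MEM, LINUX_SWAP)
    total = parse_sizes(LINUX_MEM)["total"]
    print("Linux: %d KB of %d KB (%.0f%%)" % (linux, total, 100.0 * linux / total))
```

	For the listings above this gives 52360 KB on the FreeBSD box 
and 1038040 KB, roughly half of RAM, on the Linux box, against the 
25M and 48816K that "top" reports as free.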


	Both of these machines are effectively idle at the moment, and 
yet neither of them has much memory that, at first glance, would 
appear to be free.  If you really want to find out whether you're 
tight on memory that is actively being used, with the system 
thrashing as processes fight over the same resources, you need to 
use other tools to investigate.  Two good ones are "iostat" and 
"vmstat".

	Looking at that FreeBSD machine again, vmstat shows:

% vmstat 1 20
  procs      memory      page                    disks     faults      cpu
  r b w     avm    fre  flt  re  pi  po  fr  sr da0 da1   in   sy  cs us sy id
  1 0 0  500148  49132   58   1   0   0  36  95   0   0  363    0 317  2  1 97
  0 0 0  500148  49132    5   0   0   0   5   0   0   0  365    0 304  0  4 96
  0 0 0  500148  49132    0   0   0   0   1   0   0   0  357    0 291  0  2 98
  0 0 0  500148  49132    0   0   0   0   0   0   0   0  358    0 284  0  2 98
  0 0 0  500148  49132    0   0   0   0   0   0   0   0  357    0 284  1  2 98
  0 0 0  500148  49132    0   0   0   0   0   0   0   0  361    0 298  1  2 97
  0 0 0  500148  49132    0   0   0   0   0   0   0   0  369    0 312  0  2 98
  0 0 0  500148  49132    0   0   0   0   0   0   3   0  369    0 341  0  2 98
  0 0 0  500148  49132    0   0   0   0   0   0   0   0  364    0 296  0  2 98
  0 0 0  500148  49132    0   0   0   0   0   0   0   0  357    0 287  0  2 98
  0 0 0  500148  49132    0   0   0   0   0   0   0   0  361    0 301  2  2 97
  0 0 0  500148  49132    0   0   0   0   4   0   9   0  386    0 344  1  2 98
  0 0 0  500148  49132    0   0   0   0   0   0   0   0  367    0 301  1  4 95
  0 0 0  500148  49132    0   0   0   0   0   0   0   0  360    0 288  1  3 96
  0 0 0  500148  49132    0   0   0   0   0   0   0   0  357    0 286  0  2 98
  0 0 0  500148  49132    0   0   0   0   0   0   0   0  361    0 301  0  2 98
  0 0 0  500148  49132    0   0   0   0   0   0   2   0  368    0 312  1  2 98
  2 0 0  500148  49132    0   0   0   0   1   0   0   0  358    0 290  2  2 96
  2 0 0  500148  49132    0   0   0   0   0   0   0   0  357    0 285  1  3 96
  2 0 0  500148  49132    0   0   0   0   0   0   0   0  358    0 289  0  3 97

	Looking at the Linux box, vmstat shows:

% vmstat 1 20
    procs                      memory    swap          io     system         cpu
  r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
  0  0  0 191316  51020  54252 936864   0   0    14    15    9    19   2  10  18
  0  0  0 191316  50940  54280 936880   0   0     4   256  251   329   0   0  99
  0  0  0 191316  50828  54284 936900   0   0     8     0  263   422   0   0  99
  0  0  0 191316  50780  54288 936916   0   0     8     0  241   414   0   1  99
  0  0  0 191316  50016  54292 936932   0   0     4     0  216   345   0   1  99
  0  0  0 191316  49244  54292 936940   0   0     0     0  195   248   0   0 100
  0  0  0 191316  50588  54316 936956   0   0     0  1160  300   644   3   2  95
  0  0  0 191316  50516  54328 936968   0   0     8    64  222   252   1   0  99
  0  0  0 191316  50360  54344 936996   0   0    24     0  236   324   0   1  99
  0  0  0 191316  49708  54352 936980   0   0    16     0  246   433   1   0  98
  0  0  0 191316  50364  54360 936992   0   0    12     0  315   466   0   1  99
  0  0  0 191316  48820  54376 937004   0   0     0   272  220   314   0   1  99
  0  0  0 191316  49112  54380 936936   0   0     4     0  225   343   1   0  99
  0  0  0 191316  50836  54388 936920   0   0     4     0  206   304   3   1  96
  0  0  0 191316  50772  54388 936932   0   0     0     0  174   171   0   0 100
  0  1  0 191316  50728  54392 936944   0   0    12     0  237   435   0   0 100
  0  0  0 191316  50696  54412 936956   0   0     0   220  193   221   0   1  99
  0  0  0 191316  50640  54416 936968   0   0     8     0  187   120   0   0 100
  0  0  0 191316  49928  54416 936984   0   0     0     0  214   362   1   0  99
  0  0  0 191316  50576  54416 936992   0   0     0     0  215   344   0   0  99


	For the FreeBSD box, look at the columns for "pi" (page in) and 
"po" (page out).  This machine isn't doing any paging at all, which 
means that there is no memory pressure.  It may appear to be short on 
memory, but that's only because the system is keeping everything in 
memory that it can, and it hasn't needed to page out anything that 
it currently has loaded.  You can also look at the columns for "fr" 
(pages freed per second) and "sr" (pages scanned by the clock 
algorithm per second).  Both fields show that this system is doing 
very little in either of these categories, which confirms the 
conclusions drawn from the pi/po columns.

	For the Linux box, the corresponding columns are "si" (swap in) 
and "so" (swap out).  Both are zero in every sample, so that machine 
isn't under memory pressure either.
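
	If you'd rather not eyeball the columns, the same check can be 
scripted.  The sketch below is only an illustration under some 
assumptions: the function names are made up, it expects "vmstat" 
output pasted in as text, and it finds the paging columns by their 
header names ("pi"/"po" as on the FreeBSD box, or "si"/"so" as the 
Linux vmstat above labels them).  Keep in mind that the first row 
vmstat prints is an average since boot, so a real check might want 
to skip it.

```python
# First three samples from each of the vmstat runs above.
FREEBSD_VMSTAT = """\
  procs      memory      page                    disks     faults      cpu
  r b w     avm    fre  flt  re  pi  po  fr  sr da0 da1   in   sy  cs us sy id
  1 0 0  500148  49132   58   1   0   0  36  95   0   0  363    0 317  2  1 97
  0 0 0  500148  49132    5   0   0   0   5   0   0   0  365    0 304  0  4 96
  0 0 0  500148  49132    0   0   0   0   1   0   0   0  357    0 291  0  2 98
"""

LINUX_VMSTAT = """\
    procs                      memory    swap          io     system         cpu
  r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
  0  0  0 191316  51020  54252 936864   0   0    14    15    9    19   2  10  18
  0  0  0 191316  50940  54280 936880   0   0     4   256  251   329   0   0  99
  0  0  0 191316  50828  54284 936900   0   0     8     0  263   422   0   0  99
"""


def paging_samples(vmstat_text, in_col="pi", out_col="po"):
    """Return one (page_in, page_out) tuple per data row of vmstat output."""
    samples = []
    header = None
    for line in vmstat_text.splitlines():
        fields = line.split()
        if in_col in fields and out_col in fields:
            header = fields  # column-name row; vmstat may repeat it
            continue
        if header is None or len(fields) != len(header):
            continue  # group-title row, blank line, or anything unexpected
        if not all(f.isdigit() for f in fields):
            continue
        samples.append((int(fields[header.index(in_col)]),
                        int(fields[header.index(out_col)])))
    return samples


def shows_paging(samples):
    """True if any sample has non-zero page-in or page-out activity."""
    return any(pin or pout for pin, pout in samples)


if __name__ == "__main__":
    print("FreeBSD paging:", shows_paging(paging_samples(FREEBSD_VMSTAT)))
    print("Linux paging:", shows_paging(paging_samples(LINUX_VMSTAT, "si", "so")))
```

	Both checks come back False for the samples shown, which matches 
what the columns say when read by hand.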


	To go any further into this topic, you really have to know more 
about your OS and how to do proper performance monitoring, analysis, 
and tuning for it.  Of course, that is really beyond the scope of 
this mailing list.

-- 
Brad Knowles, <brad at stop.mail-abuse.org>

"Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety."

     -- Benjamin Franklin (1706-1790), reply of the Pennsylvania
     Assembly to the Governor, November 11, 1755

   SAGE member since 1995.  See <http://www.sage.org/> for more info.


