[BangPypers] Mailman archives analysis

Jeff Rush jeff at taupro.com
Thu Jul 17 04:32:18 CEST 2008

Anand Balachandran Pillai wrote:
> Hi Pypers,
> Is there any open source tool for analyzing mailman archives?
> I want to analyze our mailman archives and then find out the following
> information.
> - Total number of messages
> - Total number of threads (conversations)
> - Total number of unique posters
> - Maximum size of a thread
> - Top 5 posters
> - Top 5 threads (in terms of size)
> Are you aware of any tool (preferably Python) which does this? The
> tool should be client-side, taking the URL to the mailman archives
> page as the only input.
> If there is nothing like this, perhaps I could think of writing one. It
> would be useful I guess...

I'm not aware of any such tool but it would be quite useful.  If you produce a 
library for obtaining the data, I would then hook it into the rrdtool 
(round-robin database) and produce graphs of traffic on various mailing lists. 
This would help identify growth rates, dying lists, and when to split a list,
which can help others manage their lists better.  Having "top 5 posters" and
"top 5 threads" on the front page of many usergroup websites would be useful to
encourage others to join in.
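
As a rough illustration of the "library for obtaining the data" idea, here is a
minimal sketch that computes the statistics from the original request over any
iterable of parsed messages.  Everything in it is an assumption for the sake of
example: the function names (summarize, normalize_subject, poster_of) are
hypothetical, threads are approximated by grouping on the subject line with
"Re:" and "[ListName]" prefixes stripped (not by In-Reply-To/References), and
the "user at host" de-obfuscation is a heuristic that can misfire on display
names containing " at ".

```python
import re
from collections import Counter
from email import message_from_string
from email.utils import parseaddr

# Heuristic: strip leading "Re:" / "Fw:" / "Fwd:" markers and "[ListName]" tags.
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*|\[[^\]]*\]\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Reduce a subject line to a key shared by all messages in a thread."""
    return _PREFIX.sub("", subject or "").strip().lower()


def poster_of(msg):
    """Return a lowercase identifier (address if available) for the sender."""
    raw = msg.get("From") or ""
    # Archives often obfuscate addresses as "user at host (Real Name)".
    raw = re.sub(r"^\s*(\S+) at (\S+)", r"\1@\2", raw, count=1)
    name, addr = parseaddr(raw)
    return (addr or name).lower()


def summarize(messages, top_n=5):
    """Count messages, subject-based threads, and posters in one pass."""
    posters = Counter()
    threads = Counter()
    total = 0
    for msg in messages:
        total += 1
        posters[poster_of(msg)] += 1
        threads[normalize_subject(msg.get("Subject"))] += 1
    return {
        "messages": total,
        "threads": len(threads),
        "unique_posters": len(posters),
        "max_thread_size": max(threads.values(), default=0),
        "top_posters": posters.most_common(top_n),
        "top_threads": threads.most_common(top_n),
    }
```

If a month's archive has been downloaded as an mbox file, something like
summarize(mailbox.mbox(path)) from the standard library's mailbox module should
be able to feed it, though real archives may need extra cleanup first.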

I would agree that it should be client-side since not all archive sites would 
update Mailman just to use it.  It also should cache data and not re-fetch 
"finished" (i.e. prior months) list archives it has already analyzed.  It 
should not, of course, keep a complete copy of the archive, just a summary per 
interval of time, such as a month.  Keep the data in SQLite or shelve, to keep 
database needs lightweight for easier integration with anyone's choice of web 
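
The caching idea above could be sketched roughly as follows, using SQLite from
the standard library.  This is only one possible shape, under stated
assumptions: the table name monthly_summary, the functions open_cache,
is_finished and get_summary, and the fetch callback (which would do the actual
download and analysis and return a (messages, threads, posters) tuple) are all
hypothetical.  Only summaries are stored, and only for months that are already
over, so the current month is always re-fetched.

```python
import sqlite3
from datetime import date


def open_cache(path=":memory:"):
    """Open (or create) the summary cache; ':memory:' is handy for trying it out."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS monthly_summary (
               list_name TEXT NOT NULL,
               month     TEXT NOT NULL,   -- 'YYYY-MM'
               messages  INTEGER,
               threads   INTEGER,
               posters   INTEGER,
               PRIMARY KEY (list_name, month))"""
    )
    return db


def is_finished(month, today=None):
    """A 'YYYY-MM' month is finished once the calendar has moved past it."""
    today = today or date.today()
    return month < today.strftime("%Y-%m")


def get_summary(db, list_name, month, fetch, today=None):
    """Return (messages, threads, posters), fetching only when not cached."""
    row = db.execute(
        "SELECT messages, threads, posters FROM monthly_summary "
        "WHERE list_name = ? AND month = ?",
        (list_name, month),
    ).fetchone()
    if row is not None:
        return row
    summary = tuple(fetch(list_name, month))
    # Cache only finished months; the current month can still change.
    if is_finished(month, today):
        db.execute(
            "INSERT INTO monthly_summary VALUES (?, ?, ?, ?, ?)",
            (list_name, month) + summary,
        )
        db.commit()
    return summary
```

Keying on (list_name, month) keeps one small row per list per month, which is
also a convenient granularity for feeding a grapher later.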

"Mailwatcher" is born?

