[BangPypers] Mailman archives analysis

Dorai Thodla dorai at thodla.com
Thu Jul 17 04:49:56 CEST 2008


Maybe slightly unrelated, but inspired by this discussion.

How about a tag cloud for mailman archives?  In fact, we can have several
tag clouds: one for posters, one for subjects and one for terms. This
can be an alternate view of the mailing list, and we can parameterize it
by time period (current month, current year etc.) and by number of tags (top
20, 50 etc.).
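
As a rough illustration of the idea, here is a minimal Python sketch of counting
tags with a time window and a top-N limit. The message tuples, the function
names (top_tags, poster_tags, subject_tags, term_tags) and the "longer than 3
characters" rule for terms are all assumptions made up for this example, not
part of any existing tool.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records: (poster, subject, date). Real records would
# come from whatever code parses the archive.
MESSAGES = [
    ("alice", "Mailman archives analysis", datetime(2008, 7, 16)),
    ("bob", "Re: Mailman archives analysis", datetime(2008, 7, 17)),
    ("alice", "Tag cloud idea", datetime(2008, 7, 17)),
    ("carol", "Old thread", datetime(2007, 3, 1)),
]


def poster_tags(msg):
    """One tag per message: the poster."""
    return [msg[0]]


def subject_tags(msg):
    """One tag per message: the subject without a leading 'Re: '."""
    subject = msg[1]
    if subject.lower().startswith("re: "):
        subject = subject[4:]
    return [subject]


def term_tags(msg):
    """One tag per subject word longer than 3 characters (crude stop-word filter)."""
    return [word.lower() for word in msg[1].split() if len(word) > 3]


def top_tags(messages, key, since=None, limit=20):
    """Count the tags key(msg) yields for messages dated on or after `since`."""
    counts = Counter()
    for msg in messages:
        if since is not None and msg[2] < since:
            continue  # outside the requested time window
        counts.update(key(msg))
    return counts.most_common(limit)


if __name__ == "__main__":
    july = datetime(2008, 7, 1)
    print(top_tags(MESSAGES, poster_tags, since=july, limit=20))
    print(top_tags(MESSAGES, subject_tags, since=july, limit=20))
    print(top_tags(MESSAGES, term_tags, since=july, limit=50))
```

The counts can then be mapped to font sizes by whatever renders the cloud.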

If we create two modules (a ListAnalyzer and a Tag Cloud Generator) and
couple them, this can be used for other discussion groups too by
swapping the ListAnalyzer (one for mailman, one for yahoo groups etc.).
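
One possible shape for that split, sketched in Python. Only the name
ListAnalyzer comes from the text above; the class names MboxListAnalyzer,
StaticListAnalyzer and TagCloudGenerator, the messages() method and the 1 to 5
size scale are assumptions for illustration. The mbox variant assumes the
archive has already been downloaded as an mbox file and uses the standard
library's mailbox module.

```python
import mailbox
from abc import ABC, abstractmethod
from collections import Counter
from email.utils import parseaddr


class ListAnalyzer(ABC):
    """Source-specific half: knows how to read one kind of archive."""

    @abstractmethod
    def messages(self):
        """Yield (poster, subject) pairs."""


class MboxListAnalyzer(ListAnalyzer):
    """Reads a local mbox file, e.g. a downloaded monthly archive."""

    def __init__(self, path):
        self.path = path

    def messages(self):
        for msg in mailbox.mbox(self.path):
            # Archives may obfuscate addresses ("user at host"), so the
            # parsed name can be rough; fall back to the address part.
            name, addr = parseaddr(str(msg.get("From", "")))
            yield (name or addr, str(msg.get("Subject", "")))


class StaticListAnalyzer(ListAnalyzer):
    """Stand-in for any other source: serves pairs held in memory."""

    def __init__(self, pairs):
        self.pairs = list(pairs)

    def messages(self):
        return iter(self.pairs)


class TagCloudGenerator:
    """Source-independent half: turns counts into (tag, size) pairs."""

    def __init__(self, analyzer, limit=20, levels=5):
        self.analyzer = analyzer
        self.limit = limit
        self.levels = levels

    def poster_cloud(self):
        counts = Counter(poster for poster, _ in self.analyzer.messages())
        pairs = counts.most_common(self.limit)
        if not pairs:
            return []
        biggest = pairs[0][1]
        # Scale each count to a size between 1 and `levels`.
        return [(tag, max(1, round(self.levels * n / biggest)))
                for tag, n in pairs]
```

Supporting another kind of group would then mean writing one more ListAnalyzer
subclass and leaving the generator untouched.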

Dorai
www.thodla.com

On Thu, Jul 17, 2008 at 8:02 AM, Jeff Rush <jeff at taupro.com> wrote:

> Anand Balachandran Pillai wrote:
>
>> Hi Pypers,
>>
>>         Is there any open source tool for analyzing mailman archives ?
>> I want to analyze our mailman archives and then find out the following
>> information.
>>
>>  - Total number of messages
>> - Total number of threads (conversations)
>> - Total number of unique posters
>> -  Maximum size of a thread
>> -  Top 5 posters
>> -  Top 5 threads (in terms of size)
>>
>> Are you aware of any tool (preferably Python) which does this ? The
>> tool should be client-side, taking the URL to the mailman archives
>> page as the only input.
>>
>> If there is nothing like this, perhaps I could think of writing one. It
>> would be useful  I guess...
>>
>
> I'm not aware of any such tool but it would be quite useful.  If you
> produce a library for obtaining the data, I would then hook it into the
> rrdtool (round-robin database) and produce graphs of traffic on various
> mailing lists.  This would help identify growth rates, when to split a list,
> dying lists, etc. which can help others to manage better.  Having a "top 5
> posters" and "top 5 threads" would be useful on the front page of many
> usergroup websites to encourage others to join in.
>
> I would agree that it should be client-side since not all archive sites
> would update Mailman just to use it.  It also should cache data and not
> re-fetch "finished" (i.e. prior months) list archives it has already
> analyzed.  It should not, of course, keep a complete copy of the archive,
> just a summary, by interval of time like month.  Keep the data in SQLite or
> shelve, to keep database needs lightweight for easier integration with
> anyone's choice of web engine.
>
> "Mailwatcher" is born?
>
> -Jeff
>
> _______________________________________________
> BangPypers mailing list
> BangPypers at python.org
> http://mail.python.org/mailman/listinfo/bangpypers
>
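
For reference, the statistics asked for in the quoted message, plus the
per-month summary cache Jeff suggests, could be sketched like this in Python.
Everything here is an assumption for illustration: the function names, the
table layout, and above all the threading rule, which simply groups messages
by normalized subject rather than following In-Reply-To/References headers.

```python
import re
import sqlite3
from collections import Counter


def normalize_subject(subject):
    """Drop [list] tags and leading Re:/Fwd: so replies group with the original."""
    subject = re.sub(r"\[[^\]]*\]\s*", "", subject)
    subject = re.sub(r"^(?:(?:re|fwd?)\s*:\s*)+", "", subject.strip(), flags=re.I)
    return subject.strip().lower()


def summarize(messages, top=5):
    """messages: iterable of (poster, subject) pairs."""
    posters = Counter()
    threads = Counter()
    for poster, subject in messages:
        posters[poster] += 1
        threads[normalize_subject(subject)] += 1
    return {
        "messages": sum(posters.values()),
        "threads": len(threads),
        "posters": len(posters),
        "max_thread": max(threads.values(), default=0),
        "top_posters": posters.most_common(top),
        "top_threads": threads.most_common(top),
    }


def cached_summary(db, month, fetch):
    """Return (messages, threads, posters) for `month`.

    `fetch` is only called when the month is not yet stored, so a finished
    month is downloaded and analyzed once. Only the summary is kept, not
    the messages themselves.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS summary "
        "(month TEXT PRIMARY KEY, messages INT, threads INT, posters INT)"
    )
    row = db.execute(
        "SELECT messages, threads, posters FROM summary WHERE month = ?",
        (month,),
    ).fetchone()
    if row is not None:
        return row
    stats = summarize(fetch())
    row = (stats["messages"], stats["threads"], stats["posters"])
    db.execute("INSERT INTO summary VALUES (?, ?, ?, ?)", (month, *row))
    db.commit()
    return row
```

The current month is still changing, so a caller would use summarize()
directly for it and reserve cached_summary() for months that are over.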



-- 
Dorai Thodla (http://www.thodla.com)
Thinking about Technology Innovation and Learning
My DailyLog (http://dorai.tumblr.com/) - Stuff worth remembering
US: 650-206-2688, India: 98408 89258