[BangPypers] Mailman archives analysis

Anand Balachandran Pillai abpillai at gmail.com
Thu Jul 17 08:41:21 CEST 2008


Another important reason for wanting this to be client-side:

 o We don't tie it to a specific mailing-list back end like mailman.

--Anand

On Thu, Jul 17, 2008 at 12:06 PM, Anand Balachandran Pillai
<abpillai at gmail.com> wrote:
> Not really. I said it should be a client-side tool because:
>
>  o Like Jeff said, not all archives would want to update server-side
>    mailman just to use it.
>  o Being server-side introduces additional things to worry about, like
>    authentication, if we don't integrate with mailman and use it as a
>    kind of plugin on the server side.
>  o Last but most important, it introduces a learning curve: you have to
>    learn and understand mailman before the tool can be used. So the
>    effectiveness of the tool is limited to mailman administrators. I am
>    thinking of a tool which anyone can use by just analyzing the
>    mailman archive web pages as a client.
>
> Looking at it from that perspective, the features would be:
>
>  o Client-side tool
>  o Input is a mailman archive web page - either the complete archive view
>     or a monthly view
>  o Output is written to an sqlite database.
>  o Tag support would be great - this can be done by finding keywords
>     in the top conversations and creating a tag cloud based on the
>     frequencies of these keywords.
>
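
A rough sketch of the tag idea above, in case it helps the discussion. It
counts keywords in subject lines and writes the frequencies to sqlite. The
names keyword_counts, store_tags, STOP_WORDS and the "tags" table are all
made up for illustration, and the subject lines are sample input, not real
archive data.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical stop-word list; a real tool would want a much larger one.
STOP_WORDS = {"re", "fwd", "the", "a", "an", "for", "to", "of", "in", "on", "and", "is"}


def keyword_counts(subjects):
    """Count keyword frequencies across a list of subject lines."""
    counts = Counter()
    for subject in subjects:
        # Drop any bracketed list tag such as "[BangPypers]".
        subject = re.sub(r"\[[^\]]*\]", " ", subject)
        for word in re.findall(r"[a-z]+", subject.lower()):
            # Skip stop words and very short tokens such as "vs".
            if word not in STOP_WORDS and len(word) > 2:
                counts[word] += 1
    return counts


def store_tags(conn, counts):
    """Write keyword frequencies to an assumed 'tags' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tags (keyword TEXT PRIMARY KEY, frequency INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO tags VALUES (?, ?)", counts.items())
    conn.commit()


# Sample subject lines, for illustration only.
subjects = [
    "[BangPypers] Mailman archives analysis",
    "Re: [BangPypers] Mailman archives analysis",
    "[BangPypers] Pyparsing vs regexp",
]

conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
counts = keyword_counts(subjects)
store_tags(conn, counts)

# The most frequent keywords are the candidates for the tag cloud.
top_tags = conn.execute(
    "SELECT keyword, frequency FROM tags ORDER BY frequency DESC, keyword LIMIT 3"
).fetchall()
print(top_tags)
```

Counting only subject lines keeps it cheap; counting words in message
bodies would need a better stop-word list and probably some stemming.
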
> This is not a plug for HarvestMan. In fact HarvestMan would be overkill for
> a tool like this, since we really don't need a complete parsing of the pages
> to find what we want - smart regular expressions tailored towards exactly
> the data we want would be enough, though I would use Pyparsing here
> instead of regexps.
>
> Instead, the tool should start from scratch with its own minimal crawling
> loop which extracts only the data we want and nothing more. The focus
> is on analysis, not on crawling, if you get what I mean...
>
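
To make the "extract only the data we want" point concrete, here is a
minimal sketch of the extraction step. It uses the standard re module
rather than Pyparsing just to stay dependency-free. The markup in
SAMPLE_PAGE is my assumption of what one entry in a monthly index page
looks like, the message numbers are invented, and extract_posts and
top_posters are hypothetical names.

```python
import re
from collections import Counter

# Assumed shape of entries in a monthly index page; the message
# numbers here are invented for illustration.
SAMPLE_PAGE = """
<LI><A HREF="003412.html">[BangPypers] Mailman archives analysis
</A><A NAME="3412">&nbsp;</A>
<I>Anand Balachandran Pillai
</I>
<LI><A HREF="003410.html">[BangPypers] Mailman archives analysis
</A><A NAME="3410">&nbsp;</A>
<I>Pradeep Gowda
</I>
"""

# One pattern that captures only the link, subject and author of an entry.
ENTRY_RE = re.compile(
    r'<LI><A HREF="(?P<href>\d+\.html)">(?P<subject>.*?)\s*</A>'
    r'.*?<I>(?P<author>.*?)\s*</I>',
    re.DOTALL | re.IGNORECASE,
)


def extract_posts(page):
    """Return (href, subject, author) tuples found in an index page."""
    return [
        (m.group("href"), m.group("subject"), m.group("author"))
        for m in ENTRY_RE.finditer(page)
    ]


def top_posters(posts, n=5):
    """Return the n most frequent authors as (author, count) pairs."""
    return Counter(author for _, _, author in posts).most_common(n)


posts = extract_posts(SAMPLE_PAGE)
for href, subject, author in posts:
    print(href, "|", subject, "|", author)
print(top_posters(posts))
```

A real run would fetch the page first (urllib would do) and feed the text
to extract_posts; the pattern will need adjusting if the actual archive
markup differs from the assumed sample.
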
> +1 for "MailWatcher"...
>
> I will think about a list of features and a design for MailWatcher and post
> it to the list soon.
>
> --Anand
>
> On Thu, Jul 17, 2008 at 6:19 AM, Pradeep Gowda <pradeep at btbytes.com> wrote:
>>
>> On Jul 16, 2008, at 8:40 PM, O.R.Senthil Kumaran wrote:
>>>
>>> If you look into the methods exposed by Mailman and modify them to add
>>> those features to the web interface, would it not be a better idea? It
>>> would be available to existing mailman users when they upgrade (or
>>> patch it).
>>>
>>> Something like <mm:TopUsers=5/> in the web interface, which internally
>>> calls the method for getting the top posters.
>>>
>>
>> But that would be no good for Anand's HarvestMan code-kata[1] ;)
>>
>> +PG
>> [1] http://codekata.pragprog.com/codekata/
>>
>>



-- 
-Anand

