[BangPypers] Mailman archives analysis
Anand Balachandran Pillai
abpillai at gmail.com
Thu Jul 17 08:41:21 CEST 2008
Another important reason for wanting this to be client-side:
o It does not tie the tool to a specific mailing-list back-end like mailman.
On Thu, Jul 17, 2008 at 12:06 PM, Anand Balachandran Pillai
<abpillai at gmail.com> wrote:
> Not really. I said it should be a client-side tool because,
> o Like Jeff said, not all archives would want to update server-side mailman
> just to use it.
> o Being server-side introduces additional things to worry about, for example
> if we don't integrate with mailman and instead run as a kind of plugin
> on the server side.
> o Last but most important, it introduces a learning curve: one has to learn
> and understand mailman before the tool can be used. So the effectiveness of
> the tool is limited to mailman administrators. I am thinking of a tool which
> anyone can use by just analyzing the mailman archive web pages as a client.
> Seen from that perspective, the features would be:
> o Client-side tool
> o Input is a mailman archive web page - either the complete archive view
> or a monthly view
> o Output is written to an sqlite database.
> o Tags support would be great - this can be done by finding out keywords
> in top conversations and creating a tag cloud based on the frequencies
> of these keywords.
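
The output and tagging features in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a design for the tool: the `posts` table layout, the function names, and the small stop-word list are all made up here, and the input is assumed to be (author, subject) pairs that have already been pulled out of an archive page.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical stop-word list; a real tool would need a larger one.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "on", "the", "to"}


def normalize_subject(subject):
    """Drop list tags like "[BangPypers]" and any leading "Re:" prefixes."""
    subject = re.sub(r"\[[^\]]*\]", "", subject)
    subject = re.sub(r"^(\s*re:\s*)+", "", subject.strip(), flags=re.IGNORECASE)
    return subject.strip()


def store_posts(conn, posts):
    """Write (author, subject) pairs to an assumed 'posts' table."""
    conn.execute("CREATE TABLE IF NOT EXISTS posts (author TEXT, subject TEXT)")
    conn.executemany(
        "INSERT INTO posts (author, subject) VALUES (?, ?)",
        [(author, normalize_subject(subject)) for author, subject in posts],
    )
    conn.commit()


def top_posters(conn, limit=5):
    """Return (author, post_count) rows, most active authors first."""
    return conn.execute(
        "SELECT author, COUNT(*) AS n FROM posts "
        "GROUP BY author ORDER BY n DESC, author LIMIT ?",
        (limit,),
    ).fetchall()


def tag_frequencies(conn, top_threads=5):
    """Count keywords in the busiest threads, weighted by posts per thread."""
    rows = conn.execute(
        "SELECT subject, COUNT(*) AS n FROM posts "
        "GROUP BY subject ORDER BY n DESC, subject LIMIT ?",
        (top_threads,),
    ).fetchall()
    tags = Counter()
    for subject, n in rows:
        # Each distinct word in a subject counts once per post in that thread.
        for word in set(re.findall(r"[a-z]+", subject.lower())):
            if word not in STOP_WORDS:
                tags[word] += n
    return tags
```

Passing `sqlite3.connect(":memory:")` keeps everything in memory for a quick trial; a file name would persist the database between runs. The counts returned by `tag_frequencies` are the raw input a tag cloud would scale its font sizes from.
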
> This is not a plug for HarvestMan. In fact HarvestMan would be overkill for
> a tool like this, since we really don't need a complete parsing of the pages
> to find what we want - smart regular expressions tailored towards exactly
> the data we want would be enough. That said, I would use Pyparsing here
> instead of regexps.
> Instead, the tool should start from scratch with its own minimal crawling
> loop which extracts only the data we want and nothing more. The focus
> is on analysis, not on crawling, if you get what I mean...
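
To make the "extract only the data we want" idea concrete, here is a rough sketch using the standard `re` module (not Pyparsing, which the post prefers). It assumes a monthly index page whose entries look like the hand-written `SAMPLE_PAGE` below; that markup shape, the message numbers in it, and the names `ENTRY_RE`, `fetch` and `extract_entries` are assumptions for illustration, so the pattern would need checking against a real archive page.

```python
import html
import re
import urllib.request

# Hand-written sample, NOT real archive data: the markup shape is assumed.
SAMPLE_PAGE = """<UL>
<LI><A HREF="000001.html">[BangPypers] Mailman archives analysis
</A><A NAME="1">&nbsp;</A>
<I>Anand Balachandran Pillai
</I>
<LI><A HREF="000002.html">[BangPypers] Mailman archives analysis
</A><A NAME="2">&nbsp;</A>
<I>Pradeep Gowda
</I>
</UL>"""

# One index entry: message link, subject text, then the author in <I>...</I>.
ENTRY_RE = re.compile(
    r'<LI><A HREF="(?P<href>\d+\.html)">(?P<subject>.*?)</A>'
    r'.*?<I>(?P<author>.*?)</I>',
    re.IGNORECASE | re.DOTALL,
)


def fetch(url):
    """Download one archive page as text (no retries, no link following)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def extract_entries(page):
    """Return (href, subject, author) tuples found in an index page."""
    return [
        (
            m.group("href"),
            html.unescape(m.group("subject")).strip(),
            html.unescape(m.group("author")).strip(),
        )
        for m in ENTRY_RE.finditer(page)
    ]
```

Calling `extract_entries(SAMPLE_PAGE)` exercises the pattern without touching the network; for a live page the text returned by `fetch` would be passed in instead. The "crawling loop" here is deliberately just one fetch per monthly page, with everything else on the page ignored.
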
> +1 for "MailWatcher"...
> I will think about a list of features and a design for MailWatcher and post
> it to the list soon.
> On Thu, Jul 17, 2008 at 6:19 AM, Pradeep Gowda <pradeep at btbytes.com> wrote:
>> On Jul 16, 2008, at 8:40 PM, O.R.Senthil Kumaran wrote:
>>> If you look into the methods exposed by Mailman and modify them to add those
>>> features to the web-interface, would it not be a better idea? It would be
>>> available to existing mailman users when they upgrade (or patch it).
>>> Something like <mm:TopUsers=5/> in the web-interface which internally
>>> calls the method for getting the top-posters.
>> But that would be no good for Anand's Harvestman code-kata ;)
>>  http://codekata.pragprog.com/codekata/