On Thu, 25 Feb 2010 17:08:06 -0800 Mark Sapiro <mark@msapiro.net> wrote:
Cedric Jeanneret wrote:
I'm trying to create a xapian[1] indexer for our mailing list. As Mailman is written in Python and there are Python bindings for xapian, I guess I could create a plugin for that. My first question is: is there already such a thing? I searched on the net, but nothing turned up. My second one: can we create a plugin for Mailman, and if so, where should I go for documentation? There seems to be nothing in the wiki (http://wiki.list.org/dosearchsite.action?searchQuery.queryString=plugin&searchQuery.spaceKey=conf_all)
Just to explain why I'd like to do that: we already have a xapian search engine in here, indexing a fileserver, request tracker queues and moinmoin wikis... so we'd like to aggregate all our stuff in one app for searching.
This will be quite doable with Mailman 3 which is still in development.
There are problems trying to do this in Mailman 2.1.x. There is a plugin capability of sorts in the form of custom handlers that can be added to the incoming message processing pipeline. See the FAQ at <http://wiki.list.org/x/l4A9>. However, archiving is asynchronous with incoming message processing, so it is not possible for a custom handler to know the URL that will ultimately retrieve the message from the archive.
A different approach which might be workable is to use the PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER hooks. If you set
PUBLIC_EXTERNAL_ARCHIVER = '/path/to/script.py'
PRIVATE_EXTERNAL_ARCHIVER = '/path/to/script.py'
in mm_cfg.py, then that script will be invoked to do the archiving. The script in turn could invoke the standard pipermail archiving process and then invoke xapian to index the archived message.
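
A rough sketch of what such a wrapper script might look like follows. It is only an illustration under several assumptions not stated above: that the message arrives on the script's standard input, that the list name is passed as the first command-line argument, and that the script runs as its own process under Python 3. The names build_record and index_record are hypothetical. index_record is a placeholder for the real xapian call, and the hand-off to pipermail is left as a comment because the steps for invoking it are not given here.

```python
#!/usr/bin/env python3
"""Sketch of an external archiver wrapper: parse one message for indexing."""
import sys
from email import message_from_bytes, policy


def build_record(raw, listname="unknown"):
    """Parse a raw message (bytes) and return the fields worth indexing."""
    msg = message_from_bytes(raw, policy=policy.default)
    # Prefer the text/plain part; None if the message has no such part.
    body = msg.get_body(preferencelist=("plain",))
    return {
        "list": listname,
        "message_id": str(msg.get("Message-ID", "")).strip(),
        "subject": str(msg.get("Subject", "")),
        "sender": str(msg.get("From", "")),
        "text": body.get_content() if body is not None else "",
    }


def index_record(record):
    """Hypothetical hook: replace the print with the real xapian indexing."""
    print("would index %(message_id)s: %(subject)s" % record)


def main(argv, stdin=None):
    """Read one message from stdin and pass its fields to the indexer."""
    stdin = stdin if stdin is not None else sys.stdin.buffer
    # Assumption: the list name is given as the first argument.
    listname = argv[1] if len(argv) > 1 else "unknown"
    raw = stdin.read()
    # Not shown: hand `raw` to the standard pipermail archiving process
    # first, so that the message still ends up in the normal archive.
    index_record(build_record(raw, listname))
    return 0
```

To use it as a script, add a final line such as sys.exit(main(sys.argv)). Keeping the parsing in build_record, separate from the indexing hook, lets you test it on saved messages without touching the live archive or the xapian database.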
Hello Mark,
Thank you very much for your answer. I guess the "cleanest" way would be to override PUBLIC_EXTERNAL_ARCHIVER (we don't want to index our private archives for now). I'll give it a try as soon as possible. Do you think my script would interest other people? If so, where should I post it?
Thanks again
Best regards,
C.
-- 
Cédric Jeanneret                 | System Administrator
021 619 10 32                    | Camptocamp SA
cedric.jeanneret@camptocamp.com  | PSE-A / EPFL