Best Mail Program to Use w/ Mailman
We are setting up a Debian web server and would like to use Mailman to manage a couple of mailing lists that we control. After some initial complications with Mailman and Postfix, we decided to uninstall and reinstall everything. Before we get going again, I wanted to get an idea of which of the mail programs listed in the Mailman documentation is the best to use with Mailman: Postfix, Qmail, Exim, or Sendmail.
Thanks,
ClayB
Subject: [Mailman-Users] Our donation by paypal - we desire a listing
From: --- REDACTED ---
To: <mailman-users@python.org>, <info@fsf.org>
Date: Mon, 22 Feb 2010 16:49:21 +0100
Hello GNU-Team,
A few minutes ago we donated $500 to your organisation via PayPal.
You write on your site https://my.fsf.org/donate/directed-donations/gnumailman: ...For donations of $500 or more, if desired, a listing on our "Thank GNUs" web page.
We desire a listing. Could you list us as "Best Western Hotel Erb" on your site, http://www.gnu.org/software/mailman/index.html?
Suggestion:
Thanks go out to: . . . --- REDACTED ---
- Beyer, Clay <clay.beyer@nkadd.org>:
We are setting up a Debian web server and would like to use Mailman to manage a couple of mailing lists that we control. After some initial complications with Mailman and Postfix, we decided to uninstall and reinstall everything. Before we get going again, I wanted to get an idea of which of the mail programs listed in the Mailman documentation is the best to use with Mailman: Postfix, Qmail, Exim, or Sendmail.
Postfix. That's what we're using here at python.org.
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | http://www.charite.de
On 2/23/2010 7:21 AM, Ralf Hildebrandt wrote:
Postfix. That's what we're using here at python.org.
And with Postfix, use Mailman's Postfix integration for automatic alias generation. Don't use postfix-to-mailman.py.
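For reference, the Postfix integration is switched on in Mailman 2.1's site configuration file. A minimal sketch, assuming a stock Mailman 2.1 install where site overrides live in mm_cfg.py (on Debian this is typically under /etc/mailman, but check your package):

```python
# mm_cfg.py fragment (Mailman 2.1): tell Mailman that Postfix is the MTA,
# so that it maintains its own alias file when lists are created or removed.
MTA = 'Postfix'
```

After this change, bin/genaliases regenerates Mailman's alias file, and alias_maps in Postfix's main.cf has to include that file; the exact path depends on the installation.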
Exim, with a Mailman router and transport per <http://www.exim.org/howto/mailman21.html> is another good choice for ease of integration with Mailman.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Beyer, Clay wrote:
We are setting up a Debian web server and would like to use Mailman to manage a couple of mailing lists that we control. After some initial complications with Mailman and Postfix, we decided to uninstall and reinstall everything. Before we get going again, I wanted to get an idea of which of the mail programs listed in the Mailman documentation is the best to use with Mailman: Postfix, Qmail, Exim, or Sendmail.
Personally, I would go with Debian's default of Exim and follow the well-written instructions in the Debian package file /usr/share/doc/mailman/README.Exim4.Debian; it really is a cut-and-paste setup. This basically emulates the Postfix virtual-style setup, but includes the routers etc. needed to get Exim to see the relevant virtual aliases files.
I have converted three Mailman setups that were using Postfix (on a mix of Ubuntu and FreeBSD) to this setup, and it worked fine. In all these cases this was a separate box managing a bit of web stuff and lists, not the primary email server.
Thanks. Andrew.
On Mon, Feb 22, 2010 at 11:20:05AM -0500, Beyer, Clay wrote:
We are setting up a Debian web server and would like to use Mailman to manage a couple of mailing lists that we control. After some initial complications with Mailman and Postfix, we decided to uninstall and reinstall everything. Before we get going again, I wanted to get an idea of which of the mail programs listed in the Mailman documentation is the best to use with Mailman: Postfix, Qmail, Exim, or Sendmail.
I strongly recommend against qmail, as it is not suitable for professional or even amateur use.
I have deployed the rest in various environments numerous times. None is "best" across the board, but each may be "best" depending on your environment, your needs, and your mail system knowledge. Roughly, and I emphasize ROUGHLY speaking:
- exim is the simplest to install and configure. If your needs
  are straightforward and modest, this might be the best choice.
- postfix is not as simple, but it *is* modular, well-designed,
  and quite capable of supporting complex environments.
- sendmail is still more complex, but is widely known (in part
  because of its longevity) and there is a larger knowledge base
  for it than any other MTA. Sendmail milters offer an extensive
  feature set for sufficiently-advanced administrators.
Personally, I tend to use sendmail and postfix -- in part because I'm usually dealing with non-straightforward environments. However, I would advise that if you think you can manage with exim, you should at least try.
Without having detailed knowledge of the factors above (environment, needs, knowledge) it's tough to say much further than that, but: if you're only handling mail for one domain, if you're only handling mailing list traffic, if you're not running an associated POP/IMAP server, if you're running a single server, if you're handling low to moderate volumes of traffic, then my guess would be that exim will work very well for you.
---Rsk
Hello,
I'm trying to create a Xapian[1] indexer for our mailing list. As Mailman is written in Python and there are Python bindings for Xapian, I guess I can maybe create a plugin for that. My first question is: is there already such a thing? I searched on the net, but nothing appeared. My second one: can we create a plugin for Mailman? If so, where should I go to find some documentation? There seems to be nothing in the wiki (http://wiki.list.org/dosearchsite.action?searchQuery.queryString=plugin&searchQuery.spaceKey=conf_all).
Just to explain why I'd like to do that: we already have a Xapian search engine here, indexing a fileserver, Request Tracker queues and MoinMoin wikis... so we'd like to aggregate all our stuff in one app for searching.
Thank you in advance.
Best regards,
C.
-- Cédric Jeanneret | System Administrator 021 619 10 32 | Camptocamp SA cedric.jeanneret@camptocamp.com | PSE-A / EPFL
Cedric Jeanneret wrote:
I'm trying to create a Xapian[1] indexer for our mailing list. As Mailman is written in Python and there are Python bindings for Xapian, I guess I can maybe create a plugin for that. My first question is: is there already such a thing? I searched on the net, but nothing appeared. My second one: can we create a plugin for Mailman? If so, where should I go to find some documentation? There seems to be nothing in the wiki (http://wiki.list.org/dosearchsite.action?searchQuery.queryString=plugin&searchQuery.spaceKey=conf_all).
Just to explain why I'd like to do that: we already have a Xapian search engine here, indexing a fileserver, Request Tracker queues and MoinMoin wikis... so we'd like to aggregate all our stuff in one app for searching.
This will be quite doable with Mailman 3 which is still in development.
There are problems trying to do this in Mailman 2.1.x. There is a plugin capability of sorts in the form of custom handlers that can be added to the incoming message processing pipeline. See the FAQ at <http://wiki.list.org/x/l4A9>. However, archiving is asynchronous with incoming message processing, so it is not possible for a custom handler to know the URL that will ultimately retrieve the message from the archive.
A different approach which might be workable is to use the PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER hooks. If you set
PUBLIC_EXTERNAL_ARCHIVER = '/path/to/script.py'
PRIVATE_EXTERNAL_ARCHIVER = '/path/to/script.py'
in mm_cfg.py, then that script will be invoked to do the archiving. The script in turn could invoke the standard pipermail archiving process and then invoke xapian to index the archived message.
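To make the external archiver idea concrete, here is a minimal sketch of the first thing such a script has to do: read the message that is piped to it. It uses only the standard library, so it runs without Mailman installed. The describe() helper, the sample message and the dictionary keys are assumptions made for illustration and are not part of Mailman.

```python
import email
import io


def read_message(stream):
    """Parse one message from a text stream (sys.stdin in a real script)."""
    return email.message_from_file(stream)


def describe(msg):
    """Return the header fields an indexer would probably key on."""
    return {
        'message_id': msg.get('Message-ID', ''),
        'subject': msg.get('Subject', ''),
    }


# Simulate the pipe with an in-memory stream instead of sys.stdin.
sample = io.StringIO(
    "From: someone@example.com\n"
    "Subject: test post\n"
    "Message-ID: <abc@example.com>\n"
    "\n"
    "body text\n"
)
info = describe(read_message(sample))
print(info)
```

In a real archiver script, read_message(sys.stdin) would replace the in-memory sample.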
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On Thu, 25 Feb 2010 17:08:06 -0800 Mark Sapiro <mark@msapiro.net> wrote:
A different approach which might be workable is to use the PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER hooks. If you set
PUBLIC_EXTERNAL_ARCHIVER = '/path/to/script.py'
PRIVATE_EXTERNAL_ARCHIVER = '/path/to/script.py'
in mm_cfg.py, then that script will be invoked to do the archiving. The script in turn could invoke the standard pipermail archiving process and then invoke xapian to index the archived message.
Hello Mark,
Thank you very much for your answer. I guess the "cleanest" way would be to override the PUBLIC_EXTERNAL_ARCHIVER (we don't want to index our private archives for now). I'll give it a try as soon as possible. Do you think my script would interest other people? If so, where should I post it?
Thanks again
Best regards,
C.
-- Cédric Jeanneret | System Administrator 021 619 10 32 | Camptocamp SA cedric.jeanneret@camptocamp.com | PSE-A / EPFL
Cedric Jeanneret wrote:
Thank you very much for your answer. I guess the "cleanest" way would be to override the PUBLIC_EXTERNAL_ARCHIVER (we don't want to index our private archives for now). I'll give it a try as soon as possible. Do you think my script would interest other people? If so, where should I post it?
Yes, I think it may be of interest. The best place is the tracker at <https://bugs.launchpad.net/mailman> plus a note to this list.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On Thu, 25 Feb 2010 17:08:06 -0800 Mark Sapiro <mark@msapiro.net> wrote:
There are problems trying to do this in Mailman 2.1.x. There is a plugin capability of sorts in the form of custom handlers that can be added to the incoming message processing pipeline. See the FAQ at <http://wiki.list.org/x/l4A9>. However, archiving is asynchronous with incoming message processing, so it is not possible for a custom handler to know the URL that will ultimately retrieve the message from the archive.
A different approach which might be workable is to use the PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER hooks. If you set
PUBLIC_EXTERNAL_ARCHIVER = '/path/to/script.py'
PRIVATE_EXTERNAL_ARCHIVER = '/path/to/script.py'
in mm_cfg.py, then that script will be invoked to do the archiving. The script in turn could invoke the standard pipermail archiving process and then invoke xapian to index the archived message.
Hello again,
Just one question: what do mlist, msg and msgdata stand for? As I understand it, I have to create my module and define a process(mlist, msg, msgdata) function inside it, so I'd like to know what those objects are. I discovered that mlist is a Mailman.MailList.MailList('list-name') instance, but the others are a bit hard to find...
Thanks in advance.
C.
-- Cédric Jeanneret | System Administrator 021 619 10 32 | Camptocamp SA cedric.jeanneret@camptocamp.com | PSE-A / EPFL
On 2/26/2010 4:20 AM, Cedric Jeanneret wrote:
Hello again,
Just one question: what do mlist, msg and msgdata stand for? As I understand it, I have to create my module and define a process(mlist, msg, msgdata) function inside it, so I'd like to know what those objects are. I discovered that mlist is a Mailman.MailList.MailList('list-name') instance, but the others are a bit hard to find...
Only custom handlers need to define process(mlist, msg, msgdata). That is the entry point to the handler, and three objects are passed to it:
mlist is the Mailman.MailList.MailList() instance for the current list
msg is a Mailman.Message.Message() (subclass of email.Message.Message) instance for the current message
msgdata is a dictionary of the message metadata accumulated so far.
The important thing is these are passed in as arguments to the handler process() function.
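As an illustration of that calling convention, here is a minimal sketch of a handler-shaped function. It is not a real Mailman handler: FakeList is a stand-in for Mailman.MailList.MailList so the example runs without Mailman, and the 'demo_summary' key is a made-up name.

```python
from email.message import Message


def process(mlist, msg, msgdata):
    """Record the list name and subject in the message metadata."""
    msgdata['demo_summary'] = '%s: %s' % (
        mlist.internal_name(), msg.get('Subject', '(no subject)'))


class FakeList(object):
    """Stand-in for a Mailman list object, for illustration only."""

    def __init__(self, name):
        self._name = name

    def internal_name(self):
        return self._name


# Call the handler the way the pipeline would: list, message, metadata dict.
msg = Message()
msg['Subject'] = 'hello'
msgdata = {}
process(FakeList('toto'), msg, msgdata)
print(msgdata['demo_summary'])
```

The handler returns nothing; whatever it wants to pass on goes into msg or msgdata.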
In your case, you are defining a module which is going to be invoked like the following.
Suppose that
PUBLIC_EXTERNAL_ARCHIVER = '/path/to/myarch.py %(hostname)s %(listname)s'
It will be invoked in a pipe similar to
cat raw_message | /path/to/myarch.py HOST LIST
i.e. the command string with %(hostname)s and %(listname)s replaced by the actual host name and list name of the list will be invoked and the message piped to it.
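The substitution is ordinary Python %-formatting with a dictionary, which is why both placeholders need the full %(name)s form. A small sketch, where the host and list names are made-up values:

```python
template = '/path/to/myarch.py %(hostname)s %(listname)s'
# Hypothetical values standing in for the list's real host and name.
values = {'hostname': 'lists.example.com', 'listname': 'toto'}
command = template % values
print(command)

# Without its opening parenthesis, the second placeholder is no longer
# read as a mapping key, so the formatting fails instead of substituting.
broken = '/path/to/myarch.py %(hostname)s %listname)s'
try:
    broken % values
    broken_ok = True
except (TypeError, ValueError):
    broken_ok = False
print(broken_ok)
```

The script then sees the host name as sys.argv[1] and the list name as sys.argv[2].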
So, it could begin something like:
#!/usr/bin/env python
import sys
sys.path.insert(0, '/path/to/mailman/bin')
# The above line can be skipped if myarch.py is in Mailman's
# bin directory.
import paths

import email
from Mailman import MailList
from Mailman import Message

msg = email.message_from_file(sys.stdin, Message.Message)
# With the command string above, sys.argv[1] is the host name and
# sys.argv[2] is the list name.
mlist = MailList.MailList(sys.argv[2], lock=True)
At this point, you have a list object (locked) and a message object. You might think you could just do
mlist.ArchiveMail(msg)
to archive the mail to the listname.mbox file and the pipermail archive, but that wouldn't quite work because that method would re-invoke the external archiver. Also, you don't need to worry about the listname.mbox file because the ArchiveMail() method already did that before invoking the external archiver, so what you would need is
from Mailman.Archiver import HyperArch
from cStringIO import StringIO
f = StringIO(str(msg))
h = HyperArch.HyperArchive(mlist)
h.processUnixMailbox(f)
h.close()
f.close()
Which is what the ArchiveMail() method would do. Now you still have the mlist and msg objects, and you need to save and unlock the list at some point
mlist.Save()
mlist.Unlock()
and the message is now in the pipermail archive and can be indexed.
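Since the list is locked for the whole archiving step, it may be worth making sure the lock is released even if archiving fails. A minimal sketch of that pattern, using a stand-in class so it runs without Mailman; FakeList, archive_with_lock() and failing_archive() are hypothetical names, and only Save() and Unlock() mirror the calls shown above.

```python
class FakeList(object):
    """Stand-in for a locked Mailman list object, for illustration only."""

    def __init__(self):
        self.locked = True
        self.saved = False

    def Save(self):
        self.saved = True

    def Unlock(self):
        self.locked = False


def archive_with_lock(mlist, do_archive):
    """Run do_archive(mlist), save on success, and always release the lock."""
    try:
        do_archive(mlist)
        mlist.Save()
    finally:
        mlist.Unlock()


def failing_archive(mlist):
    raise RuntimeError('simulated archiving failure')


# Successful run: the list is saved and unlocked.
ok_list = FakeList()
archive_with_lock(ok_list, lambda m: None)

# Failing run: nothing is saved, but the lock is still released.
bad_list = FakeList()
try:
    archive_with_lock(bad_list, failing_archive)
except RuntimeError:
    pass
```

In the real script, the HyperArch calls would take the place of do_archive.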
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
wow, thanks a lot, with all this I'll be able to do what I want!
I'll post all my stuff as soon as I've done it, hopefully next week :).
Thanks again.
Best regards,
C.
On Fri, 26 Feb 2010 10:15:13 -0800 Mark Sapiro <mark@msapiro.net> wrote:
At this point, you have a list object (locked) and a message object. You might think you could just do
mlist.ArchiveMail(msg)
to archive the mail to the listname.mbox file and the pipermail archive, but that wouldn't quite work because that method would re-invoke the external archiver. Also, you don't need to worry about the listname.mbox file because the ArchiveMail() method already did that before invoking the external archiver, so what you would need is
from Mailman.Archiver import HyperArch
from cStringIO import StringIO
f = StringIO(str(msg))
h = HyperArch.HyperArchive(mlist)
h.processUnixMailbox(f)
h.close()
f.close()
Which is what the ArchiveMail() method would do. Now you still have the mlist and msg objects, and you need to save and unlock the list at some point
mlist.Save()
mlist.Unlock()
and the message is now in the pipermail archive and can be indexed.
Hello again,
I'm having some trouble with my code. Following what Mark said, I've done this:
#!/usr/bin/env python
import sys
sys.path.insert(0,'/usr/lib/mailman')

import syslog

syslog.syslog('begin script')

import email
from Mailman import MailList
from Mailman import Message
## archive part
from Mailman.Archiver import HyperArch
from cStringIO import StringIO

maillist = sys.argv[2]
hostname = sys.argv[1]

msg = email.message_from_file(sys.stdin, Message.Message)
syslog.syslog(maillist)

mlist = MailList.MailList(maillist, lock=True)

syslog.syslog('processing archiver')
## let archive it
f = StringIO(str(msg))
h = HyperArch.HyperArchive(mlist)
h.processUnixMailbox(f)
h.close()
f.close()
mlist.Save()
mlist.Unlock()

mlist.ArchiveMail(msg)

syslog.syslog('processing indexer')
### coming soon

syslog.syslog('exiting - all ok')
sys.exit(0)
The syslog calls are for debugging purposes only.
And when I send an email to my mailing list, I get this kind of error:
Mar 02 12:38:33 2010 (28380) toto.lock lifetime has expired, breaking
Mar 02 12:38:33 2010 (28380) File "/var/lib/mailman/scripts/driver", line 250, in <module>
Mar 02 12:38:33 2010 (28380) run_main()
Mar 02 12:38:33 2010 (28380) File "/var/lib/mailman/scripts/driver", line 110, in run_main
Mar 02 12:38:33 2010 (28380) main()
Mar 02 12:38:33 2010 (28380) File "/usr/lib/mailman/Mailman/Cgi/admin.py", line 167, in main
Mar 02 12:38:33 2010 (28380) mlist.Lock()
Mar 02 12:38:33 2010 (28380) File "/usr/lib/mailman/Mailman/MailList.py", line 161, in Lock
Mar 02 12:38:33 2010 (28380) self.__lock.lock(timeout)
Mar 02 12:38:33 2010 (28380) File "/usr/lib/mailman/Mailman/LockFile.py", line 306, in lock
Mar 02 12:38:33 2010 (28380) important=True)
Mar 02 12:38:33 2010 (28380) File "/usr/lib/mailman/Mailman/LockFile.py", line 416, in __writelog
Mar 02 12:38:33 2010 (28380) traceback.print_stack(file=logf)
This block is spamming my /var/log/mailman/locks
It seems I have a problem with the lockfile...
Any idea ?
Thank you!
-- Cédric Jeanneret | System Administrator 021 619 10 32 | Camptocamp SA cedric.jeanneret@camptocamp.com | PSE-A / EPFL
On 3/2/2010 3:41 AM, Cedric Jeanneret wrote:
[...]
mlist.ArchiveMail(msg)
Here is one problem. Remove the above line. As I tried to say above you can't do this. The lines above from "f = StringIO(str(msg))" through "f.close()" archive the message. When you call mlist.ArchiveMail(msg), it reinvokes your external archiver in an endless loop.
You need to remove the mlist.ArchiveMail(msg).
The locking problem is something else. The external archiver is called with the list locked, thus when we try to instantiate the list 'locked', we have a deadlock. Thus, you never saw the loop because of the deadlock.
The good news is we don't have to pass a locked list instance to HyperArch.HyperArchive() as it uses a special archiver lock.
So, replace
mlist = MailList.MailList(maillist, lock=True)
with
mlist = MailList.MailList(maillist, lock=False)
and remove the "mlist.Unlock()" as your instance isn't locked, and ArchRunner will unlock its list instance when you exit.
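The deadlock described above can be reproduced without Mailman. The following is a minimal sketch in stdlib-only Python 3, not Mailman's LockFile implementation: `try_lock`, `unlock` and `external_archiver` are hypothetical names, and the lock is a plain exclusive-create file in a temporary directory. It only illustrates why a script that is started while its caller holds the list lock must not wait for that same lock.

```python
import os
import tempfile
import time


def try_lock(path, timeout=0.2, poll=0.05):
    """Try to create the lock file exclusively; give up after `timeout` seconds."""
    deadline = time.time() + timeout
    while True:
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if time.time() >= deadline:
                return False
            time.sleep(poll)
        else:
            os.close(fd)
            return True


def unlock(path):
    """Release the lock by removing the lock file."""
    os.remove(path)


def external_archiver(lock_path, want_lock):
    """Hypothetical stand-in for the external archiver script."""
    if want_lock:
        # Analogue of asking for a locked list: wait for a lock the caller holds.
        if not try_lock(lock_path):
            return "timed out waiting for lock"
        unlock(lock_path)
    return "archived"


lock_path = os.path.join(tempfile.mkdtemp(), "toto.lock")
caller_has_lock = try_lock(lock_path)  # the caller (ArchRunner's role) locks first
with_lock = external_archiver(lock_path, want_lock=True)
without_lock = external_archiver(lock_path, want_lock=False)
unlock(lock_path)
```

With a short timeout the locked variant simply gives up; with a long or unbounded wait it would hang until the lock's lifetime expires, which is consistent with the "lifetime has expired, breaking" entries in the locks log.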
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On Tue, 02 Mar 2010 11:34:25 -0800 Mark Sapiro <mark@msapiro.net> wrote:
[...]
You need to remove the mlist.ArchiveMail(msg).
[...]
Whoops, right. It was commented out in my code. For now, I'm poking around at some other problems, such as my external archiver returning a non-zero status. It seems to crash in h.processUnixMailbox(f). Is there any way to get a backtrace of Python errors (e.g. by testing it from the shell)? I guess I can write a file with the full email content, headers included, and pipe it into my script. Right?
Thank you!
C.
-- Cédric Jeanneret | System Administrator 021 619 10 32 | Camptocamp SA cedric.jeanneret@camptocamp.com | PSE-A / EPFL
On 3/2/2010 11:02 PM, Cedric Jeanneret wrote:
[...] Is there any way to get a backtrace of Python errors (e.g. by testing it from the shell)? I guess I can write a file with the full email content, headers included, and pipe it into my script. Right?
There are several choices.
You could try adding '&>filename' to your external archiver command string. That will probably work.
You can do as you suggest above.
You can replace your "import syslog" with
from Mailman.Logging.Syslog import syslog
from Mailman.Logging.Utils import LogStdErr
and add
LogStdErr('debug', 'mailmanctl', manual_reprime=0)
and change your syslog.syslog('debug text') statements to
syslog('debug', 'debug text')
This will write all stderr output plus your 'debug text' entries to a log named debug in Mailman's logs directory. (You can name the log anything you want. It will be created if it doesn't exist.)
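For readers who want the same effect outside Mailman's logging helpers, here is a minimal stdlib-only Python 3 sketch of the idea: wrap the script body and write the full traceback to a log before returning a non-zero status. `run_and_log` and `broken_archiver` are hypothetical names, and the error raised is only an illustration; this is not how LogStdErr is implemented.

```python
import io
import traceback


def run_and_log(func, log):
    """Call func(); on failure write the full traceback to `log` and return 1."""
    try:
        func()
    except Exception:
        traceback.print_exc(file=log)
        return 1
    return 0


def broken_archiver():
    # Hypothetical failure, standing in for a crash inside the archiver.
    raise AttributeError("Message instance has no attribute 'tell'")


log = io.StringIO()  # a real script would open a log file in append mode
status = run_and_log(broken_archiver, log)
log_text = log.getvalue()
```

The return value can be passed to sys.exit() so the caller still sees the failure, while the log keeps the backtrace that would otherwise be lost.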
I see you've gotten further. I'll respond to that post.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On Tue, 02 Mar 2010 11:34:25 -0800 Mark Sapiro <mark@msapiro.net> wrote:
[...]
f = StringIO(str(msg))
h = HyperArch.HyperArchive(mlist)
h.processUnixMailbox(f)
[...]
Hmm, it seems it crashes in pipermail.py, in function processUnixMailbox: we have a pos = input.tell() on line 564, but unfortunately "input" does NOT have any "tell()" method... It returns a "41" status.
-- Cédric Jeanneret | System Administrator 021 619 10 32 | Camptocamp SA cedric.jeanneret@camptocamp.com | PSE-A / EPFL
On 3/3/2010 12:57 AM, Cedric Jeanneret wrote:
On Tue, 02 Mar 2010 11:34:25 -0800 Mark Sapiro <mark@msapiro.net> wrote:
On 3/2/2010 3:41 AM, Cedric Jeanneret wrote: [...]
from cStringIO import StringIO
[...]
f = StringIO(str(msg))
h = HyperArch.HyperArchive(mlist)
h.processUnixMailbox(f)
[...]
Hmm, it seems it crashes in pipermail.py, in function processUnixMailbox: we have a pos = input.tell() on line 564, but unfortunately "input" does NOT have any "tell()" method... It returns a "41" status.
Something is strange. The input object in 'pos = input.tell()' is the StringIO instance you passed as 'f', and StringIO objects do have a tell method. Also, the above code snippet is exactly what the builtin archiver uses, and I tested it and it worked for me.
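The distinction at issue here can be checked in isolation. This is a small stdlib-only Python 3 sketch (it uses io.StringIO in place of the thread's cStringIO, and the sample message is made up): a parsed message object has no tell() method, while an in-memory stream built from the message text does.

```python
import email
import io

raw = "From: a@example.com\nSubject: test\n\nbody\n"

msg = email.message_from_string(raw)  # a Message object, not a stream
stream = io.StringIO(str(msg))        # the message text wrapped as a stream

msg_has_tell = hasattr(msg, "tell")
stream_has_tell = hasattr(stream, "tell")
position = stream.tell()              # streams report their current offset
```

So an error about a missing tell() suggests that the object handed to the archiver was the message itself rather than the stream wrapping its text.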
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On Wed, Mar 3, 2010 at 4:44 PM, Mark Sapiro <mark@msapiro.net> wrote:
[...]
Something is strange. The input object in 'pos = input.tell()' is the StringIO instance you passed as 'f', and StringIO objects do have a tell method. Also, the above code snippet is exactly what the builtin archiver uses, and I tested it and it worked for me.
Maybe it's a Python version issue? What is really strange is that it works inside the archiver... I tried NOT using email.message_from_file (that is, using StringIO directly on sys.stdin), and it worked fine. In fact, the error was "Message doesn't have a tell() method"...
Another error was really annoying: everything worked, almost. I couldn't call mlist.Save(), because of an error with the lockfile.
I did:
mlist = MailList.MailList('toto', lock=False)
# other code
mlist.Save()
-> crashed. After poking into the MailList code, I saw that it refreshes the lockfile. Commenting out that line made it work again... more or less: the message was in the mbox, but not in the pipermail archives.
Searching the Net, I found this post http://www.mail-archive.com/mailman-users@python.org/msg47499.html that you answered some months (well, years) ago. I tried it that way: applying the patch so that Mailman uses its internal archiver and calls my indexer right after. That's not really clean and not really portable, but it works. Having to patch a file from the Mailman package annoys me a bit, but... I didn't have any success with the approaches you showed me :(
To be honest, maybe I'll try to write a handler (like XapianIndexer.py) for this. Now that I've seen how to debug my scripts (thank you for the tip), I guess that would be the best way, instead of patching code (which will be overwritten on the next update).
Or maybe there's a variable in mm_cfg (or Defaults) that tells Mailman to call a script after archiving? I didn't see such a thing; I guess that's the role of GLOBAL_PIPELINE and its chain of handlers...
Thank you for the time you're spending on my problem.
Best regards,
C.
On 3/3/2010 9:20 AM, Cédric Jeanneret wrote:
Maybe it's a Python version issue? What is really strange is that it works inside the archiver... I tried NOT using email.message_from_file (that is, using StringIO directly on sys.stdin), and it worked fine. In fact, the error was "Message doesn't have a tell() method"...
Which says you are passing a Message object, not a StringIO or file object. I considered at one point just passing sys.stdin directly, but that won't work because sys.stdin does not have seek() or tell() methods.
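The point about sys.stdin can be illustrated with a pipe, which is how the external archiver receives the message. This is a stdlib-only Python 3 sketch with made-up message text. Note one assumption about versions: in Python 3 a piped stream does expose seek() and tell(), but it reports itself as not seekable, so the practical conclusion is the same as in the thread's Python 2.5 setup: read the input once and wrap the text in an in-memory stream.

```python
import io
import os

read_fd, write_fd = os.pipe()
os.write(write_fd, b"From: a@example.com\n\nbody\n")
os.close(write_fd)

piped = os.fdopen(read_fd, "r")       # behaves like sys.stdin fed from a pipe
piped_seekable = piped.seekable()     # a pipe cannot be rewound
buffered = io.StringIO(piped.read())  # copy the whole message into memory
piped.close()

buffered_seekable = buffered.seekable()
start = buffered.tell()
```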
Another error was really annoying: everything worked, almost. I couldn't call mlist.Save(), because of an error with the lockfile.
I did:
mlist = MailList.MailList('toto', lock=False)
# other code
mlist.Save()
Right. I overlooked the fact that you can't Save() an unlocked list. But, I don't think you need to. I don't think the archiver actually updates your list instance in its processing, so you should be OK if you just remove the Save() from your code.
-> crashed. After poking into the MailList code, I saw that it refreshes the lockfile. Commenting out that line made it work again... more or less: the message was in the mbox, but not in the pipermail archives.
Don't do that. It won't work anyway because the locked list object in ArchRunner will be saved after you're done and will undo any changes you made to your list object. But, as I say, you shouldn't need to save your list object. It is only passed to the HyperArch.HyperArchive() constructor so the archiver knows where to find the archive. I don't think it is updated.
[...]
To be honest, maybe I'll try to write a handler (like XapianIndexer.py) for this. Now that I've seen how to debug my scripts (thank you for the tip), I guess that would be the best way, instead of patching code (which will be overwritten on the next update).
Or maybe there's a variable in mm_cfg (or Defaults) that tells Mailman to call a script after archiving? I didn't see such a thing; I guess that's the role of GLOBAL_PIPELINE and its chain of handlers...
As I tried to point out in my initial reply <http://mail.python.org/pipermail/mailman-users/2010-February/068900.html>, that won't work.
The pipeline includes ToArchive, which only queues the message in the archive queue for ArchRunner. Then IncomingRunner continues processing the pipeline. When it gets to your handler, there's no guarantee that ArchRunner has archived the message yet, so how do you index something that may not even be there?
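The ordering problem can be modelled in a few lines. The sketch below is a deterministic, stdlib-only Python 3 simulation, not Mailman code: `to_archive`, `index_handler`, `run_pipeline` and `run_archiver` are hypothetical stand-ins for the ToArchive handler, an indexing handler, IncomingRunner and ArchRunner. It models the unfavourable case in which the pipeline finishes before the archive queue is processed.

```python
from collections import deque

archive_queue = deque()  # stand-in for the archive queue
archive = set()          # stand-in for the pipermail archive
indexed = []             # what the indexer managed to index


def to_archive(msg_id):
    """Only queue the message; archiving happens later, elsewhere."""
    archive_queue.append(msg_id)


def index_handler(msg_id):
    """Index the message only if it is already in the archive."""
    if msg_id in archive:
        indexed.append(msg_id)


def run_pipeline(msg_id):
    """Run the handlers in order, as the incoming pipeline would."""
    for handler in (to_archive, index_handler):
        handler(msg_id)


def run_archiver():
    """Drain the queue into the archive, as the archive runner would."""
    while archive_queue:
        archive.add(archive_queue.popleft())


run_pipeline("msg-1")
indexed_in_pipeline = list(indexed)  # nothing to index yet
run_archiver()
index_handler("msg-1")               # indexing after archiving finds it
```

In a live system the two runners are separate processes, so the handler might occasionally find the message; the sketch only shows that nothing guarantees it.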
We were almost there with the external archiver method. Let's try to make that work.
What do you have now in the external archiver code and in the PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER strings and what is the problem?
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On Wed, 03 Mar 2010 10:04:31 -0800 Mark Sapiro <mark@msapiro.net> wrote:
[...]
We were almost there with the external archiver method. Let's try to make that work.
What do you have now in the external archiver code and in the PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER strings and what is the problem?
Hello again,
First of all, I want to thank you for the time you're spending on my case. I really appreciate it.
Now, for my code: I attached the latest (buggy) version of my archive-and-index.py script. I've rolled back to the approach you described, so that we don't go off in all directions. You'll find another attachment: the debug file, which I produced by setting:
PUBLIC_EXTERNAL_ARCHIVER = '/root/archive-and-index.py %(hostname)s %(listname)s &>/var/log/mailman/archiver'
It seems that the Message.Message stays, even if we create a new StringIO variable... weird. Just in case:
python --version
Python 2.5.2
Maybe there's a problem with this version...? If so, it will be a "little" problem, as it's the version in lenny.
I'll keep on trying, and keep you updated as soon as I have some new things.
Thanks again.
Best regards,
C.
-- Cédric Jeanneret | System Administrator 021 619 10 32 | Camptocamp SA cedric.jeanneret@camptocamp.com | PSE-A / EPFL
On Wed, 03 Mar 2010 10:04:31 -0800 Mark Sapiro <mark@msapiro.net> wrote:
[...]
What do you have now in the external archiver code and in the PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER strings and what is the problem?
Hello again !
I think I found what the problem is: the script works now, but since I wrote my own archiver, it doesn't do the pipermail part (i.e. update the mails in the archive)... I thought that this code:
mlist = MailList.MailList(maillist, lock=False)
msg = email.message_from_file(sys.stdin, Message.Message)
f = StringIO(str(sys.stdin))
h = HyperArch.HyperArchive(mlist)
h.processUnixMailbox(f)
f.close()
did everything, but after reading a bit of the code, it doesn't exactly. It saves to the .mbox file, right?
I tried to find where it does the pipermail stuff, but it's a bit complicated [I'm not so at ease with Python].
Any clue ?
Thank you
-- Cédric Jeanneret | System Administrator 021 619 10 32 | Camptocamp SA cedric.jeanneret@camptocamp.com | PSE-A / EPFL
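One detail in the snippet above may be worth double-checking, independently of the question asked: applying str() to a stream object normally yields the object's printable representation, not its contents, so StringIO(str(sys.stdin)) would wrap a short description string rather than the message text. The following stdlib-only Python 3 sketch shows the difference on an in-memory stream with made-up content; it is an observation about str() on stream objects, not a confirmed diagnosis of the script.

```python
import io

stream = io.StringIO("From: a@example.com\nSubject: test\n\nbody\n")

as_repr = str(stream)    # a description of the object, not the message
as_text = stream.read()  # the actual message text
```

Reading the input once and wrapping the resulting text, or using str(msg) on the parsed message as in the earlier version of the script, avoids the problem.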
On 3/4/2010 4:23 AM, Cedric Jeanneret wrote:
I think I found what the problem is: the script works now, but since I wrote my own archiver, it doesn't do the pipermail part (i.e. update the mails in the archive)... I thought that this code:
mlist = MailList.MailList(maillist, lock=False)
msg = email.message_from_file(sys.stdin, Message.Message)
f = StringIO(str(sys.stdin))
h = HyperArch.HyperArchive(mlist)
h.processUnixMailbox(f)
f.close()
did everything, but after reading a bit of the code, it doesn't exactly. It saves to the .mbox file, right?
No. It doesn't save to the .mbox file. If you look at the ArchiveMail() method in Mailman/Archiver/Archiver.py, it first saves to the .mbox by doing
if mm_cfg.ARCHIVE_TO_MBOX in (1, 2):
    self.__archive_to_mbox(msg)
Then it either calls the external archiver or executes essentially the above to archive the mail in the pipermail archive.
What you are missing is
h.close()
and that's why it doesn't work.
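Why a missing close() can leave an archive apparently untouched is easier to see with a toy model. The class below is a hypothetical stand-in written in stdlib-only Python 3, not HyperArchive itself; it assumes, for illustration only, an archive that collects processed articles and publishes them to its index when it is closed.

```python
class BufferedArchive:
    """Hypothetical archive that publishes its index only on close()."""

    def __init__(self):
        self.pending = []  # processed but not yet published
        self.index = []    # what readers of the archive can see

    def process(self, article):
        self.pending.append(article)

    def close(self):
        # Publish everything processed so far, then clear the buffer.
        self.index.extend(self.pending)
        self.pending = []


h = BufferedArchive()
h.process("msg-1")
index_before_close = list(h.index)
h.close()
index_after_close = list(h.index)
```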
I tried to find where it does the pipermail stuff, but it's a bit complicated [I'm not so at ease with Python].
Yes, the archiver is very convoluted because classes are subclassed and methods overridden all over. Don't feel bad. I've been looking at it for years and still only barely understand it.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On Thu, 04 Mar 2010 06:49:54 -0800 Mark Sapiro <mark@msapiro.net> wrote:
[...]
What you are missing is
h.close()
and that's why it doesn't work.
[...]
Hmm, I call h.close() a bit later (I grab its latest ID so that I can build the direct URL for my indexer). But for now, I guess I'm done. I've opened a bug (I couldn't figure out where else to put my stuff) on Launchpad: https://bugs.launchpad.net/mailman/+bug/531942 It contains my scripts and some information on how to use them.
Indeed, the "arch" script uses locks. I copied it, removed the locking, and used that version. Everything works fine now.
I'm happy I could understand a bit (well... a very little bit) of how Mailman works.
Thanks again !
-- Cédric Jeanneret | System Administrator 021 619 10 32 | Camptocamp SA cedric.jeanneret@camptocamp.com | PSE-A / EPFL
On 3/4/2010 7:10 AM, Cedric Jeanneret wrote:
Hmm, I call h.close() a bit later (I grab its latest ID so that I can build the direct URL for my indexer). But for now, I guess I'm done. I've opened a bug (I couldn't figure out where else to put my stuff) on Launchpad: https://bugs.launchpad.net/mailman/+bug/531942 It contains my scripts and some information on how to use them.
I've seen your "bug" in the tracker. It's too bad Launchpad calls everything a bug, but that's the right place.
Indeed, the "arch" script uses locks. I copied it, removed the locking, and used that version. Everything works fine now.
I will have some comments after I look at this more. I think there is redundant stuff, but I'll comment further after I look in detail.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On Wed, 03 Mar 2010 10:04:31 -0800 Mark Sapiro <mark@msapiro.net> wrote:
On 3/3/2010 9:20 AM, Cédric Jeanneret wrote:
Maybe a python version? What is really strange is that it works inside the archiver.... I tried to NOT use email.message_from_file (so use directly StringIO on sys.stdin), and it worked fine. In fact, the error was that "Message doesn't have "tell()" method"...
Which says you are passing a Message object, not a StringIO or file object. I considered at one point just passing sys.stdin directly, but that won't work because sys.stdin does not have seek() or tell() methods.
Another error was really annoying : ALL worked. almost. I couldn't do my mlist.Save(), as there was an error for the lockfile.
I did : mlist = MailList.MailList('toto', lock=False) # other code mlist.Save()
Right. I overlooked the fact that you can't Save() an unlocked list. But, I don't think you need to. I don't think the archiver actually updates your list instance in it's processing, so you should be OK if you just remove the Save() from your code.
-> crashed. After poking into MailList code, I saw that it refreshes the lockfile. Commenting out this line made it work again.... more or less : message was in mbox, but wasn't in pipermail archives....
Don't do that. It won't work anyway because the locked list object in ArchRunner will be saved after you're done and will undo any changes you made to your list object. But, as I say, you shouldn't need to save your list object. It is only passed to the HyperArch.HyperArchive() constructor so the archiver knows where to find the archive. I don't think it is updated.
Poking around on the Net, I found this post http://www.mail-archive.com/mailman-users@python.org/msg47499.html that you answered some months (well, years) ago. I tried it this way: applying the patch, so that it uses Mailman's internal archiver and calls my indexer right after. That's not really clean, and it's not really portable, but it works. The fact that I have to patch a file from the Mailman package annoys me a bit, but... I didn't have any success with the ways you showed me :(
To be honest, maybe I'll try to write a handler (like XapianIndexer.py) for this. Now that I've seen how to debug my scripts (thank you for the tip), I guess it would be the best way, instead of patching code (which will be overridden on the next update).
Or maybe there's a variable in mm_config (or Defaults) which tells Mailman to call a script after archiving? I didn't see such a thing; I guess that's the role of the GLOBAL_PIPELINE and its chain of handlers...
As I tried to point out in my initial reply <http://mail.python.org/pipermail/mailman-users/2010-February/068900.html>, that won't work.
The pipeline includes ToArchive, which only queues the message in the archive queue for ArchRunner. Then IncomingRunner continues processing the pipeline. When it gets to your handler, there's no guarantee that ArchRunner has yet archived the message, so how do you index something that may not yet even be there?
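The ordering problem described above can be shown with a toy model. This is not Mailman code: the queue, the archive set and all function names are stand-ins invented for illustration, and the sketch fixes one possible ordering, namely the one in which the pipeline handler runs before the archive runner has processed its queue.

```python
from collections import deque

archive_queue = deque()  # stands in for the archive queue
archive = set()          # stands in for the pipermail archive
index_attempts = []      # (message id, was it archived when we looked?)


def to_archive(msg_id):
    """Like ToArchive as described above: only queue the message."""
    archive_queue.append(msg_id)


def indexer_handler(msg_id):
    """Hypothetical pipeline handler that tries to index immediately."""
    index_attempts.append((msg_id, msg_id in archive))


def arch_runner_step():
    """Stand-in for ArchRunner: archive one queued message, if any."""
    if archive_queue:
        archive.add(archive_queue.popleft())


# The incoming runner walks the whole pipeline for a message first ...
for handler in (to_archive, indexer_handler):
    handler("msg-1")

# ... and the archive runner only gets to the message at some later point.
arch_runner_step()
```

In this ordering the handler looks for the message before it has been archived, which is why indexing from a pipeline handler cannot be relied on.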
We were almost there with the external archiver method. Let's try to make that work.
What do you have now in the external archiver code and in the PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER strings and what is the problem?
uho, found it !! mailman/bin/arch toto
I guess that's all :))
-- Cédric Jeanneret | System Administrator 021 619 10 32 | Camptocamp SA cedric.jeanneret@camptocamp.com | PSE-A / EPFL
On 3/4/2010 4:46 AM, Cedric Jeanneret wrote:
uho, found it !! mailman/bin/arch toto
I guess that's all :))
You may or may not be able to use bin/arch, but you can't use it in conjunction with an external archiver because of list locking. If you call bin/arch from your external archiver and wait for it to return, you will have a deadlock, and if you don't wait, it won't run until after your external archiver finishes.
I.e., an external archiver command like
'|/path/bin/arch %(listname)s;/path/myscript.py %(listname)s'
creates a deadlock, and one like
'|/path/bin/arch %(listname)s&/path/myscript.py %(listname)s'
doesn't work because myscript.py has to complete before bin/arch can obtain the list lock.
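The locking behaviour behind both cases can be sketched with a toy model. Here a threading.Lock stands in for Mailman's list lock and run_bin_arch is a hypothetical stand-in for bin/arch. A short timeout is used so that the sketch terminates instead of hanging the way a real deadlock would.

```python
import threading

list_lock = threading.Lock()  # stands in for the per-list lock


def run_bin_arch(timeout):
    """Stand-in for bin/arch: it needs the list lock before it can work."""
    got_it = list_lock.acquire(timeout=timeout)
    if got_it:
        list_lock.release()
    return got_it


# The list stays locked for the whole external archiver call.
with list_lock:
    # Waiting here for bin/arch means waiting for a lock we hold ourselves.
    acquired_while_held = run_bin_arch(timeout=0.1)

# Only after the external archiver has returned is the lock free again.
acquired_after_release = run_bin_arch(timeout=0.1)
```

The first attempt fails because the lock is still held, and the second succeeds only once the external archiver has finished. These are the two outcomes described above.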
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
To follow up on this thread, there is now a FAQ at <http://wiki.list.org/x/RAKJ> which contains an attached template, Ext_Arch.py, which can be used as an external archiver and which will add the message to the pipermail archive, and then call a stub function with arguments of the list name, host name, the URL to the just archived message, the file system path to the just archived message and the message object. The stub can be coded to call a search indexer or do other things one may wish to do with the archived message.
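As a rough idea of what such a stub might look like, here is a minimal sketch. It assumes only the five arguments listed above. The function name index_message, the returned dictionary and all example values are hypothetical and are not taken from Ext_Arch.py itself.

```python
from email.message import Message


def index_message(listname, hostname, url, filepath, msg):
    """Hypothetical stub receiving the five arguments described above.

    A real search indexer would be called from here; this sketch only
    builds the kind of record such an indexer might store.
    """
    return {
        "list": "%s@%s" % (listname, hostname),
        "url": url,
        "path": filepath,
        "subject": msg.get("Subject", ""),
    }


# Example call with made-up values.
msg = Message()
msg["Subject"] = "Test post"
msg.set_payload("Hello list\n")

record = index_message(
    "toto",
    "lists.example.com",
    "http://lists.example.com/pipermail/toto/2010-March/000001.html",
    "/path/to/archives/toto/2010-March/000001.html",
    msg,
)
```

Keeping the indexing logic inside one function like this means the archiving part of the template stays untouched.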
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On Sun, 14 Mar 2010 17:38:16 -0700 Mark Sapiro <mark@msapiro.net> wrote:
To follow up on this thread, there is now a FAQ at <http://wiki.list.org/x/RAKJ> which contains an attached template, Ext_Arch.py, which can be used as an external archiver and which will add the message to the pipermail archive, and then call a stub function with arguments of the list name, host name, the URL to the just archived message, the file system path to the just archived message and the message object. The stub can be coded to call a search indexer or do other things one may wish to do with the archived message.
Hello Mark,
It works like magic! Thank you so much!
Maybe we should delete my "bug" on Launchpad, or link it directly to your FAQ page?
I just added my code in the function, and now it indexes, and archives correctly.
Thanks again!
See you
C.
-- Cédric Jeanneret | System Administrator 021 619 10 32 | Camptocamp SA cedric.jeanneret@camptocamp.com | PSE-A / EPFL
Cedric Jeanneret wrote:
Maybe we should delete my "bug" on Launchpad, or link it directly to your FAQ page?
I just added my code in the function, and now it indexes, and archives correctly.
I suggest you just delete the two existing attachments and attach your current code with a note that it is based on the template in the FAQ.
That way the xappy/Xapian code will be available there if others wish to use it.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Done on Launchpad. Thanks again!
On Mon, Mar 15, 2010 at 5:40 PM, Mark Sapiro <mark@msapiro.net> wrote:
Cedric Jeanneret wrote:
Maybe we should delete my "bug" on Launchpad, or link it directly to your FAQ page?
I just added my code in the function, and now it indexes, and archives correctly.
I suggest you just delete the two existing attachments and attach your current code with a note that it is based on the template in the FAQ.
That way the xappy/Xapian code will be available there if others wish to use it.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
participants (7)
- Andrew Hodgson
- Beyer, Clay
- Cedric Jeanneret
- Cédric Jeanneret
- Mark Sapiro
- Ralf Hildebrandt
- Rich Kulawiec