Re: [Mailman-Users] Indexing mail right after delivery

March 4, 2010


On Wed, 03 Mar 2010 10:04:31 -0800
Mark Sapiro <mark@msapiro.net> wrote:
...
On 3/3/2010 9:20 AM, Cédric Jeanneret wrote:
...
Maybe it's the Python version? What is really strange is that it works inside
the archiver... I tried NOT using email.message_from_file (using
StringIO directly on sys.stdin instead), and it worked fine. In fact, the
error was that Message doesn't have a tell() method...
Which says you are passing a Message object, not a StringIO or file
object. I considered at one point just passing sys.stdin directly, but
that won't work because sys.stdin does not have seek() or tell() methods.
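A minimal sketch of the point above, not code from this thread: when stdin is a pipe it cannot be rewound, so one option is to read it to the end and wrap the text in a StringIO, which does support seek() and tell(). The names `PipeLike` and `to_seekable` are hypothetical, and the sketch uses Python 3's `io.StringIO`; on the Python 2 that Mailman 2.1 runs on, `StringIO.StringIO` plays the same role.

```python
import io


class PipeLike:
    """Hypothetical stand-in for a piped stdin: readable, but no seek()/tell()."""

    def __init__(self, text):
        self._text = text

    def read(self):
        return self._text


def to_seekable(stream):
    """Read the stream to the end and return a seekable in-memory copy."""
    return io.StringIO(stream.read())


# Example message text; the content is made up for illustration.
raw = PipeLike("From sender@example.com Wed Mar  3 10:04:31 2010\n"
               "Subject: test\n\nbody\n")
buf = to_seekable(raw)
first_line = buf.readline()  # consume one line
buf.seek(0)                  # rewinding works on the buffer
```

In an external archiver script, the same helper would presumably be called on sys.stdin before handing the buffer to code that needs to seek.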
...
Another error was really annoying: everything worked. Almost. I couldn't
call mlist.Save(), because there was a lockfile error.
I did:
mlist = MailList.MailList('toto', lock=False)
# other code
mlist.Save()
Right. I overlooked the fact that you can't Save() an unlocked list.
But, I don't think you need to. I don't think the archiver actually
updates your list instance in its processing, so you should be OK if
you just remove the Save() from your code.
...
-> crashed. After poking into the MailList code, I saw that it refreshes
the lockfile. Commenting out this line made it work again... more or
less: the message was in the mbox, but wasn't in the pipermail archives...
Don't do that. It won't work anyway because the locked list object in
ArchRunner will be saved after you're done and will undo any changes you
made to your list object. But, as I say, you shouldn't need to save your
list object. It is only passed to the HyperArch.HyperArchive()
constructor so the archiver knows where to find the archive. I don't
think it is updated.
...
Poking around on the Net, I found this post
http://www.mail-archive.com/mailman-users@python.org/msg47499.html that you
answered some months (well, years) ago. I tried it this way:
applying the patch, so that it uses the Mailman internal archiver and
calls my indexer right after.
That's not really clean and not really portable, but it works.
The fact that I have to patch a file from the Mailman package annoys me a
bit, but... I didn't have any success with the ways you showed me :(
To be honest, maybe I'll try to write a handler (like XapianIndexer.py)
for this. Now that I've seen how to debug my scripts (thank you for the tip), I
guess it would be the best way, instead of patching code (which will
be overwritten on the next update).
Or maybe there's a variable in mm_config (or defaults) which tells
Mailman to call a script after archiving? I didn't see such a thing;
I guess that's the role of the GLOBAL_PIPELINE and its chain of
handlers...
As I tried to point out in my initial reply
<http://mail.python.org/pipermail/mailman-users/2010-February/068900.html>,
that won't work.
The pipeline includes ToArchive which only queues the message in the
archive queue for ArchRunner. Then IncomingRunner continues processing
the pipeline. When it gets to your handler, there's no guarantee that
ArchRunner has yet archived the message, so how do you index something
that may not yet even be there?
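The ordering problem described above can be illustrated with a toy model. This is only a sketch and not Mailman's code: every name below is hypothetical, the queue and archive are plain in-memory containers, and the separate runner is simulated by calling it by hand after the pipeline has finished (in the real system the relative timing is not fixed).

```python
from collections import deque

archive_queue = deque()  # stands in for the archive queue
archive = []             # stands in for the finished archive


def to_archive(msg):
    """Like ToArchive as described above: only queue the message."""
    archive_queue.append(msg)


def index_handler(msg):
    """Hypothetical later handler: can only index what is already archived."""
    return msg in archive


def arch_runner_once():
    """Separate runner that archives queued messages on its own schedule."""
    while archive_queue:
        archive.append(archive_queue.popleft())


def run_pipeline(msg):
    """The pipeline keeps going right after queueing; it does not wait."""
    to_archive(msg)
    return index_handler(msg)


found_during_pipeline = run_pipeline("message 1")  # False: not archived yet
arch_runner_once()                                 # the runner catches up later
found_after_runner = index_handler("message 1")    # True: now it is there
```

In this model the handler sees nothing because archiving happens in a different step from queueing, which is why indexing from within the archiving step itself avoids the problem.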
We were almost there with the external archiver method. Let's try to
make that work.
What do you have now in the external archiver code and in the
PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER strings and what
is the problem?
Hello again!
I think I found what the problem is:
The script works now, but since I wrote my own archiver, it doesn't do the
pipermail part (i.e. update the mails in the archive)... I thought that this code:
mlist = MailList.MailList(maillist, lock=False)
msg = email.message_from_file(sys.stdin, Message.Message)
f = StringIO(str(sys.stdin))
h = HyperArch.HyperArchive(mlist)
h.processUnixMailbox(f)
f.close()
did it all, but after reading a bit of the code, I see that it doesn't exactly. It saves to the .mbox file, right?
I tried to find where it does the pipermail stuff, but it's a bit complicated [I'm not that comfortable with Python].
Any clue?
Thank you
--
Cédric Jeanneret                 |  System Administrator
021 619 10 32                    |  Camptocamp SA
cedric.jeanneret@camptocamp.com  |  PSE-A / EPFL
