[Mailman-Developers] Storing extra data during pipeline processing

Mark Sapiro mark at msapiro.net
Thu May 7 23:31:01 CEST 2015


On 05/07/2015 01:51 PM, Juraj Variny wrote:
> 
> I am adapting the GPG patch for mailman 2.1.x for our project and want to gather 
> some related data in Approve and Scrubber handlers and use them in the archiver. 
> Can you please advise me how to attach some metadata to a message to use it in 
> later stages? 


Store it in msgdata, which is intended for exactly this purpose and is
passed as a separate object in the queue entry when the message is
queued for downstream runners. However, this may not work in your case;
see below.


> Also noticed Scrubber being called multiple times per message (from normal pipeline, 
> from digester, from archiver). I want to verify GPG signatures of the attachments in 
> scrubber and redoing it multiple time is wasteful of system resources, I want  to attach 
> some metadata after the first checking to prevent it, too.


Yes. The scrubber can process the same message more than once, but
never more than twice. The purpose of the scrubber is to flatten the
message to plain text and store aside any message parts that can't be
converted to plain text. This must be done for both the pipermail
archive and the plain format digest. Since archiving and digesting are
separate, independent, asynchronous processes, scrubbing is normally
done twice, once in each process, and either process may handle a given
message before the other.

You can set scrub_nondigest to Yes, in which case scrubbing is done in
the incoming pipeline and the scrubber has nothing left to do when it is
called during digesting or archiving. This may or may not be desirable
depending on the list, because then even subscribers receiving
individual messages or MIME digests get a scrubbed message.


> I have tried already to use msgdata parameter or adding headers to message itself, 
> but was not successful so far. I was thinking about adding external database and 
> putting the data there by message ID, but sure there must be a better way? I hope to 
> publish the code some day, too.


The msgdata metadata doesn't work for passing message data from the
incoming pipeline to the digest process, because at the time the
digester is processing and possibly scrubbing messages for the digest,
it is reading the messages from digest.mbox and there is no metadata.
Adding headers to the message in a handler that runs before ToDigest
should work.
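
To illustrate why a header survives where msgdata does not, here is a
minimal sketch using only the Python standard library. It is not
Mailman code: the temporary mbox file merely stands in for digest.mbox,
and the X-GPG-Check header name and the gpg_check key are hypothetical.

```python
import mailbox
import os
import tempfile
from email.message import Message

# Build a message the way a pipeline handler would see it.
msg = Message()
msg['From'] = 'user@example.com'
msg['Subject'] = 'test'
msg['X-GPG-Check'] = 'good'      # hypothetical result header
msg.set_payload('body\n')

# Metadata dict; it travels with the queue entry, not with the message.
msgdata = {'gpg_check': 'good'}  # hypothetical key

# Stand-in for digest.mbox: only the message itself is written.
path = os.path.join(tempfile.mkdtemp(), 'digest.mbox')
box = mailbox.mbox(path)
box.add(msg)
box.flush()
box.close()

# Later, a separate process reads the mbox back.
stored = list(mailbox.mbox(path))[0]
header_value = stored['X-GPG-Check']   # still available
has_metadata = 'gpg_check' in stored   # nothing of msgdata was stored
```

Anything that is only in msgdata is gone by the time the mbox is read
back, while the header is part of the stored message.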

ArchRunner does have the metadata when processing a message for the
archive, but it doesn't pass it to the archiver.

But if you are using Scrubber.process to do the GPG work, it probably
won't do anything in the incoming pipeline when scrub_nondigest is No.
In that case the checks run only in the archiver and the digester, and
because those work with different copies of the message, they can't
communicate via message headers either.

I suggest you look at verifying signatures in Scrubber.process prior to
the point at which it returns if scrub_nondigest is No, or better
still, just add a custom handler between MimeDel and Scrubber in the
pipeline to verify signatures and set the result in a message header
that can be used by later processes. (See
<http://wiki.list.org/x/4030615>).
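
A rough sketch of such a handler follows, assuming the usual Mailman 2.1
handler signature process(mlist, msg, msgdata). The module name
GPGCheck, the X-GPG-Check header and the check_signature helper are all
hypothetical, and the helper is only a placeholder that does no real
GPG verification.

```python
# Hypothetical Mailman/Handlers/GPGCheck.py
#
# Assumed installation in mm_cfg.py, placing it just before Scrubber:
#   GLOBAL_PIPELINE.insert(GLOBAL_PIPELINE.index('Scrubber'), 'GPGCheck')
from email.message import Message

RESULT_HEADER = 'X-GPG-Check'   # hypothetical header name


def check_signature(msg):
    """Placeholder: replace with the real GPG verification.

    Here it only reports whether the message claims to be signed.
    """
    if msg.get_content_type() == 'multipart/signed':
        return 'unverified'
    return 'unsigned'


def process(mlist, msg, msgdata):
    # Drop any copy of the header supplied by the sender so the
    # result cannot be spoofed. Deleting a missing header is a no-op.
    del msg[RESULT_HEADER]
    # Record the result in the message itself so that it is carried
    # into both the archive copy and the digest.mbox copy.
    msg[RESULT_HEADER] = check_signature(msg)
```

Later processing can then read the header instead of verifying again.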

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

