[Mailman-Users] suppress duplicate when posting addressed to list and its alias name

Mark Sapiro mark at msapiro.net
Tue Nov 6 20:26:40 CET 2012


Sahil Tandon wrote:
>
>Thanks Mark, this seems like the ideal approach.  I'll try to hack
>something together borrowing from the various handlers (namely
>AvoidDuplicates.py) that are already in use.


Actually, AvoidDuplicates.py could serve as a good example, but it is
not currently used. It is experimental and is not included in
the default GLOBAL_PIPELINE.


>If I can understand how
>Mailman keeps the in-memory dictionary of Message-IDs mentioned in
>AvoidDuplicates.py, and implement an analogue for our use-case, that
>would do it.


The major problem with keeping these data in memory, other than purging
"old" entries so that the dictionary doesn't grow too large, is that
in-memory data aren't shared between runners, so if the incoming queue
is sliced, the multiple copies of IncomingRunner do not have access to
each other's data.

In your case, the input to the hash on which runners are sliced
includes all the message headers and the listname, so it is likely that
the "equivalent but different" listname messages will be in different
slices of the hash space.

This is not a concern if IncomingRunner is not sliced. It is also not a
concern with a disk-based cache, as long as buffers are flushed after
writing, because IncomingRunner locks the list whose message is being
processed, which should prevent race conditions between different
slices of IncomingRunner.
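
As a rough illustration of the disk-based variant, here is a minimal
sketch. It is not Mailman code: the function names, the JSON file format
and the cache path are all assumptions made for the example, and it
assumes the caller already holds the list lock as described above, so it
does no locking of its own. The point it shows is flushing and syncing
the file after each write so another runner process sees the update.

```python
import json
import os
import time


def load_cache(path):
    """Read the cache from disk; a missing or unreadable file means empty."""
    try:
        with open(path) as fp:
            return json.load(fp)
    except (OSError, ValueError):
        return {}


def save_cache(path, cache):
    """Write the cache, flushing buffers before the file is swapped in."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fp:
        json.dump(cache, fp)
        fp.flush()
        os.fsync(fp.fileno())
    # Atomic replace, so a reader never sees a half-written file.
    os.replace(tmp, path)


def seen_before(path, message_id, listname, now=None):
    """Return True if (message_id, listname) is already recorded.

    Otherwise record it with the time seen and return False.
    Hypothetical helper; the caller is assumed to hold the list lock.
    """
    if now is None:
        now = time.time()
    cache = load_cache(path)
    key = message_id + " " + listname
    if key in cache:
        return True
    cache[key] = now
    save_cache(path, cache)
    return False
```

Rereading the whole file on every message is wasteful for a busy list,
but it keeps the sketch short and avoids any state held in one runner's
memory.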


>The goal is to check whether a tuple of (message-id,
>listname) already exists in the dict and, if it does, raise
>Errors.DiscardMessage; otherwise, add the tuple to the dict and do
>nothing.


I would make a dictionary keyed on message-id + the canonical listname
with value = the time seen. Then I could just check if the key for the
current message exists and proceed as above, and I also have time
stamps so I can periodically remove old entries.

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


