[Mailman-Users] Mapping back from pipermail HTML to a specific message in the raw archive?

Mark Sapiro mark at msapiro.net
Thu May 21 04:38:49 CEST 2015


On 05/20/2015 06:56 PM, Skip Montanaro wrote:
> I have a list of spam HTML messages in pipermail archives. I need to clear
> out their content (no great problem there), but I also want to clean the
> corresponding messages in the raw mbox file (zap subject, message body,
> etc, but leave a placeholder message so future archive regeneration doesn't
> mess up article numbers). Looking at one of these messages (HTML source), I
> see nothing like a message id which would allow me to unambiguously
> identify the corresponding raw message. Does something exist? If not, what
> heuristics have people developed to perform this mapping?


In the HTML for each archived message, the poster's address is a mailto:
link that looks like:

> <A HREF="mailto:list%40example.com?Subject=Re%3A%20Actual_Subject&In-Reply-To=%3CCAKmAgbSRpqwRU1sR8ij36psvSUyrXMWv-AcEVp%3D1%2BCWRZHh4Rg%40mail.gmail.com%3E"
>        TITLE="Actual_Subject">poster at example.com
>        </A><BR>

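The In-Reply-To value in that link is the URL-encoded Message-ID of the
archived message itself. Decode it (%3C is <, %40 is @, %3E is >) and look
for the matching Message-ID: header in the raw mbox. For the example above,
the decoded value is
<CAKmAgbSRpqwRU1sR8ij36psvSUyrXMWv-AcEVp=1+CWRZHh4Rg@mail.gmail.com>.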
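
If you want to script the lookup, something along these lines might work. It
is only a rough sketch, not anything shipped with Mailman: the function names
message_id_from_html and find_in_mbox are made up for illustration, it assumes
Python 3 run as a standalone script, and it assumes the raw archive can be
parsed by the standard library's mailbox module (an mbox with unescaped
"From " lines in message bodies may split messages in the wrong places).

```python
import mailbox
import re
from urllib.parse import unquote

# Matches the URL-encoded In-Reply-To value in the pipermail mailto: link.
# The value ends at the closing quote, the next parameter, or whitespace.
_IN_REPLY_TO = re.compile(r'In-Reply-To=([^"&\s]+)', re.IGNORECASE)


def message_id_from_html(html):
    """Return the decoded Message-ID from a pipermail page, or None."""
    match = _IN_REPLY_TO.search(html)
    if match is None:
        return None
    # unquote() turns %3C, %40, %3E, ... back into <, @, >, ...
    return unquote(match.group(1))


def find_in_mbox(mbox_path, message_id):
    """Return the mbox keys of all messages whose Message-ID matches."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [
            key
            for key, msg in box.iteritems()
            if str(msg.get('Message-ID', '')).strip() == message_id
        ]
    finally:
        box.close()
```

You would read the HTML file, pass its text to message_id_from_html(), and
hand the result to find_in_mbox() along with the path to the list's mbox.
More than one key coming back means the Message-ID is duplicated in the
mbox, and you would have to look at those messages by hand.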

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

