[Mailman-Users] UTF-8 From and Reply-to addresses not getting properly processed.

Lindsay Haisley fmouse at fmp.com
Sun Feb 16 02:20:47 EST 2020

On Sat, 2020-02-15 at 20:00 -0800, Mark Sapiro wrote:
> On 2/15/20 5:58 PM, Lindsay Haisley wrote:
> > We're running Mailman 2.1.18-1 and have a list which is having a porn
> > spam problem. The list is set to discard posts from non-members, and
> > the list moderator has set various filters to try to filter on words
> > which contain "f***", as many do, however the Subject, From and Reply-
> > to addresses are all UTF-8 strings, and are apparently confusing
> > Mailman's decision-making functions, and these posts are ending up in
> > the administrative requests list.  Here's a sample set of headers:
> Exactly what filters are used?

The only filter relevant to this issue is "(?i)Subject: .*[fuck]". It
apparently isn't working, or the syntax isn't proper (although the re
syntax looks OK to me; I didn't put it there, the list admin/moderator
did).
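
For what it's worth, here is a minimal sketch of how Python's re module
reads that pattern, assuming the rule is applied with search semantics to
a single header line. The sample subject and the bracket-free variant are
made up for illustration. The square brackets form a character class, so
the rule as entered matches any Subject containing a single f, u, c or k
rather than the whole word:

```python
import re

# Pattern exactly as entered in the list's filter rules.
as_entered = re.compile(r"(?i)Subject: .*[fuck]")

# Hypothetical variant without the brackets: the letters must appear in sequence.
literal = re.compile(r"(?i)Subject: .*fuck")


def matches(pattern, header_line):
    """Return True if the compiled pattern is found anywhere in the line."""
    return pattern.search(header_line) is not None


# Made-up, harmless subject line that happens to contain 'f' and 'u'.
harmless = "Subject: Meeting notes for Tuesday"

print(matches(as_entered, harmless))  # character class: any one of f, u, c, k
print(matches(literal, harmless))     # literal word: must appear in full
```

If that reading is right, the rule as written is far broader than
intended, not narrower.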

> header_filter_rules will RFC 2047 decode the headers.

We don't know what to put into the rules for From and Reply-to since
these are encoded in the message detail, as they are in the displayed
headers. And even if the from headers could be put here, it is, as I
said, a game of whack-a-mole.

The held message page section header is no help, i.e.

                              Held Messages


> mm_cfg.KNOWN_SPAMMERS and bounce_matching_headers do not, but since
> bounce_matching_headers only holds the message, I'm guessing you aren't
> using that, and since list owners can't set mm_cfg.KNOWN_SPAMMERS, I'm
> guessing you aren't using that either.

I run the system, and have access to mm_cfg, so I can put what's
necessary there, but I would assume one would have to match decoded
From headers, and all these headers in the held posts are encoded.
They're mostly different from message to message, so in any event
blocking by From header, either encoded or decoded, is as I said an
exercise in whack-a-mole.

> > MM is properly decoding the Subject in the message detail headers, but
> > not the From address.
> > 
> > Is there any way to get Mailman to properly handle these?
> If the only issue is the From: or other sender header, Mailman doesn't
> RFC 2047 decode those in trying to determine if the sender is a member,
> but what's the issue? If you are trying to match a specific address in
> discard_these_nonmembers, I see the problem, but you can discard them by
> setting generic_nonmember_action to discard.

This _is_ how it's set.

> If you only want to discard non-member posts with RFC 2047 encoded
> From:, you could put something like
> ^[^@]+@[a-z0-9_.]+$

> in hold_these_nonmembers to hold the ones that at least don't have
> base64 encoded From:

The list manager has set generic_nonmember_action to Discard, which
should be the last word, _unless_ the _decoded_ from address shows up
in some other place such that the message is held for approval rather
than discarded outright, which is the desired action.
generic_nonmember_action only comes into play if "no explicit action
is defined", so perhaps there's a match somewhere, but again, the From
headers are encoded in the held message detail, so it's hard to tell.
The offending poster may even have joined the list (the list moderator
has default_member_moderation turned on.) 

Is there a function or class method in the Python code which can be used
to decode these headers? As you may recall, I'm somewhat Python
literate - actually a minor contributor to the MM 2 code base :)
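
To make the question concrete, this is roughly what I have in mind: a
minimal sketch using only the standard library's email package, not
anything from the Mailman code itself. The helper name decode_sender and
the sample header value are made up for illustration. It is written for
Python 3; Mailman 2.1 runs on Python 2, where I believe unicode() would
take the place of str().

```python
from email.header import decode_header, make_header
from email.utils import parseaddr


def decode_sender(raw_value):
    """Split a From/Reply-To value and RFC 2047 decode its display name.

    Returns a (display_name, address) tuple. The address part is returned
    as parsed, without further decoding.
    """
    name, addr = parseaddr(raw_value)
    # decode_header() yields (bytes-or-str, charset) chunks;
    # make_header() reassembles them into a Header object.
    decoded_name = str(make_header(decode_header(name)))
    return decoded_name, addr


# Made-up sample: base64 encoded UTF-8 display name plus a plain address.
sample = "=?UTF-8?B?Q2Fmw6k=?= <user@example.com>"
print(decode_sender(sample))
```

Something along these lines, run over the held messages, would at least
show which decoded sender each one carries.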

I also looked at bounce_matching_headers. The explanation and name on
this setting is ambiguous since the name implies a full bounce, but the
explanation says that posts are "held" [for moderation].
Lindsay Haisley       |  "The arc of history is long, but
FMP Computer Services |     it bends toward Justice"
512-259-1190          |
http://www.fmp.com    |        - Barack Obama
