[Mailman-Users] diacritics in from header with from_is_list set to munge

Mark Sapiro mark at msapiro.net
Thu Jan 21 12:59:40 EST 2016


On 01/21/2016 08:06 AM, Stephen J. Turnbull wrote:
> gabriel writes:
> 
>  > so the message of users getting bounced look like (abbreviated):
> 
>  > This is a delivery status notification from some.server.org,
>  > running the Courier mail server, version 0.75.0.
> 
> FYI, bounce messages may or may not be useful, as some bounce programs
> do mess with the mail they forward.  I know you probably can't do
> anything about this, this is the best you can do.


Agreed. I'm not interested in the bounce at all.


>  > From: =?utf-8?q?Val=C3=A9rie/Something_via_mylist_=3Cmyli?=,
>  >       =?utf-8?b?ZW5AbGlzdHMubXRtZWRpYS5vcmc+?=


This is an absolute, non-compliant mess.

The first encoded word, if I ignore the comma which is non-compliant,
decodes to "Valérie/Something via mylist <myli" and the second encoded
word decodes to "st at lists.xxxxx.org>". Thus, if I put them together, I
get "Valérie/Something via mylist <mylist at lists.xxxxx.org>"

(I've replaced the actual last bit of the list name and part of the
domain with st at lists.xxxxx since you seem to not want to reveal them,
even though you already have, as anyone can decode the RFC2047 encoding.)
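
For anyone who wants to check that decoding, here is a minimal Python
sketch. The helper name decode_encoded_word is made up for
illustration. The first encoded word is copied from the bounce; the
second is a stand-in built from the redacted address, not the one in the
actual message. The archive shows "@" as " at "; the code uses the real
character.

```python
import base64
import quopri
import re

# One RFC 2047 encoded-word: =?charset?encoding?payload?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def decode_encoded_word(word):
    """Decode a single encoded-word to text (hypothetical helper)."""
    match = ENCODED_WORD.fullmatch(word)
    if match is None:
        raise ValueError("not an encoded-word: %r" % word)
    charset, encoding, payload = match.groups()
    if encoding.lower() == "b":
        raw = base64.b64decode(payload)
    else:
        # header=True also maps "_" to a space, as Q encoding requires.
        raw = quopri.decodestring(payload.encode("ascii"), header=True)
    return raw.decode(charset)


# First encoded word, copied from the bounced From: header.
first = "=?utf-8?q?Val=C3=A9rie/Something_via_mylist_=3Cmyli?="
# Stand-in for the second one, built from the redacted address.
tail = base64.b64encode(b"st@lists.xxxxx.org>").decode("ascii")
second = "=?utf-8?b?%s?=" % tail

# Concatenating the two decoded pieces gives the intended From: value.
decoded = decode_encoded_word(first) + decode_encoded_word(second)
```

Note that the "<" and the address itself only show up after decoding,
which is exactly the problem.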

The comma at the end of the first line is wrong because of RFC2047, sec
5(1):

    Ordinary ASCII text and 'encoded-word's may appear together in the
    same header field.  However, an 'encoded-word' that appears in a
    header field defined as '*text' MUST be separated from any adjacent
    'encoded-word' or 'text' by 'linear-white-space'.
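
A rough way to spot this kind of violation mechanically is sketched
below in Python. It is a simplification, not a full RFC2047 parser: it
only flags an encoded word that directly touches a non-whitespace
character, and it ignores the special case of encoded words inside
comments. The function name badly_delimited is made up, and the second
encoded word in the sample is a short stand-in rather than the real one.

```python
import re

# Matches one encoded-word: =?charset?encoding?payload?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def badly_delimited(header_value):
    """Return the encoded-words that touch a non-whitespace neighbour."""
    bad = []
    for match in ENCODED_WORD.finditer(header_value):
        # Character just before and just after the encoded-word, if any.
        before = header_value[max(match.start() - 1, 0):match.start()]
        after = header_value[match.end():match.end() + 1]
        if (before and not before.isspace()) or (after and not after.isspace()):
            bad.append(match.group(0))
    return bad


# First word as seen in the bounce, including the trailing comma.
header = (
    "=?utf-8?q?Val=C3=A9rie/Something_via_mylist_=3Cmyli?=,\n"
    "      =?utf-8?b?c3Q=?="
)
flagged = badly_delimited(header)
```

Here flagged contains only the first encoded word, because of the comma
that follows it.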

More importantly, RFC2047, sec 5(3) says in part:

   + An 'encoded-word' MUST NOT appear in any portion of an 'addr-spec'.
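
For comparison, this is roughly what a compliant From: value looks like
when generated with Python's standard email.utils.formataddr, which
encodes only the display name and leaves the address alone. This is
just an illustration, not a claim about what Mailman or the MTA does
internally, and the address is a hypothetical example.org one.

```python
from email.utils import formataddr

display_name = "Valérie/Something via mylist"
address = "mylist@lists.example.org"  # hypothetical address

# formataddr RFC 2047-encodes a non-ASCII display name and appends the
# address, unencoded, inside angle brackets.
from_value = formataddr((display_name, address))

# Split off the part after the last " <" to inspect the address alone.
name_part, _, addr_part = from_value.rpartition(" <")
```

The name_part is a single encoded word and addr_part is the plain
address followed by ">", with no encoded word anywhere in it.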

>  > Sender: "mylist" <mylist-bounces at lists.mydomain.org>
> 
> So this has already been through Mailman.  We really really need to
> see the mail as it was *before* Mailman handled it (possibly in the
> mbox file in the archive, if you have it).
>
> And then you've redacted stuff, and that may matter.  If you don't
> want to send unredacted headers to a list with public archives, we
> understand, but in that case you can and should send them to Mark (and
> possibly me, but Mark is the real expert if you really want to send it
> to the fewest people) privately.


What I would like to see, unmunged, sent directly to me off list if you
don't want to post it, is
1) The complete, raw headers from the message as received from the list, and
2) Either the complete raw headers of the message from the archive
listname.mbox/listname.mbox file[1] or, if that's not possible, from the
archive "Downloadable version" .txt (or .txt.gz) file.


> I don't think this is a Mailman bug.  Mailman would not choose to send
> using two different transfer encodings (Q in the first line, B in the
> second).  So I suspect Mailman is just forwarding the garbage it
> receives, or something downstream of Mailman is doing it.


I'm certain Mailman did not create that encoded header. I suspect the
outgoing MTA. This might in fact be precipitated by a Mailman bug, namely
the one I noted earlier in this thread: the header created by Mailman
can contain a non-ASCII character. That might be what triggers the
outgoing MTA to arbitrarily encode the header without actually parsing
it and encoding it correctly, but I'll know more after I see what I've
asked for.

[1] You can get the listname.mbox/listname.mbox file via the web UI.
There may be a link on the archive table of contents page, but usually
there isn't. If there isn't a link, go to the private archive URL (even
if the archive is public) - something like
http://www.example.com/mailman/private/listname - and log in. Then
retrieve http://www.example.com/mailman/private/listname.mbox/listname.mbox

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

