Mailman 3 gmail marks mailman confirmation mail as spam... - Mailman-Users

gmail marks mailman confirmation mail as spam...

Kārlis Repsons

June 8, 2009

8:41 p.m.

Hi, do you perhaps have a recipe for making gmail treat confirmation mails as non-spam? It just throws the "confirm e8492f19d7c336341050.." mail into spam.

Kārlis Repsons


Mark Sapiro

June 2009

2:45 a.m.

Kārlis Repsons wrote:

...

Confirmations are sent with

Precedence: bulk

which may be part of the problem, but I just tested a confirmation to a gmail.com address and it went to the inbox. As far as I know, I have no special spam whitelisting in effect on this gmail account.

The one thing that might be different is the server I sent this from has

VERP_CONFIRMATIONS = Yes

in mm_cfg.py, which changes the subject from

confirm 6e4cfe0ab337729574b1a643a231569ef0ef59ab

to

Your confirmation is required to join the LISTNAME mailing list

and the From: from

LISTNAME-request@example.com

to

LISTNAME-confirm+6e4cfe0ab337729574b1a643a231569ef0ef59ab@example.com

So you might try setting VERP_CONFIRMATIONS = Yes if your MTA can properly deliver to an address such as above. That may help.
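As a rough illustration of the difference described above, the sketch below builds both header variants from a confirmation token. The function name `confirmation_headers` and its parameters are made up for this example and are not part of Mailman; only the address and subject formats are taken from the message above.

```python
def confirmation_headers(listname, domain, token, verp=False):
    """Return (from_address, subject) for a subscription confirmation.

    Hypothetical helper: it only mirrors the formats quoted in the thread.
    """
    if verp:
        # With VERP_CONFIRMATIONS = Yes the token moves into the sender
        # address and the subject becomes a readable sentence.
        sender = f"{listname}-confirm+{token}@{domain}"
        subject = f"Your confirmation is required to join the {listname} mailing list"
    else:
        # Default behaviour: the token is carried in the subject line.
        sender = f"{listname}-request@{domain}"
        subject = f"confirm {token}"
    return sender, subject


token = "6e4cfe0ab337729574b1a643a231569ef0ef59ab"
for verp in (False, True):
    print(confirmation_headers("LISTNAME", "example.com", token, verp=verp))
```

The VERP form only works if the MTA delivers "+"-extended addresses to Mailman, which is why the advice above is conditional on that.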

-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California. The highway is for gamblers, better use your sense - B. Dylan

Kārlis Repsons

6:48 p.m.

On Friday 12 June 2009 19:45:25 you wrote:

...

And there is one more thing bothering me. Look at http://www.trikata.com/pipermail/test/2009-June/thread.html: the same word "nogādāt" was posted in every case where those terrible characters appear! Maybe you know what's wrong?

Kārlis Repsons

Mark Sapiro

10:12 p.m.

Kārlis Repsons wrote:

...

There are various things that could be different. My server publishes SPF records. That may make a difference. My server may have a better reputation with Google/gmail than yours. See the FAQ at <http://wiki.list.org/x/4oA9>. This is something you'll have to pursue with Google/gmail.

...

The string "=?utf-8?q?_nog=C4=81d=C4=81t?=" is an RFC 2047 encoding of the string " nogādāt".

The actual raw header in your archive (at least for one of these) contains three RFC 2047 encoded pieces. The first is "=?utf-8?q?skatamies=2C_cik_ilgi_google_m=C4=93=C4=A3ina_=3D?=" and decodes to "skatamies, cik ilgi google mēģina =". The second is "=?utf-8?b?P3V0Zi04P3E/X25vZz1DND04MWQ9QzQ9ODF0Pz0sIGthZCBuZXZhci4u?=" and decodes to "?utf-8?q?_nog=C4=81d=C4=81t?=, kad nevar..". The last is "=?utf-8?q?=2E?=" and decodes to "."
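The three pieces can be checked individually with Python's standard `email.header.decode_header()`. This is a minimal sketch for Python 3; the helper name `decode_word` is an assumption made for the example, and the encoded strings are copied from the paragraph above.

```python
from email.header import decode_header


def decode_word(word):
    """Decode one RFC 2047 encoded word (or plain text) to a str."""
    chunks = []
    for data, charset in decode_header(word):
        # decode_header() returns bytes for encoded input and may return
        # str for input that contains no encoded word at all.
        if isinstance(data, bytes):
            data = data.decode(charset or "ascii")
        chunks.append(data)
    return "".join(chunks)


pieces = [
    "=?utf-8?q?skatamies=2C_cik_ilgi_google_m=C4=93=C4=A3ina_=3D?=",
    "=?utf-8?b?P3V0Zi04P3E/X25vZz1DND04MWQ9QzQ9ODF0Pz0sIGthZCBuZXZhci4u?=",
    "=?utf-8?q?=2E?=",
]
for piece in pieces:
    print(repr(decode_word(piece)))
```

The second piece is the telling one: its base64 payload decodes to text that itself looks like an encoded word with the leading "=" missing.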

If I had to guess, I'd say that the original subject got mis-folded by something and the initial "=" of "=?utf-8?q?_nog=C4=81d=C4=81t?=" got separated from the rest by a line continuation, and then the remaining "?utf-8?q?_nog=C4=81d=C4=81t?=" was treated as text rather than an encoded string.

The problem may be with the MUA that composed the mail or it may be with Mailman's adding the subject_prefix. I think I'd need to see the raw message as sent to the list to know for sure.


Mark Sapiro

12:12 a.m.

New subject: Garbled headers - was: gmail marks mailman confirmation mail as spam...

Mark Sapiro wrote:

...

Kārlis forwarded an email to me off list. Its salient feature is the subject header

Subject: skatamies, cik ilgi google =?utf-8?q?m=C4=93=C4=A3ina?= =?utf-8?q?_nog=C4=81d=C4=81t?=, kad nevar...

which is wrapped here but was all one line in the original. I have verified that there is a problem in the underlying Python email package with headers containing multiple RFC 2047 encoded words whether or not they are separated by non-encoded text.

It appears that only the first encoded word is properly decoded, resulting in garbled headers in the archive and digests, and in messages too if the subject is prefixed. I will follow up when I know more.


Kārlis Repsons

12:18 a.m.

New subject: Garbled headers - was: gmail marks mailman confirmation mail as spam...

On Sunday 14 June 2009 17:12:22 you wrote:

...

-- Kārlis Repsons

Mark Sapiro

12:39 a.m.

New subject: Garbled headers - was: gmail marks mailman confirmation mail as spam...

Kārlis Repsons wrote:

...

On Sunday 14 June 2009 17:12:22 you wrote:

...

Actually, the problem is not multiple encoded words. It is the fact that Python's email.header.decode_header() function doesn't recognize an RFC 2047 encoded word as such if the trailing "?=" is not followed by whitespace or the end of the string - here it is followed by a ",".

I think this is a bug in decode_header(), but I won't have time to look further at this until tomorrow.


Mark Sapiro

9:36 a.m.

New subject: Garbled headers - was: gmail marks mailman confirmation mail as spam...

Mark Sapiro wrote:

...

I think there is a minor bug in decode_header() in that it won't recognize an RFC 2047 encoded word in a comment if the encoded word is not separated by whitespace from the ")" that terminates the comment. However, this is the only place where an encoded word need not be followed by whitespace or the end of the header.

The Subject: header above is non-compliant in two respects. It is too long. RFC 2047 section 2 says in part:

While there is no limit to the length of a multiple-line header field, each line of a header field that contains one or more 'encoded-word's is limited to 76 characters.

However, decode_header will accept it anyway and do the right thing. The real problem is that item (1) in section 5 of the RFC says in part:

Ordinary ASCII text and 'encoded-word's may appear together in the
same header field.  However, an 'encoded-word' that appears in a
header field defined as '*text' MUST be separated from any adjacent
'encoded-word' or 'text' by 'linear-white-space'.

The header above does not comply with this. Instead of being

Subject: skatamies, cik ilgi google =?utf-8?q?m=C4=93=C4=A3ina?= =?utf-8?q?_nog=C4=81d=C4=81t?=, kad nevar...

(all on one line), it should be

Subject: skatamies, cik ilgi google =?utf-8?q?m=C4=93=C4=A3ina?= =?utf-8?q?_nog=C4=81d=C4=81t,?= kad nevar...

I.e., it should be folded so no part is longer than 76 characters, but more importantly for this, the "," near the end should be part of the encoded word rather than following the "?=" with no intervening whitespace.
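For comparison, here is a sketch of how a sending program could produce a compliant Subject: with Python's standard `email.header.Header` class, which does the encoding and folding itself instead of leaving punctuation stuck to the end of an encoded word. The exact split points and the choice between "q" and "b" encoding are up to the library, so the example checks only that the result round-trips; the variable names are illustrative.

```python
from email.header import Header, decode_header

subject = "skatamies, cik ilgi google mēģina nogādāt, kad nevar..."

# Let the library encode and fold the whole value. header_name tells it
# how much room the "Subject: " prefix takes on the first line.
encoded = Header(subject, charset="utf-8", header_name="Subject").encode()
print("Subject: " + encoded)

# Round trip: decode what was just produced and compare with the input.
decoded = "".join(
    chunk.decode(charset or "ascii") if isinstance(chunk, bytes) else chunk
    for chunk, charset in decode_header(encoded)
)
print(decoded == subject)
```

Encoding the entire value, rather than only the non-ASCII words, sidesteps the question of what may legally sit next to an encoded word.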

This is a problem with the MUA (mail client) that encoded the Subject: header in the first place.


Stephen J. Turnbull

10:47 a.m.

New subject: Garbled headers - was: gmail marks mailman confirmation mail as spam...

Mark Sapiro writes:

...

I think there is a minor bug in decode_header() in that it won't recognize a RFC 2047 encoded word in a comment if the encoded word is not separated by whitespace from the ")" that terminates the comment. However, this is the only place where an encoded word need not be followed by whitespace or the end of the header.

Indeed that's a bug. I gather that you're saying that this bug is not the cause of the OP's problem, though?

...

The Subject: header above is non-compliant in two respects. It is too long. [...] However, decode_header will accept it anyway and do the right thing.

As it should, according to the Postel Principle. Anyway, IIRC the length limit is a SHOULD NOT, not a MUST NOT, right?

...

real problem is item (1) in section 5 of the RFC says in part:

Ordinary ASCII text and 'encoded-word's may appear together in the
same header field.  However, an 'encoded-word' that appears in a
header field defined as '*text' MUST be separated from any adjacent
'encoded-word' or 'text' by 'linear-white-space'.

The header above does not comply with this.

Agreed, but I think that by default[1] email should try to parse this header as the user intended it. It's not like encoded-words are that easy to confuse with intended text; it's unlikely that changing 'linear-white-space' above to 'linear-white-space or specials' would harm anyone.

...

This is a problem with the MUA (mail client) that encoded the Subject: header in the first place.

Agreed, but I think following the Postel Principle here is likely to do less harm than adhering strictly to the RFC.

That said, I'm not in a position to contribute code, and this is a pretty invasive change, so the user is unlikely to see a version of Mailman that handles this any time soon. They are likely to have more luck switching clients.

Footnotes: [1] Ie, there should be an option to be strict.

Mark Sapiro

12:17 a.m.

New subject: Garbled headers - was: gmail marks mailman confirmation mail as spam...

I am trying to move this thread to email-sig@python.org since the underlying issue is in the email package. Further, since as of Mailman 2.1.12, we no longer install a Mailman specific version of the email package, it really has to be addressed in the email package.

Stephen J. Turnbull wrote:

...

Correct.

...

The RFC (8|28|53)22 limits are MUST BE <= 998 and SHOULD BE <= 78. RFC 2047 seems to want to impose stricter limits on encoded words, but unfortunately does not use the defined terms MUST and SHOULD. Section 2 says in part:

An 'encoded-word' may not be more than 75 characters long, including 'charset', 'encoding', 'encoded-text', and delimiters. If it is desirable to encode more text than will fit in an 'encoded-word' of 75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may be used.

While there is no limit to the length of a multiple-line header field, each line of a header field that contains one or more 'encoded-word's is limited to 76 characters.

so it is not clear whether these are 'recommendations' or 'requirements'. In any case, email.header.decode_header() is not enforcing any limits so we are being generous in what we accept in this respect.
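To make the two RFC 2047 figures concrete, the following sketch reports which of them a single header line exceeds. It is not something decode_header() does (as noted, it enforces no limits); the regular expression is a deliberately loose approximation of an encoded-word, and the names `ENCODED_WORD` and `length_problems` are invented for the example.

```python
import re

# Loose approximation of an encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")

MAX_ENCODED_WORD = 75  # RFC 2047 section 2: limit per encoded-word
MAX_ENCODED_LINE = 76  # RFC 2047 section 2: limit per line holding one


def length_problems(header_line):
    """Return a list of length complaints for one physical header line."""
    problems = []
    words = ENCODED_WORD.findall(header_line)
    for word in words:
        if len(word) > MAX_ENCODED_WORD:
            problems.append(
                f"encoded-word is {len(word)} characters (limit {MAX_ENCODED_WORD})"
            )
    # The 76-character figure applies only to lines with an encoded-word.
    if words and len(header_line) > MAX_ENCODED_LINE:
        problems.append(
            f"line is {len(header_line)} characters (limit {MAX_ENCODED_LINE})"
        )
    return problems


header = ("Subject: skatamies, cik ilgi google =?utf-8?q?m=C4=93=C4=A3ina?= "
          "=?utf-8?q?_nog=C4=81d=C4=81t?=, kad nevar...")
print(length_problems(header))
```

For the Subject: header discussed in this thread, both encoded words are short, so only the per-line figure is exceeded.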

...

I fully agree. There is a regexp (ecre) in email/header.py that ends with the lookahead assertion "(?=[ \t]|$)". Even in "strict mode", I think the lookahead needs to accept ")" as well as space and tab, but I think by default, it should just be removed.
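The effect of that lookahead can be reproduced with the `re` module alone. The pattern below is a simplified stand-in written for this illustration, not a copy of the one in email/header.py; it compares the strict lookahead, a variant that also accepts ")", and no lookahead at all.

```python
import re

# Simplified encoded-word pattern (an assumption for illustration only).
CORE = r"=\?(?P<charset>[^?]*?)\?(?P<encoding>[qb])\?(?P<encoded>.*?)\?="

# Strict: the word must be followed by space, tab or end of line.
strict = re.compile(CORE + r"(?=[ \t]|$)", re.IGNORECASE | re.MULTILINE)
# Strict, but also allowing the ")" that closes a comment.
comment_aware = re.compile(CORE + r"(?=[ \t)]|$)", re.IGNORECASE | re.MULTILINE)
# Lenient: no lookahead at all.
lenient = re.compile(CORE, re.IGNORECASE)


def encoded_words(pattern, header):
    """Return every substring of header that pattern accepts as an encoded word."""
    return [match.group(0) for match in pattern.finditer(header)]


header = ("skatamies, cik ilgi google =?utf-8?q?m=C4=93=C4=A3ina?= "
          "=?utf-8?q?_nog=C4=81d=C4=81t?=, kad nevar...")
comment = "(=?utf-8?q?nog=C4=81d=C4=81t?=)"

print(encoded_words(strict, header))         # misses the word followed by ","
print(encoded_words(lenient, header))        # finds both words
print(encoded_words(strict, comment))        # misses the word before ")"
print(encoded_words(comment_aware, comment)) # finds it
```

Dropping the lookahead trades strictness for tolerance of headers like the one in this thread, which is the default behaviour argued for above.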

...

I agree here too, and note that some MUAs (all three I tried including mutt and Thunderbird) decode the original header as intended.

...

