[Mailman-Developers] Ham, mailing lists, and oddball character sets

Philip A. Prindeville philipp_subx at redfish-solutions.com
Tue May 4 20:32:42 CEST 2010


Hi.

I work a bit in OSS and contribute to MIMEDefang and SpamAssassin, and
am myself on about 70 mailing lists.

I notice that some of these lists that are open to outside mailings are
magnets for spam.

As a consequence, the end-recipients of these lists can't "trust" mail
coming from these lists, and if we filter it (and reject it), we run the
risk of being auto-unsubscribed for too many delivery failures...

It's a thorny issue.

In a perfect world, all MUAs would comply with the following
recommendation:

RFC 2046, last paragraph of section 4.1.2:

   In general, composition software should always use the "lowest common
   denominator" character set possible.  For example, if a body contains
   only US-ASCII characters, it SHOULD be marked as being in the US-
   ASCII character set, not ISO-8859-1, which, like all the ISO-8859
   family of character sets, is a superset of US-ASCII.  More generally,
   if a widely-used character set is a subset of another character set,
   and a body contains only characters in the widely-used subset, it
   should be labelled as being in that subset.  This will increase the
   chances that the recipient will be able to view the resulting entity
   correctly.



If they did, it would be trivial to bounce any message sent to an
English-language-only mailing list whose declared charset wasn't
US-ASCII or Latin-1 (ISO-8859-1).
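
As a rough illustration of that check, here is a minimal Python sketch
using the standard library's email package. The names ALLOWED,
declared_charsets, and should_bounce are made up for this example and
are not part of Mailman. It only looks at the declared charset of each
text/plain part and does not try to recognize aliases such as "latin1".

```python
from email import message_from_string

# Charsets an English-only list would accept (assumption for this sketch).
ALLOWED = {"us-ascii", "iso-8859-1"}


def declared_charsets(msg):
    """Return the set of charsets declared on the text/plain parts."""
    charsets = set()
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            # RFC 2046: a missing charset parameter means us-ascii.
            charsets.add(part.get_content_charset() or "us-ascii")
    return charsets


def should_bounce(msg):
    """True if any text/plain part declares a charset outside ALLOWED."""
    return not declared_charsets(msg) <= ALLOWED


if __name__ == "__main__":
    sample = message_from_string(
        "Content-Type: text/plain; charset=utf-8\n\nhello\n"
    )
    print(should_bounce(sample))
```

The sample message contains nothing but ASCII, yet it would still be
bounced because of its label, which is exactly why this check is only
safe when senders follow the RFC's recommendation.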

But alas, they don't.

So end users' mail systems end up having to do this filtering, which
creates all sorts of backscatter to the mailing list, etc.

What if mailing list exploders did the following?

When you receive a message that has text/plain parts that aren't
"charset=us-ascii" or "charset=iso-8859-1", attempt to transcode the
parts into one of these (in that order, until success).

If the transcoding fails, reject the message.

Otherwise, substitute the rewritten parts in the forwarded message.
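
To make the three steps above concrete, here is a minimal sketch in
Python using the standard library's email package. It is not Mailman
code: TARGETS, Untranscodable, and downgrade_text_parts are hypothetical
names chosen for this example, and a real handler would need to decide
how rejection is reported to the sender.

```python
from email import message_from_bytes

# Target charsets, tried in this order (as proposed above).
TARGETS = ("us-ascii", "iso-8859-1")


class Untranscodable(Exception):
    """Raised when a text/plain part fits none of the TARGETS."""


def downgrade_text_parts(msg):
    """Relabel text/plain parts as us-ascii or iso-8859-1 where possible.

    Modifies msg in place and returns it. Raises Untranscodable if any
    part cannot be transcoded; the caller should then reject the message.
    """
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        declared = part.get_content_charset() or "us-ascii"
        if declared in TARGETS:
            continue  # already labelled acceptably; left untouched
        raw = part.get_payload(decode=True)  # undo base64 / quoted-printable
        try:
            text = raw.decode(declared)
        except (LookupError, UnicodeDecodeError) as exc:
            raise Untranscodable("cannot decode %r part" % declared) from exc
        for target in TARGETS:
            try:
                text.encode(target)
            except UnicodeEncodeError:
                continue  # does not fit; try the next target
            # Drop the old transfer encoding so set_payload picks a new one.
            del part["Content-Transfer-Encoding"]
            part.set_payload(text, charset=target)
            break
        else:
            raise Untranscodable("part fits neither us-ascii nor iso-8859-1")
    return msg


if __name__ == "__main__":
    raw = (
        "Content-Type: text/plain; charset=utf-8\n"
        "Content-Transfer-Encoding: 8bit\n\ncaf\u00e9\n"
    ).encode("utf-8")
    print(downgrade_text_parts(message_from_bytes(raw)).get_content_charset())
```

Note that an exception may leave earlier parts already rewritten, which
is harmless only because the message is being rejected anyway.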

Yes, I know that it's not a good thing to rewrite messages...  but most
mailing lists do a fair amount of message munging anyway (to the point
that PGP becomes useless, for instance).

What do you all think?

It doesn't have to be the default behavior...  But it would definitely
be handy to have as an option.

Thanks,

-Philip


