[Mailman-Users] A scrubber issue

Mark Sapiro msapiro at value.net
Sun Dec 10 21:30:15 CET 2006


Todd Zullinger wrote:
>
>I may be completely missing something then, but wouldn't a text
>attachment be just another MIME part?  Wouldn't that MIME part follow
>RFC 2045?  So section 5.2 of the RFC would apply to those parts and
>lacking a content-type header (header as used in the RFC applies to
>the main mail header and those used in body parts, AIUI) are to be
>assumed as text/plain; charset=us-ascii.


I think this is correct, but...


>It may be that there are many MUA's that don't do this correctly and
>thus Mailman can't afford to make the "correct" assumption according
>to the RFC, but those sending MUA's would be doing the wrong thing as
>far as I can see, no?


Yes they would, but it's not that simple. For example, the MUA I'm
using to compose this message allows me to attach files. If I attach a
file named example.txt to a message, this MUA (notable for its
simplicity and small footprint) will base64 encode the file and add
the part headers

Content-Type: text/plain;
	name="example.txt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
	filename="example.txt"

Note that it doesn't specify a charset= parameter on the Content-Type:
header, and it really can't. Without asking me, the MUA doesn't know
the character set of the data in the file, and I may not know how to
answer if it does ask me. In fact, it's making an assumption to call
it text/plain just because its name ends in '.txt'. Now you might
think this is just one obscure MUA, but Thunderbird does essentially
the same thing:

Content-Type: text/plain;
 name="example.txt"
Content-Transfer-Encoding: base64
Content-Disposition: inline;
 filename="example.txt"

as does Microsoft Outlook Express:

Content-Type: text/plain;
	format=flowed;
	name="example.txt";
	reply-type=original
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="example.txt"


So there are mainstream MUAs that don't specify the charset of a
text/plain attachment because they don't know what it is. Given this,
Mailman does the safe thing, which is to save the contents of the part
as is rather than risk mangling it by trying to coerce it to
us-ascii.
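
To make the distinction concrete, here is a minimal sketch using the
standard library email package in Python 3. It is not Mailman's
scrubber code; the sample message, the boundary string and the
describe_parts() helper are all made up for illustration. It shows
that a part like the ones above reports no charset at all, so the
only lossless option is to keep the decoded bytes untouched.

```python
from email import message_from_string

# Hypothetical two-part message. The second part mimics the attachment
# headers shown above: text/plain with a name= but no charset= parameter.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/plain; charset="us-ascii"

Body text.
--BOUNDARY
Content-Type: text/plain;
\tname="example.txt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
\tfilename="example.txt"

aGVsbG8K
--BOUNDARY--
"""


def describe_parts(raw):
    """Return (filename, declared charset, decoded bytes) per leaf part."""
    msg = message_from_string(raw)
    results = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the multipart container itself
        # None when the Content-Type header carries no charset= parameter.
        charset = part.get_content_charset()
        # decode=True undoes base64/quoted-printable and returns raw bytes;
        # no character set conversion is attempted.
        data = part.get_payload(decode=True)
        results.append((part.get_filename(), charset, data))
    return results


for filename, charset, data in describe_parts(RAW):
    print(filename, charset, data)
```

The attachment comes back with charset None, and the bytes can be
written to disk exactly as received without guessing an encoding.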


>I plan on contacting the Gnus folks and mentioning that adding a
>content-type header would help to avoid any confusion, but I want to
>be prepared to respond half-intelligently if someone points out that
>the RFC says without one it should be assumed as text/plain and
>us-ascii.


What they really need to do is, as Tokio suggests, fix their mm_cfg.py
setting for PUBLIC_ARCHIVE_URL. It appears that what they have is

PUBLIC_ARCHIVE_URL = '/pipermail/'

The Defaults.py setting, which normally doesn't need to be changed, is

PUBLIC_ARCHIVE_URL = 'http://%(hostname)s/pipermail/%(listname)s'

This would make the 'URL' of scrubbed attachments be a working,
clickable link in the archives (at least for new posts). If they
really want it to be a relative URL, they should at least have

PUBLIC_ARCHIVE_URL = '/pipermail/%(listname)s'

so the list name is included, but then it won't be clickable in the
archive.
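
As a rough illustration of why the list name goes missing, the sketch
below expands the three settings with ordinary Python dictionary-based
% formatting. The hostname and list name are hypothetical values, and
expand() is only a stand-in for whatever Mailman does internally when
it fills in the template.

```python
# Hypothetical values; in a real installation they come from the list itself.
values = {'hostname': 'lists.example.org', 'listname': 'example-list'}

settings = {
    'current': '/pipermail/',
    'default': 'http://%(hostname)s/pipermail/%(listname)s',
    'relative': '/pipermail/%(listname)s',
}


def expand(template, values):
    """Fill in %(name)s placeholders; unused keys are simply ignored."""
    return template % values


expanded = {name: expand(template, values)
            for name, template in settings.items()}

for name, url in expanded.items():
    print(name, url)
```

With no placeholders in the template, the first setting expands to the
same string for every list, which is why the list name never appears
in the result.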

For other ill effects of the current setting, see the archives link on
the page at <http://lists.gnupg.org/mailman/listinfo/gnupg-users>

For an example of how this is supposed to work, see what
scrubber/pipermail does with your PGP signature at
<http://mail.python.org/pipermail/mailman-users/2006-December/054956.html>

-- 
Mark Sapiro <msapiro at value.net>       The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


