[ mailman-Bugs-759841 ] Multipart/mixed issues in archives
Bugs item #759841, was opened at 2003-06-24 07:22
Message generated for change (Comment added) made by msapiro
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=100103&aid=759841&group_id=103

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Pipermail
Group: 2.1 (stable)
Status: Open
Resolution: Fixed
Priority: 8
Private: No
Submitted By: Pug Bainter (phelim_gervase)
Assigned to: Nobody/Anonymous (nobody)
Summary: Multipart/mixed issues in archives
Initial Comment:
We are having problems with mailing lists that are not being properly stripped down to text content in the archives. When there is multipart/mixed, it doesn't pull the multipart/alternative sections into their appropriate text portions.

For example, from content such as the following:

==============================================================================
>From ...
[...]
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=------------InterScan_NT_MIME_Boundary
[...]

This is a multi-part message in MIME format.

--------------InterScan_NT_MIME_Boundary
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C336A1.2C7564BC"
Content-Transfer-Encoding: 7bit

------_=_NextPart_001_01C336A1.2C7564BC
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable

Kevin has a pending checkin that addresses the minss/maxss issue. =20
[...]

------_=_NextPart_001_01C336A1.2C7564BC
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML xmlns=3D"http://www.w3.org/TR/REC-html40" xmlns:v =3D=20
"urn:schemas-microsoft-com:vml" xmlns:o =3D=20
"urn:schemas-microsoft-com:office:office" xmlns:w =3D=20
"urn:schemas-microsoft-com:office:word" xmlns:x =3D=20
"urn:schemas-microsoft-com:office:excel" xmlns:st1 =3D=20
"urn:schemas-microsoft-com:office:smarttags"><HEAD><TITLE>Message</TITLE>=
[...]
==============================================================================

I only get the following:

==============================================================================
[64bit-compiler-analysis] RE: vpr analysis
Syyyy Kyyyyy syyyk at yyy.com
Thu Jun 19 14:27:16 CDT 2003

Previous message: [64bit-compiler-analysis] 06-19-03 MSFT 64-Bit C/C++ compiler +improvement discussion
Next message: [64bit-compiler-analysis] RE: vpr analysis
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
--------------------------------------------------------------------------------
Skipped content of type multipart/alternative
--------------------------------------------------------------------------------
Previous message: [64bit-compiler-analysis] 06-19-03 MSFT 64-Bit C/C++ compiler +improvement discussion
Next message: [64bit-compiler-analysis] RE: vpr analysis
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
--------------------------------------------------------------------------------
More information about the 64bit-compiler-analysis mailing list
==============================================================================

As you can see, the actual content of the multipart/alternative portion [text/plain and text/html] was completely stripped out instead of being shown as plain text.

----------------------------------------------------------------------
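The nesting described in the report can be reproduced with a small script. This is only a sketch using the `email` package from the current Python standard library, not Mailman's own code; the message text and the `OUTER`/`INNER` boundary names are made up for illustration. It compares looking at the direct children of the top-level message with visiting every nested part.

```python
from email import message_from_string

# Hypothetical, trimmed-down message with the same nesting as the report:
# multipart/mixed -> multipart/alternative -> text/plain + text/html.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="OUTER"

This is a multi-part message in MIME format.

--OUTER
Content-Type: multipart/alternative; boundary="INNER"

--INNER
Content-Type: text/plain; charset=us-ascii

Kevin has a pending checkin.

--INNER
Content-Type: text/html; charset=us-ascii

<html><body>Kevin has a pending checkin.</body></html>

--INNER--

--OUTER--
"""

msg = message_from_string(RAW)

# Only the direct children of the top-level message: this finds the
# multipart/alternative container, not the text parts nested inside it.
top_level_types = [part.get_content_type() for part in msg.get_payload()]

# walk() visits the message itself and every nested part, depth first.
all_types = [part.get_content_type() for part in msg.walk()]

print(top_level_types)
print(all_types)
```

A loop over the top-level payload alone never reaches the text/plain part, which matches the "Skipped content of type multipart/alternative" line in the archived page.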
Comment By: Mark Sapiro (msapiro)
Date: 2007-11-06 17:46

Message:
Logged In: YES
user_id=1123998
Originator: NO

It turns out this problem was observed and discussed at great length in December of 2006. See the thread that begins at <http://mail.python.org/pipermail/mailman-users/2006-December/054904.html>.

A few fixes were discussed in that thread but never implemented. I have now tested a fix along the lines of that discussion and committed it. It will be in Mailman 2.1.10 (beta release is imminent).

----------------------------------------------------------------------

Comment By: Mark Sapiro (msapiro)
Date: 2007-11-06 14:22

Message:
Logged In: YES
user_id=1123998
Originator: NO

You are correct. I was thinking that without the header, the following text would be a preamble, but this is not the case.

There does appear to be a problem here, and I will look into it further. The reconstructed message helps a lot. Thanks for that.

BTW, the problem is not with pipermail. The message is processed by Mailman/Handlers/Scrubber.py and flattened to plain text before pipermail ever sees it. I have verified that the underlying Python email library parses the MIME structure correctly and sees the body as a text/plain part.

I have some ideas, but I haven't looked closely enough to be sure. I'll post again when I know more.

----------------------------------------------------------------------

Comment By: Daniel Kahn Gillmor (rekt)
Date: 2007-11-06 13:05

Message:
Logged In: YES
user_id=842404
Originator: NO

Just did a bit of digging. It looks like section 5.2 of RFC 2045 suggests that missing content-types should be treated as:

Content-type: text/plain; charset=us-ascii

While i agree that it would be better for the sending MUA to include an explicit content-type for each mime part (i'm about to file a bug against the MUA), it seems problematic for pipermail to refuse to render such a part at all.
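The default described in RFC 2045 section 5.2 can be checked against the Python `email` package. The following is a sketch under stated assumptions: it uses the current standard library rather than the version Mailman 2.1 shipped with, and the message text and `OUTER` boundary are invented for the example. The first part has no headers at all, as in the message under discussion.

```python
from email import message_from_string

# Hypothetical message: the first part starts with a blank line, so it
# carries no Content-Type header (or any other header).
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="OUTER"

--OUTER

This part has no headers at all.

--OUTER
Content-Type: application/octet-stream

data
--OUTER--
"""

msg = message_from_string(RAW)
first, second = msg.get_payload()

# No header is present on the first part ...
header_value = first.get('Content-Type')
# ... but the library reports the default type for it.
default_type = first.get_content_type()

print(header_value, default_type, second.get_content_type())
```

Here `get_content_type()` falls back to text/plain for the header-less part, so a consumer that keys on that call treats the part as ordinary text.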
----------------------------------------------------------------------

Comment By: Daniel Kahn Gillmor (rekt)
Date: 2007-11-06 12:55

Message:
Logged In: YES
user_id=842404
Originator: NO

Thanks for the response, msapiro. marc.info's raw copy of it looks basically identical to the version of that message that arrived in my inbox, so i'd say it's a correct copy. The RFC822 headers for the raw message were:

Return-Path: <openssh-unix-dev-bounces@mindrot.org>
To: <openssh-unix-dev@mindrot.org>
Subject: Re: scp -t . - possible idea for additional parameter
From: Daniel Kahn Gillmor <dkg-openssh.com@fifthhorseman.net>
Date: Thu, 11 Oct 2007 12:34:23 -0400
Message-ID: <87y7e9d300.fsf@squeak.fifthhorseman.net>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1431543891=="

When i supply the concatenation of those headers, a blank line, and then the raw message to msglint, the IETF's message validator [0], it outputs:

-----------
OK: found part multipart/mixed line 10
OK: preamble 10:
OK: found part multipart/signed line 15
OK: preamble 15:
OK: found default part text/plain line 18
OK: found part application/pgp-signature line 67
OK: epilogue 86:
WARNING: MIME headers should only be 'Content-*'. No meaning will apply to header 'MIME-Version' at line 89
OK: found part text/plain line 93
-----------

So that validator doesn't have any problem with the message (it assumes the part starting at line 18, which is the section you're suggesting is invalid, is text/plain). Is the validator wrong in assuming that? I don't know the relevant specifications well enough to tell myself. Can you show me where it's a requirement that each MIME section have a content-type?

Thanks for looking into this.

[0] http://www.apps.ietf.org/msglint.html

----------------------------------------------------------------------

Comment By: Mark Sapiro (msapiro)
Date: 2007-11-06 12:04

Message:
Logged In: YES
user_id=1123998
Originator: NO

I can't tell for sure, but the message at <http://marc.info/?l=openssh-unix-dev&m=119212056224122&w=2> appears to be malformed. If I go to <http://marc.info/?l=openssh-unix-dev&m=119212056224122&q=raw> to view the alleged raw message, I see at the beginning:

--===============1431543891==
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha1; protocol="application/pgp-signature"

--=-=-=

On Thu 2007-10-11 11:00:41 -0400, Larry Becke wrote:
...

I expect to see something like:

--===============1431543891==
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha1; protocol="application/pgp-signature"

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--=-=-=
Content-Type: text/plain; charset=...
Content-Transfer-Encoding: ...

On Thu 2007-10-11 11:00:41 -0400, Larry Becke wrote:
...

I.e., I don't see a Content-Type: header for the message body. If it is in fact missing, that would cause Mailman's behavior in this case, but it is the message that is at fault, not Mailman.

So the question is whether or not the alleged raw message is in fact a true representation. If it is, then I think it is the sender's MUA that is at fault.

----------------------------------------------------------------------

Comment By: Daniel Kahn Gillmor (rekt)
Date: 2007-11-06 10:52

Message:
Logged In: YES
user_id=842404
Originator: NO

This bug (or something very similar to it) seems to still be a problem.
Consider the message here:

http://marc.info/?l=openssh-unix-dev&m=119212056224122&w=2

and in its pipermail archive:

http://lists.mindrot.org/pipermail/openssh-unix-dev/2007-October/025812.html

----------------------------------------------------------------------

Comment By: Joe Pruett (q7joey)
Date: 2005-03-17 21:00

Message:
Logged In: YES
user_id=559223

i just looked at the cvs closer and i see that the patch is on the 2.1 branch, but hasn't made it into the trunk yet.

----------------------------------------------------------------------

Comment By: Joe Pruett (q7joey)
Date: 2005-03-17 20:52

Message:
Logged In: YES
user_id=559223

i just started working on a 2.1.5 system and discovered that this bug was still there. from looking in cvs, it appears to be fixed there (although it seems to reference an unrelated bugid). updating this bug to reflect the cvs update would be nice.

----------------------------------------------------------------------

Comment By: Tokio Kikuchi (tkikuchi)
Date: 2003-12-27 17:17

Message:
Logged In: YES
user_id=67709

The patch by q7joey is merged into my Scrubber.py patch #866238. I hope Barry can integrate it in 2.1.4.

----------------------------------------------------------------------

Comment By: Joe Pruett (q7joey)
Date: 2003-09-27 09:48

Message:
Logged In: YES
user_id=559223

i have a few line patch that seems to make it do what is expected. i can't see how to attach via sourceforge yet, so i'll paste it here:

--- /usr/local/src/mailman-2.1.2/Mailman/Handlers/Scrubber.py Fri Feb 7 23:13:50 2003
+++ ./Scrubber.py Sat Sep 27 08:58:46 2003
@@ -286,11 +286,13 @@
     # BAW: Martin's original patch suggested we might want to try
     # generalizing to utf-8, and that's probably a good idea (eventually).
     text = []
-    for part in msg.get_payload():
+    for part in msg.walk():
+        if part.get_main_type() == 'multipart':
+            continue
         # All parts should be scrubbed to text/plain by now.
         partctype = part.get_content_type()
         if partctype <> 'text/plain':
-            text.append(_('Skipped content of type %(partctype)s'))
+            text.append(_('Skipped content of type %(partctype)s\n'))
             continue
         try:
             t = part.get_payload(decode=1)

----------------------------------------------------------------------

Comment By: Martin RJ. Cleaver (mrjc)
Date: 2003-09-27 00:23

Message:
Logged In: YES
user_id=50125

This fails for many of my users as they habitually attach a photo of themselves in their signatures. They are incredulous at the idea that mailman can't handle it.

Thanks

----------------------------------------------------------------------

Comment By: Joe Pruett (q7joey)
Date: 2003-09-26 18:26

Message:
Logged In: YES
user_id=559223

i agree that this should be a high priority issue. a simple message with just multipart/alternative will show up in the archive ok, but if there is any other kind of attachment, then the entire multipart section is skipped and you just get a link for the extra attachment for download/view ability. i haven't started to look at the code (and i'm not a python/mailman person), but i'll report anything i can find.

----------------------------------------------------------------------

Comment By: Martin RJ. Cleaver (mrjc)
Date: 2003-09-22 06:34

Message:
Logged In: YES
user_id=50125

Additionally I think it is appropriate to up the priority on this bug as it causes key functionality to fail.

----------------------------------------------------------------------

Comment By: Martin RJ. Cleaver (mrjc)
Date: 2003-09-22 06:26

Message:
Logged In: YES
user_id=50125

This is causing me real problems! Are there any known workarounds?

If I can't fix this I might have to use a different package as presently all my archives are useless!
----------------------------------------------------------------------

Comment By: Pug Bainter (phelim_gervase)
Date: 2003-06-24 10:01

Message:
Logged In: YES
user_id=484284

This appears to be within:

def process(mlist, msg, msgdata=None):

at around line 276, but I saw no way of making it recurse for multipart/[mixed|alternative] sub-MIME parts.

----------------------------------------------------------------------
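The recursion asked for here is what the patch pasted in this thread gets from `msg.walk()`. Below is a rough, standalone sketch of that idea, not the code in Mailman/Handlers/Scrubber.py: `flatten_to_text` is a hypothetical helper name, the sample message is invented, and the sketch uses `get_content_maintype()` from the current standard library where the patch calls `get_main_type()` from the `email` package of that era.

```python
from email import message_from_string


def flatten_to_text(msg):
    """Collect text/plain bodies from every nesting level of msg.

    Hypothetical helper that only mirrors the shape of the patched loop.
    """
    text = []
    for part in msg.walk():
        # Containers hold no text themselves; walk() visits their
        # children separately, so skip them as the patch does.
        if part.get_content_maintype() == 'multipart':
            continue
        ctype = part.get_content_type()
        if ctype != 'text/plain':
            text.append('Skipped content of type %s\n' % ctype)
            continue
        # decode=True undoes the transfer encoding and returns bytes.
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or 'us-ascii'
        text.append(payload.decode(charset, errors='replace'))
    return ''.join(text)


# Invented sample: multipart/mixed wrapping multipart/alternative.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="OUTER"

--OUTER
Content-Type: multipart/alternative; boundary="INNER"

--INNER
Content-Type: text/plain; charset=us-ascii

Hello from the plain part.

--INNER
Content-Type: text/html; charset=us-ascii

<p>Hello from the HTML part.</p>

--INNER--

--OUTER--
"""

result = flatten_to_text(message_from_string(RAW))
print(result)
```

In this sketch the nested text/plain body is kept and only the leaf text/html part produces a "Skipped content" line; the multipart containers themselves never do.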