[Email-SIG] Maybe a bug, maybe not

Barry Warsaw barry at python.org
Thu May 13 19:00:14 EDT 2004


I'm finally getting around to replying to this thread...

On Mon, 2004-05-03 at 15:45, Alexandre Ratti wrote:

> [Eric S. Johansson wrote]
> > found a very common form of spam that triggers an exception.

> I suspect that the crash occurs because these messages have multipart
> boundaries but have a text content type header. This causes the
> "_handle_text" method of the Generator class (in email/Generator.py) to
> be called. This method expects get_payload() to return a string, which
> doesn't happen since the message is multipart.
> 
> This seems to be similar to a known issue:
> 
> http://sourceforge.net/tracker/index.php?func=detail&aid=846938&group_id=5470&atid=105470

I think it's the same issue and I don't believe this is fixed in email
2.5.5 (Python 2.3.4).  I know later on in this thread Skip says it is
fixed, but unless my release23-maint branch is messed up, I don't think
it is.  I honestly don't know if we have the time to get this into a
2.3.4 final, since 1) I probably won't have time to do it, and 2) I'm not
certain what the right fix is.

Basically, the parser should not be parsing such messages in a way that
makes is_multipart() return true.  That change is not going to happen for
email 2.5, so perhaps your workaround is the best we can do.  Note that
email 3.0 (Python 2.4) definitely does not suffer from this problem.
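
To make the inconsistent state concrete, here is a minimal sketch written
against the current standard-library email package, not the email 2.5 code
discussed in this thread. The helper name type_and_payload_disagree is
hypothetical, and the "broken" message is built by hand because it is an
assumption that a newer parser will not produce one from similar input.

```python
from email import message_from_string
from email.message import Message


def type_and_payload_disagree(msg):
    """Hypothetical helper: True when the headers claim a non-multipart
    type but the payload is a list of subparts."""
    return msg.get_content_maintype() != "multipart" and msg.is_multipart()


# Hand-build the inconsistent state: a text/plain header with a subpart attached.
broken = Message()
broken["Content-Type"] = "text/plain"
broken.attach(Message())

# Parse similar raw text: a text/plain header followed by boundary-looking lines.
raw = "Content-Type: text/plain\n\n--XXX\n\nbody\n--XXX--\n"
parsed = message_from_string(raw)

print(type_and_payload_disagree(broken))   # hand-built message is inconsistent
print(type_and_payload_disagree(parsed))   # parsed message keeps a string payload
print(type(parsed.get_payload()).__name__)
```

A check like this could run before handing a message to the Generator, so
that an application can skip or repair such messages instead of hitting the
TypeError.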

> or use this diff (against the 2.5.4 version of the email package):
> 
> --- Generator.orig.py   Mon May  3 20:41:27 2004
> +++ Generator.py        Mon May  3 20:43:46 2004
> @@ -197,7 +197,12 @@
>           if cset is not None:
>               payload = cset.body_encode(payload)
>           if not _isstring(payload):
> -            raise TypeError, 'string payload expected: %s' % type(payload)
> +            # Changed to handle malformed messages with a text base
> +            # type and a multipart content.
> +            if type(payload) == type([]) and msg.is_multipart():
> +                return self._handle_multipart(msg)
> +            else:
> +                raise TypeError, 'string payload expected: %s' % type(payload)
>           if self._mangle_from_:
>               payload = fcre.sub('>From ', payload)
>           self._fp.write(payload)
> 
> This change seems to fix the problem. I fed a mailbox with several of
> these messages to spambayes and they were parsed OK and flagged as spam
> as expected.

Could you please attach this patch (not cut-n-paste it) to Jason's bug
report:

http://sourceforge.net/tracker/index.php?func=detail&aid=846938&group_id=5470&atid=105470

That's so much better than letting it get buried in this thread!

-Barry

P.S. I don't think you need to test for both type(payload) == type([])
and msg.is_multipart().  Just the latter will do, since that's all
is_multipart() does.  Besides, the right way to spell the former (in
Python 2.1-speak) would be isinstance(payload, ListType).
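
As a small illustration of why the two tests are redundant, the sketch
below compares them on a string-payload message and a list-payload message.
It assumes a current Python release, where the list type is spelled list
(types.ListType is a Python 2 name), and payload_is_list is a hypothetical
helper introduced only for the comparison.

```python
from email.message import Message


def payload_is_list(msg):
    # Hypothetical helper mirroring the explicit type test from the patch.
    return isinstance(msg.get_payload(), list)


text = Message()
text.set_payload("just a string body")

multi = Message()
multi.attach(Message())  # attaching a subpart makes the payload a list

# Both checks agree on each message, so one of them is enough.
for m in (text, multi):
    assert m.is_multipart() == payload_is_list(m)
```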




