[Python-3000] Question about email/generator.py
Guido van Rossum
guido at python.org
Tue Oct 23 21:36:16 CEST 2007
There's an issue in the email package that I can't resolve by myself.
I described it to Barry like this:
> > So in generator.py on line 291, I read:
> >
> > print(part.get_payload(decode=True), file=self)
> >
> > It turns out that part.get_payload(decode=True) returns a bytes
> > object, and printing a bytes object to a text file is not the right
> > thing to do -- in 3.0a1 it silently just prints those bytes, in 3.0a2
> > it will probably print the repr() of the bytes object. Right now, it
> > errors out because I'm removing the encode() method on PyString
> > objects, and print() converts PyBytes to PyString; then the
> > TextIOWrapper.write() method tries to encode its argument.
> >
> > If I change this to (decode=False), all tests in the email package
> > pass. But is this the right fix???
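For illustration, here is a minimal sketch of the symptom as it appears in released Python 3 versions, where print() writes the repr() of a bytes object to a text stream. The names payload and text_file are stand-ins chosen for this sketch, not code from generator.py.

```python
import io

# Stand-in for what part.get_payload(decode=True) returns: a bytes object.
payload = b"hello"

# Stand-in for the generator's text-mode output file.
text_file = io.StringIO()
print(payload, file=text_file)
printed = text_file.getvalue()      # the repr of the bytes, not the text

# Decoding the bytes explicitly before printing yields the text itself.
decoded_file = io.StringIO()
print(payload.decode("ascii"), file=decoded_file)
decoded = decoded_file.getvalue()
```

The first stream ends up holding the bytes literal, quotes and b prefix included, which is why printing the decoded payload directly to a text file is not the right thing to do.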
I should note that the decode=False change was already checked in by
the time Barry replied, even though it clearly was the wrong thing to
do. Barry replied:
> Maybe. ;) The problem is that this API is either being too smart for
> its own good, or not smart enough. The intent of decode=True is to
> return the original object encoded in the payload. So for example,
> if MIMEImage was used to encode some jpeg, then decode=True should
> return that jpeg.
>
> The problem is that what you really want is something that's content-
> type aware, such that if your main type is some non-text type like
> image/* or audio/* or even application/octet-stream, you will almost
> always want a bytes object back. But text can also be encoded via
> charset and/or transfer-encoding, and (at least in Py2.x), you'd use
> the same method to get the original, unencoded text back. In that
> case, you definitely want the string, since that's the most natural
> API (i.e. you fed it a string object when you created the MIMEText,
> so you want a string on the way back out).
>
> This is yet another corner case where the old API doesn't really fit
> the new bytes/string model correctly, and of course you can
> (rightly!) argue we were sloppy in Py2.x but were able to (mostly)
> get away with it.
>
> In this /specific/ situation, generator.py:291 can only be called
> when the main type is text, so I think it is clearly expecting a
> string, even though .get_payload() will return a bytes there.
>
> Short of redesigning the API, I can think of two options. First, we
> can change .get_payload() to specifically return a string when the main
> type is text and decode=True. This is ugly because the return type
> will depend on the content type of the message. OTOH, get_payload()
> is already fairly ugly here because its return type differs based on
> its argument, although I'd like to split this into a
> separate .get_decoded_payload() method.
>
> The other option is to let .get_payload() return bytes in all cases,
> but in generator.py:291, explicitly convert it to a string, probably
> using raw-unicode-escape. Because we know the main type is text
> here, we know that the payload must contain a string. get_payload()
> will return the bytes of the decoded unicode string, so raw-unicode-
> escape should do the right thing. That's ugly too for obvious reasons.
>
> The one thing that doesn't seem right is for decode=False to be used
> because should the payload be an encoded string, it won't get
> correctly decoded. This is part of the DecodedGenerator, which
> honestly is probably not much used outside the test cases, but the
> intent of that generator is clearly to print the decoded text parts
> with the non-text parts stripped and replaced by a placeholder. So I
> think it definitely wants decoded text payloads, otherwise there's
> not much point in the class.
>
> I hope that explains the situation. I'm open to any other idea -- it
> doesn't even have to be better. ;) I see that you made the
> decode=False change in svn, but that's the one solution that doesn't
> seem right.
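To make the second option concrete, here is a rough sketch against the email package as it exists in released Python 3. The helper name text_payload is hypothetical, and it decodes with the part's declared charset rather than raw-unicode-escape, which is a substitution made for this sketch and not what Barry proposed.

```python
from email.mime.text import MIMEText


def text_payload(part):
    """Hypothetical helper: return a text part's payload as a str.

    get_payload(decode=True) undoes the transfer encoding and returns
    bytes; the caller then applies the charset. Assumption: fall back
    to ASCII when the part declares no charset.
    """
    if part.get_content_maintype() != "text":
        raise ValueError("only text/* parts carry a string payload")
    raw_bytes = part.get_payload(decode=True)
    charset = part.get_content_charset() or "ascii"
    return raw_bytes.decode(charset)


original = "h\u00e9llo"
part = MIMEText(original, "plain", "utf-8")

undecoded = part.get_payload(decode=False)    # still transfer-encoded text
decoded_bytes = part.get_payload(decode=True)  # bytes, transfer encoding undone
recovered = text_payload(part)                 # str, charset applied
```

With decode=False the payload is still in its transfer-encoded form, so it does not match the text that was fed in, which is the objection to that fix. Keeping get_payload() returning bytes and converting at the call site leaves the return type independent of the content type.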
At this point I (Guido) am really hoping someone will want to "own"
this issue and redesign the API properly...
--
--Guido van Rossum (home page: http://www.python.org/~guido/)