[Email-SIG] email breakage in 3.2 alpha

lutz at rmi.net lutz at rmi.net
Thu Nov 4 13:56:34 CET 2010


Actually, never mind.  I have to submit the QC book review to O'Reilly
today, so I must assume that the email changes in 3.2 are immutable
at this point.

To accommodate, I made a last-minute patch to the book and its examples
package, to special-case the mail sender's workaround code for the fact
that 3.2 now returns str instead of bytes:

..
text = msgobj.get_payload()       # bytes fails in email pkg on text gen
if isinstance(text, bytes):       # payload is bytes in 3.1, str in 3.2 alpha
    text = text.decode('ascii')   # decode to unicode str so text gen works
..

With this, sends work under 3.2 too.  The workaround must still split up 
the base64 data into lines, though, or else emails are sent as one massive
line, which does not play well with many text tools.  Other 3.2 "fixes" 
seem to be compatible with my workarounds so far (at least with the 
limited testing I've been able to do).
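
For illustration, the patched encoder as a whole might look roughly like
the following.  This is only a minimal sketch, not the book's actual code:
the helper name wrap_payload is hypothetical, and dropping any line breaks
already present before re-wrapping is an assumption made here so that the
result is the same whether or not the email package has wrapped the data.

```python
from email.encoders import encode_base64
from email.mime.image import MIMEImage

LINELEN = 76  # per MIME standards, as in the original workaround


def wrap_payload(payload, linelen=LINELEN):
    """Return a base64 payload as str, wrapped at linelen characters.

    Hypothetical helper: accepts either bytes or str.
    """
    if isinstance(payload, bytes):           # bytes in 3.1, str in 3.2 alpha
        payload = payload.decode('ascii')    # base64 text is pure ASCII
    text = ''.join(payload.split())          # assumption: drop existing breaks
    lines = [text[i:i + linelen] for i in range(0, len(text), linelen)]
    return '\n'.join(lines)


def fix_encode_base64(msgobj):
    """Encode the payload as base64, then store it as wrapped str."""
    encode_base64(msgobj)                    # what email does normally
    msgobj.set_payload(wrap_payload(msgobj.get_payload()))


# usage: pass the function as the part's encoder, as the sender code does
data = bytes(range(256)) * 4                 # stand-in for real image bytes
msg = MIMEImage(data, _subtype='jpeg', _encoder=fix_encode_base64)
```

Because the isinstance test covers both payload types, the same function
should run unchanged whether get_payload() hands back bytes or str.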

In the end, this wasn't a big change on my end, though patching code 
in books just before publication is very error-prone.  The bigger 
issue to me is that Python core developers seem a bit too inclined
to delegate the consequences of their actions to their users.  I don't
mean this personally; it's a group-wide attitude that has escalated in
recent years.  In this case, it was left to me to accommodate recent
changes, or they would have broken a new 3.X book's major example.

More to the point: if a fix made in the name of aesthetics breaks
working code, is it really a fix?  I don't think so, and perhaps 
we'll have to agree to disagree, but I hope this case serves as 
a data point for future email changes.

--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)


> -----Original Message-----
> From: lutz at rmi.net
> To: "R. David Murray" <rdmurray at bitdance.com>
> Subject: email breakage in 3.2 alpha
> Date: Mon, 01 Nov 2010 17:50:20 -0000
> 
> Hi David,
> 
> (Sending this offlist; feel free to forward as appropriate)
> 
> As promised, I've finally gotten around to testing the big email
> client example in the upcoming Programming Python under 3.2 alpha3.
> Although much still works, as expected the example is now broken
> under 3.2.  So far at least, the only specific breakage I've found
> is on sending emails with non-text attachments.  Obviously, this 
> is a major issue by itself.
> 
> Below is the change, exception, and relevant code for the send
> breakage I've found so far; the full source file is also attached. 
> (See my prior mail for the book examples link; it's on oreilly.com.)
> 
> In short, for 3.1, I had to add code that manually decoded the bytes
> that email in 3.1 left for base64-encoded binary parts.  Because email
> in 3.2 now returns these as str instead of bytes, the workaround 
> decoding step is no longer needed, and in fact now fails in 3.2.
> 
> This change is an improvement in 3.2, of course; but because it 
> breaks code that ran correctly under 3.1, it's also a regression.
> It would be straightforward for me to special-case the code for 
> 3.1 only; unfortunately I'm no longer able to change the book or
> retarget it for 3.2 (it's in production now, and is already late). 
> Given that this book might be seen by something like 100K readers 
> evaluating 3.X in general, 3.1 compatibility seems a big deal to 
> me.  Leaving readers with the impression that 3.X is not even 
> backwards compatible within its own line is not good.
> 
> So, I see two options here:
> 
> 1) I can patch the source code of the book's examples on the web and
> post an erratum prior to publication.  Not difficult, but again, my 
> larger concern is the PR effect of having to note that the book was
> broken by 3.X changes in between the time it was written and the time
> it was printed.  I'd much rather have any patch/errata work delayed 
> until 3.3 if possible per the next option (as is, the book already 
> has to mention more 3.X issues than I'd like).  Even in 3.3+, I 
> think incompatibilities should be an option that must be enabled 
> if they cannot be avoided altogether (as discussed before).
> 
> 2) In cases like this where you've changed email in a way that's
> incompatible with 3.1, unless the 3.1 code never worked at all, 
> you probably should make the original 3.1 behavior the default, 
> and allow 3.2 changes to be enabled as options (e.g., via default 
> method args, top-level settings in the package, command-line args,
> and so on).  In this specific case, even though I agree that 
> returning a base64-encoded part as bytes might seem "wrong", 
> it did work; changing it to always return str now is a regression
> from 3.1 behavior, and breaks code.  Ideally, this would continue
> to return bytes, but could return str in 3.2 if an incompatibility
> switch were set in the package.
> 
> Naturally, other more exotic solutions are plausible too (returning
> a str subclass instance with a no-op .decode() method added, for 
> example); but they're probably too tricky to bother with for a
> temporary compatibility fix.
> 
> I completely understand the desire to fix email issues asap, but 
> these issues were not showstoppers, and could be managed in 3.1.  
> To me, unless code absolutely cannot work with what's present in 
> a release, "fixing" it in a later release also implies potentially
> "breaking" it for current release users, and is not a clear-cut 
> Good Thing.  Let me know what you think; given the broad impact 
> the book may have on 3.X's future, please weigh this carefully.
> 
> Thanks,
> --Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)
> 
> ================================================================================
> [changed behavior that breaks non-text-attachment sends]
> 
> C:\...>c:\python31\python
> >>> from email.mime.image import MIMEImage
> >>> bytes = open('monkeys.jpg', 'rb').read()
> >>> m = MIMEImage(bytes)
> >>> m.get_payload()[:40]
> b'/9j/4AAQSkZJRgABAQEAeAB4AAD/2wBDAAIBAQIB'
> 
> C:\...>c:\python32\python
> >>> from email.mime.image import MIMEImage
> >>> bytes = open('monkeys.jpg', 'rb').read()
> >>> m = MIMEImage(bytes)
> >>> m.get_payload()[:40]
> '/9j/4AAQSkZJRgABAQEAeAB4AAD/2wBDAAIBAQIB'
> 
> [the following also changed and may impact email clients' content 
> type handling, but looks irrelevant to the email package regression
> (mimetypes now scans the Windows registry, which looks a bit iffy)]
> 
> C:\...>c:\python31\python
> >>> from mimetypes import guess_type
> >>> guess_type('monkeys.jpg')
> ('image/jpeg', None)
> 
> C:\...>c:\python32\python
> >>> from mimetypes import guess_type
> >>> guess_type('monkeys.jpg')
> ('image/pjpeg', None)
> 
> ================================================================================
> [exception on failure of send with image attachment]
> 
> Adding image/pjpeg
> <class 'AttributeError'>
> 'str' object has no attribute 'decode'
>   File "C:\Examples\PP4E\Gui\Tools\threadtools.py", line 81, in threaded
>     action(*args)             # assume raises exception if fails
>   File "C:\Examples\PP4E\Internet\Email\mailtools\mailSender.py", line 111, in sendMessage
>     bodytextEncoding, attachesEncodings)
>   File "C:\Examples\PP4E\Internet\Email\mailtools\mailSender.py", line 201, in addAttachments
>     data.read(), _subtype=subtype, _encoder=fix_encode_base64)
>   File "c:\Python32\lib\email\mime\image.py", line 46, in __init__
>     _encoder(self)
>   File "C:\Examples\PP4E\Internet\Email\mailtools\mailSender.py", line 39, in fix_encode_base64
>     text  = bytes.decode('ascii')        # decode to unicode str so text gen works
> 
> ================================================================================
> [relevant code in C:\Examples\PP4E\Internet\Email\mailtools\mailSender.py]
> 
> def fix_encode_base64(msgobj):
>      """
>      4E: workaround for a genuine bug in Python 3.1 email package that prevents
>      mail text generation for binary parts encoded with base64 or other email 
>      encodings;  the normal email.encoder run by the constructor leaves payload
>      as bytes, even though it's encoded to base64 text form;  this breaks email 
>      text generation which assumes this is text and requires it to be str;  net 
>      effect is that only simple text part emails can be composed in Py 3.1 email
>      package as is - any MIME-encoded binary part causes mail text generation to 
>      fail;  this bug seems likely to go away in a future Python and email package,
>      in which case this should become a no-op;  see Chapter 13 for more details;
>      """
>      linelen = 76  # per MIME standards
>      from email.encoders import encode_base64
> 
>      encode_base64(msgobj)                # what email does normally: leaves bytes
>      bytes = msgobj.get_payload()         # bytes fails in email pkg on text gen
> 39=> text  = bytes.decode('ascii')        # decode to unicode str so text gen works
>      lines = []                           # split into lines, else 1 massive line
>      while text:
>          line, text = text[:linelen], text[linelen:]
>          lines.append(line)
>      msgobj.set_payload('\n'.join(lines))
> 
> 
> class MailSender(MailTool):
>     def sendMessage(self, From, To, Subj, extrahdrs, bodytext, attaches,
>                                       saveMailSeparator=(('=' * 80) + 'PY\n'),
>                                       bodytextEncoding='us-ascii',
>                                       attachesEncodings=None):
>             ...
>             msg = MIMEMultipart()
>             self.addAttachments(msg, bodytext, attaches,
>                                      bodytextEncoding, attachesEncodings)
> 
>     def addAttachments(self, mainmsg, bodytext, attaches,
>                                       bodytextEncoding, attachesEncodings):
>         """
>         format a multipart message with attachments;
>         use Unicode encodings for text parts if passed;
>         """
>         # add main text/plain part
>         msg = MIMEText(bodytext, _charset=bodytextEncoding)
>         mainmsg.attach(msg)
> 
>         # add attachment parts
>         encodings = attachesEncodings or (['us-ascii'] * len(attaches))
>         for (filename, fileencode) in zip(attaches, encodings):
>             ...
>             # guess content type from file extension, ignore encoding
>             contype, encoding = mimetypes.guess_type(filename)
>             if contype is None or encoding is not None:  # no guess, compressed?
>                 contype = 'application/octet-stream'     # use generic default
>             self.trace('Adding ' + contype)
> 
>             # build sub-Message of appropriate kind
>             maintype, subtype = contype.split('/', 1)
>             if maintype == 'text':                       # 4E: text needs encoding
>                 if fix_text_required(fileencode):        # requires str or bytes
>                     data = open(filename, 'r', encoding=fileencode)
>                 else:
>                     data = open(filename, 'rb')
>                 msg = MIMEText(data.read(), _subtype=subtype, _charset=fileencode)
>                 data.close()
> 
>             elif maintype == 'image':
>                 data = open(filename, 'rb')              # 4E: use fix for binaries
>                 msg  = MIMEImage(
> 201=>                  data.read(), _subtype=subtype, _encoder=fix_encode_base64)
>                 data.close()
>             ...
> 
> =======================================================================
> 




