[New-bugs-announce] [issue32330] Email parser creates a message object that can't be flattened

Mark Sapiro report at bugs.python.org
Thu Dec 14 19:25:27 EST 2017

New submission from Mark Sapiro <mark at msapiro.net>:

This is related to https://bugs.python.org/issue27321 but a different exception is thrown for a different reason. This is caused by a defective spam message. I don't actually have the offending message from the wild, but the attached bad_email_2.eml illustrates the problem.

The defect is that the message declares the content charset as us-ascii, but the body contains non-ASCII characters. When the message is parsed into an email.message.Message object and the object's as_string() method is called, UnicodeEncodeError is raised as follows:

>>> import email
>>> with open('bad_email_2.eml', 'rb') as fp:
...     msg = email.message_from_binary_file(fp)
>>> msg.as_string()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/email/message.py", line 159, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/usr/lib/python3.5/email/generator.py", line 115, in flatten
  File "/usr/lib/python3.5/email/generator.py", line 181, in _write
  File "/usr/lib/python3.5/email/generator.py", line 214, in _dispatch
  File "/usr/lib/python3.5/email/generator.py", line 243, in _handle_text
    msg.set_payload(payload, charset)
  File "/usr/lib/python3.5/email/message.py", line 316, in set_payload
    payload = payload.encode(charset.output_charset)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 31-33: ordinal not in range(128)
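
For readers without the attachment, the following is a minimal sketch of the same defect. It assumes a made-up message (the RAW bytes, addresses, and body text are hypothetical and are not the contents of bad_email_2.eml) that declares charset us-ascii but carries UTF-8 bytes in the body. The report is against Python 3.5 and 3.6; other versions may behave differently in as_string(), so the sketch records either outcome instead of assuming one. It also shows that the bytes-oriented calls as_bytes() and get_payload(decode=True) can still return the original body bytes.

```python
import email

# Hypothetical stand-in for bad_email_2.eml: the header declares us-ascii,
# but the body contains the UTF-8 encoding of the euro sign (non-ASCII bytes).
RAW = (
    b"From: sender@example.com\n"
    b"To: recipient@example.com\n"
    b"Subject: charset mismatch\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="us-ascii"\n'
    b"\n"
    b"Price: \xe2\x82\xac100\n"
)

msg = email.message_from_bytes(RAW)

# On the versions named in the report, flattening to str raises
# UnicodeEncodeError. Record whichever outcome this interpreter produces.
try:
    text = msg.as_string()
    flatten_error = None
except UnicodeEncodeError as exc:
    text = None
    flatten_error = exc

# The bytes-oriented paths do not re-encode the body as ASCII, so the
# original non-ASCII bytes are written back out unchanged.
flat_bytes = msg.as_bytes()
body_bytes = msg.get_payload(decode=True)

print("as_string() failed:", flatten_error is not None)
print("body bytes:", body_bytes)
```

Code that only needs to store or forward such a message could work with as_bytes() output rather than as_string(); whether that is acceptable depends on the application.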

components: email
files: bad_email_2.eml
messages: 308353
nosy: barry, msapiro, r.david.murray
priority: normal
severity: normal
status: open
title: Email parser creates a message object that can't be flattened
type: behavior
versions: Python 3.5, Python 3.6
Added file: https://bugs.python.org/file47333/bad_email_2.eml

Python tracker <report at bugs.python.org>
