[Python-checkins] bpo-30835: email: Fix AttributeError when parsing invalid CTE (GH-13598)

Barry Warsaw webhook-mailer at python.org
Tue Jun 4 14:01:05 EDT 2019

commit: aa79707262f893428665ef45b5e879129abca4aa
branch: master
author: Abhilash Raj <maxking at users.noreply.github.com>
committer: Barry Warsaw <barry at python.org>
date: 2019-06-04T11:00:47-07:00

bpo-30835: email: Fix AttributeError when parsing invalid CTE (GH-13598)

* bpo-30835: email: Fix AttributeError when parsing invalid Content-Transfer-Encoding

Parsing an email with a multipart Content-Type and a
Content-Transfer-Encoding header that contains an invalid
(non-ASCII-decodable) byte fails with an AttributeError.
email.feedparser.FeedParser._parsegen() gets the header and attempts to
convert it to lowercase before comparing it with the accepted encodings,
but because the header contains an invalid byte, it is returned as a
Header object rather than a str, and Header has no lower() method.

Cast the Content-Transfer-Encoding header to a str to avoid this.
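
The failure described above can be reproduced with a short script. This is
a minimal sketch rather than part of the patch: the message bytes are a
hypothetical input modelled on the regression test, and the default
(compat32) policy is assumed. On an interpreter without the fix, the
message_from_bytes() call is expected to raise AttributeError; with the
fix, the invalid encoding is recorded as a defect.

```python
import email
from email import errors
from email.header import Header

# Hypothetical input: multipart Content-Type plus a Content-Transfer-Encoding
# whose value is a single non-ASCII byte (0xC8).
source = (
    b"From: aperson@example.com\n"
    b'Content-Type: multipart/mixed; boundary="1"\n'
    b"Content-Transfer-Encoding: \xc8\n"
    b"\n"
)

# On an interpreter without this fix, this call raises AttributeError.
msg = email.message_from_bytes(source)

# With the default compat32 policy the undecodable value comes back as a
# Header object, which has no lower() method; str() yields a plain string.
cte = msg.get("content-transfer-encoding", "8bit")
print(isinstance(cte, Header), hasattr(cte, "lower"))
print(str(cte).lower() in ("7bit", "8bit", "binary"))

# The parser records the invalid encoding as a defect instead of crashing.
print([type(d).__name__ for d in msg.defects])
```

Wrapping the value in str() works for both cases: a valid header is already
a str, and a Header object is converted to one before lower() is called.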

Found using the AFL fuzzer.

Reported-by: Daniel Axtens <dja at axtens.net>
Signed-off-by: Andrew Donnellan <andrew at donnellan.id.au>

* Add test and NEWS entry for the bugfix.

A Misc/NEWS.d/next/Library/2019-05-27-15-29-46.bpo-30835.3FoaWH.rst
M Lib/email/feedparser.py
M Lib/test/test_email/test_email.py

diff --git a/Lib/email/feedparser.py b/Lib/email/feedparser.py
index 7c07ca86457a..97d3f5144d60 100644
--- a/Lib/email/feedparser.py
+++ b/Lib/email/feedparser.py
@@ -320,7 +320,7 @@ def _parsegen(self):
             # Make sure a valid content type was specified per RFC 2045:6.4.
-            if (self._cur.get('content-transfer-encoding', '8bit').lower()
+            if (str(self._cur.get('content-transfer-encoding', '8bit')).lower()
                     not in ('7bit', '8bit', 'binary')):
                 defect = errors.InvalidMultipartContentTransferEncodingDefect()
                 self.policy.handle_defect(self._cur, defect)
diff --git a/Lib/test/test_email/test_email.py b/Lib/test/test_email/test_email.py
index dfb3be84384a..c29cc56203b1 100644
--- a/Lib/test/test_email/test_email.py
+++ b/Lib/test/test_email/test_email.py
@@ -1466,6 +1466,15 @@ def test_mangled_from_with_bad_bytes(self):
         self.assertEqual(b.getvalue(), source + b'>From R\xc3\xb6lli\n')
+    def test_mutltipart_with_bad_bytes_in_cte(self):
+        # bpo30835
+        source = textwrap.dedent("""\
+            From: aperson@example.com
+            Content-Type: multipart/mixed; boundary="1"
+            Content-Transfer-Encoding: \xc8
+        """).encode('utf-8')
+        msg = email.message_from_bytes(source)
 # Test the basic MIMEAudio class
 class TestMIMEAudio(unittest.TestCase):
diff --git a/Misc/NEWS.d/next/Library/2019-05-27-15-29-46.bpo-30835.3FoaWH.rst b/Misc/NEWS.d/next/Library/2019-05-27-15-29-46.bpo-30835.3FoaWH.rst
new file mode 100644
index 000000000000..019321d6f1d7
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2019-05-27-15-29-46.bpo-30835.3FoaWH.rst
@@ -0,0 +1,3 @@
+Fixed a bug in email parsing where a message with invalid bytes in
+content-transfer-encoding of a multipart message can cause an AttributeError.
+Patch by Andrew Donnellan.
