[Python-checkins] bpo-38332: Catch KeyError from unknown cte in encoded-word. (GH-16503)

Miss Islington (bot) webhook-mailer at python.org
Sat Oct 12 13:03:28 EDT 2019


https://github.com/python/cpython/commit/e540bb546163f108c7c304f2e6865efaa78cd4c2
commit: e540bb546163f108c7c304f2e6865efaa78cd4c2
branch: 3.8
author: Miss Islington (bot) <31488909+miss-islington at users.noreply.github.com>
committer: GitHub <noreply at github.com>
date: 2019-10-12T10:03:24-07:00
summary:

bpo-38332: Catch KeyError from unknown cte in encoded-word. (GH-16503)


A KeyError raised for an unknown cte should cause parsing of the encoded word to fail; it should be caught and re-raised as an _InvalidEwError instead.
(cherry picked from commit 65dcc8a8dc41d3453fd6b987073a5f1b30c5c0fd)

Co-authored-by: Andrei Troie <andreitroie90 at gmail.com>

files:
A Misc/NEWS.d/next/Library/2019-10-05-02-07-52.bpo-38332.hwrPN7.rst
M Lib/email/_header_value_parser.py
M Lib/test/test_email/test__encoded_words.py
M Lib/test/test_email/test__header_value_parser.py

diff --git a/Lib/email/_header_value_parser.py b/Lib/email/_header_value_parser.py
index 16c19907d68d5..1668b4a14e9b9 100644
--- a/Lib/email/_header_value_parser.py
+++ b/Lib/email/_header_value_parser.py
@@ -1057,7 +1057,7 @@ def get_encoded_word(value):
     value = ''.join(remainder)
     try:
         text, charset, lang, defects = _ew.decode('=?' + tok + '?=')
-    except ValueError:
+    except (ValueError, KeyError):
         raise _InvalidEwError(
             "encoded word format invalid: '{}'".format(ew.cte))
     ew.charset = charset
diff --git a/Lib/test/test_email/test__encoded_words.py b/Lib/test/test_email/test__encoded_words.py
index 5a59aebba89be..0b8b1de3359aa 100644
--- a/Lib/test/test_email/test__encoded_words.py
+++ b/Lib/test/test_email/test__encoded_words.py
@@ -58,6 +58,8 @@ def test_wrong_format_input_raises(self):
             _ew.decode('=?')
         with self.assertRaises(ValueError):
             _ew.decode('')
+        with self.assertRaises(KeyError):
+            _ew.decode('=?utf-8?X?somevalue?=')
 
     def _test(self, source, result, charset='us-ascii', lang='', defects=[]):
         res, char, l, d = _ew.decode(source)
diff --git a/Lib/test/test_email/test__header_value_parser.py b/Lib/test/test_email/test__header_value_parser.py
index dd33b065c804b..e442c44a2a74d 100644
--- a/Lib/test/test_email/test__header_value_parser.py
+++ b/Lib/test/test_email/test__header_value_parser.py
@@ -89,6 +89,10 @@ def test_get_encoded_word_missing_middle_raises(self):
         with self.assertRaises(errors.HeaderParseError):
             parser.get_encoded_word('=?abc?=')
 
+    def test_get_encoded_word_invalid_cte(self):
+        with self.assertRaises(errors.HeaderParseError):
+            parser.get_encoded_word('=?utf-8?X?somevalue?=')
+
     def test_get_encoded_word_valid_ew(self):
         self._test_get_x(parser.get_encoded_word,
                          '=?us-ascii?q?this_is_a_test?=  bird',
@@ -399,6 +403,14 @@ def test_get_unstructured_invalid_ew(self):
             [],
             '')
 
+    def test_get_unstructured_invalid_ew_cte(self):
+        self._test_get_x(self._get_unst,
+            '=?utf-8?X?=somevalue?=',
+            '=?utf-8?X?=somevalue?=',
+            '=?utf-8?X?=somevalue?=',
+            [],
+            '')
+
     # get_qp_ctext
 
     def test_get_qp_ctext_only(self):
diff --git a/Misc/NEWS.d/next/Library/2019-10-05-02-07-52.bpo-38332.hwrPN7.rst b/Misc/NEWS.d/next/Library/2019-10-05-02-07-52.bpo-38332.hwrPN7.rst
new file mode 100644
index 0000000000000..600c702cf3bbd
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2019-10-05-02-07-52.bpo-38332.hwrPN7.rst
@@ -0,0 +1,3 @@
+Prevent :exc:`KeyError` raised by :func:`_encoded_words.decode` when given
+an encoded-word with an invalid content transfer encoding from propagating
+all the way to :func:`email.message.get`.
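
The pattern applied by this change can be sketched in isolation. The snippet below is a simplified illustration, not the code in Lib/email: InvalidEncodedWord, CTE_DECODERS, decode and get_encoded_word are hypothetical stand-ins that only mirror the shape of the private helpers touched by the diff. It assumes an encoded word of the form =?charset?cte?payload?= and shows how an unknown cte letter surfaces as a KeyError from a table lookup, which the caller then converts into a single parse error alongside ValueError.

```python
import base64
import quopri


class InvalidEncodedWord(Exception):
    """Hypothetical stand-in for the parser's private _InvalidEwError."""


# Hypothetical decoder table keyed by the lowercase cte letter.
CTE_DECODERS = {
    # header=True makes quopri treat '_' as an encoded space.
    "q": lambda raw: quopri.decodestring(raw, header=True),
    "b": base64.b64decode,
}


def decode(ew):
    """Decode '=?charset?cte?payload?=' (simplified, no language tag)."""
    # Unpacking raises ValueError unless there are exactly five parts.
    _, charset, cte, payload, _ = ew.split("?")
    # An unknown cte letter raises KeyError on this lookup.
    decoder = CTE_DECODERS[cte.lower()]
    return decoder(payload.encode("ascii")).decode(charset)


def get_encoded_word(ew):
    """Return the decoded text, or raise InvalidEncodedWord on bad input."""
    try:
        return decode(ew)
    except (ValueError, KeyError):
        # Both failure modes mean "this is not a valid encoded word".
        raise InvalidEncodedWord(
            "encoded word format invalid: {!r}".format(ew))
```

With this sketch, get_encoded_word('=?us-ascii?q?this_is_a_test?=') returns the decoded text, while both '=?abc?=' (wrong number of parts) and '=?utf-8?X?somevalue?=' (unknown cte) raise InvalidEncodedWord rather than leaking a ValueError or KeyError to the caller.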


