[Python-checkins] bpo-34002: Minor efficiency and clarity improvements in email package. (GH-7999)
Abhilash Raj
webhook-mailer at python.org
Thu Sep 19 23:26:00 EDT 2019
https://github.com/python/cpython/commit/2702638eabe5f7b25f36d295f0ad78cb8d4eda05
commit: 2702638eabe5f7b25f36d295f0ad78cb8d4eda05
branch: master
author: Michael Selik <mike at selik.org>
committer: Abhilash Raj <maxking at users.noreply.github.com>
date: 2019-09-19T20:25:55-07:00
summary:
bpo-34002: Minor efficiency and clarity improvements in email package. (GH-7999)
* Check intersection of two sets explicitly
Comparing ``len(a) > len(a - b)`` is essentially looking for an
intersection between the two sets. If set ``b`` does not intersect ``a``
then ``len(a - b)`` will be equal to ``len(a)``. This logic is more
clearly expressed as ``a & b``.
* Change while/pop to a for-loop
Copying the list, then repeatedly popping the first element, was
unnecessarily slow. I also cleaned up a couple of other inefficiencies.
First, there's no need to unpack a tuple, then re-pack and append it; the
list can be created with the first element instead of empty. Second, the
``endswith`` method returns a bool, so there's no need for an
if-statement to set ``encoded`` to True or False.
* Use set.intersection to check for intersections
The ``a.intersection(b)`` method states its purpose more clearly than
``not a.isdisjoint(b)`` and avoids the unnecessary set construction that
``a & set(b)`` performs.
* Use not isdisjoint instead of intersection
While it reads slightly worse, the isdisjoint method will stop when it
finds a counterexample and returns a bool, rather than looping over the
entire iterable and constructing a new set.
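The three approaches discussed in the bullets above can be compared side by side. This is only a sketch: ``SPECIALS`` here is a hypothetical stand-in set, not the actual value of ``parser.SPECIALS``, and the function names are made up for illustration.

```python
# Hypothetical stand-in for a set of special characters; this is NOT the
# real value of parser.SPECIALS in the email package.
SPECIALS = set('()<>@,:;.\\"[]')


def has_special_len(text):
    """Original shape: compare set sizes before and after subtraction."""
    chars = set(text)
    return len(chars) > len(chars - SPECIALS)


def has_special_intersection(text):
    """Intermediate shape: build the intersection, then test its truthiness."""
    return bool(SPECIALS.intersection(text))


def has_special_isdisjoint(text):
    """Final shape: no new set is built and the result is already a bool."""
    return not SPECIALS.isdisjoint(text)


# All three shapes should agree on every input, including the empty string.
for sample in ["Jane Doe", "Doe, Jane", ""]:
    results = {
        has_special_len(sample),
        has_special_intersection(sample),
        has_special_isdisjoint(sample),
    }
    assert len(results) == 1
```

Both ``set.intersection`` and ``set.isdisjoint`` accept any iterable, so the string can be passed directly without wrapping it in ``set()`` first.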
files:
A Misc/NEWS.d/next/Library/2019-09-19-19-58-33.bpo-34002.KBnaVX.rst
M Lib/email/headerregistry.py
M Lib/email/utils.py
diff --git a/Lib/email/headerregistry.py b/Lib/email/headerregistry.py
index dcc960b2cdc1..cc1d19129181 100644
--- a/Lib/email/headerregistry.py
+++ b/Lib/email/headerregistry.py
@@ -69,11 +69,9 @@ def addr_spec(self):
"""The addr_spec (username at domain) portion of the address, quoted
according to RFC 5322 rules, but with no Content Transfer Encoding.
"""
-        nameset = set(self.username)
-        if len(nameset) > len(nameset-parser.DOT_ATOM_ENDS):
-            lp = parser.quote_string(self.username)
-        else:
-            lp = self.username
+        lp = self.username
+        if not parser.DOT_ATOM_ENDS.isdisjoint(lp):
+            lp = parser.quote_string(lp)
         if self.domain:
             return lp + '@' + self.domain
         if not lp:
@@ -86,11 +84,9 @@ def __repr__(self):
self.display_name, self.username, self.domain)
     def __str__(self):
-        nameset = set(self.display_name)
-        if len(nameset) > len(nameset-parser.SPECIALS):
-            disp = parser.quote_string(self.display_name)
-        else:
-            disp = self.display_name
+        disp = self.display_name
+        if not parser.SPECIALS.isdisjoint(disp):
+            disp = parser.quote_string(disp)
         if disp:
             addr_spec = '' if self.addr_spec=='<>' else self.addr_spec
             return "{} <{}>".format(disp, addr_spec)
@@ -141,10 +137,8 @@ def __str__(self):
         if self.display_name is None and len(self.addresses)==1:
             return str(self.addresses[0])
         disp = self.display_name
-        if disp is not None:
-            nameset = set(disp)
-            if len(nameset) > len(nameset-parser.SPECIALS):
-                disp = parser.quote_string(disp)
+        if disp is not None and not parser.SPECIALS.isdisjoint(disp):
+            disp = parser.quote_string(disp)
         adrstr = ", ".join(str(x) for x in self.addresses)
         adrstr = ' ' + adrstr if adrstr else adrstr
         return "{}:{};".format(disp, adrstr)
diff --git a/Lib/email/utils.py b/Lib/email/utils.py
index 858f620e25bf..b137ce3973a4 100644
--- a/Lib/email/utils.py
+++ b/Lib/email/utils.py
@@ -259,21 +259,13 @@ def decode_params(params):
     params is a sequence of 2-tuples containing (param name, string value).
     """
-    # Copy params so we don't mess with the original
-    params = params[:]
-    new_params = []
+    new_params = [params[0]]
     # Map parameter's name to a list of continuations. The values are a
     # 3-tuple of the continuation number, the string value, and a flag
     # specifying whether a particular segment is %-encoded.
     rfc2231_params = {}
-    name, value = params.pop(0)
-    new_params.append((name, value))
-    while params:
-        name, value = params.pop(0)
-        if name.endswith('*'):
-            encoded = True
-        else:
-            encoded = False
+    for name, value in params[1:]:
+        encoded = name.endswith('*')
         value = unquote(value)
         mo = rfc2231_continuation.match(name)
         if mo:
diff --git a/Misc/NEWS.d/next/Library/2019-09-19-19-58-33.bpo-34002.KBnaVX.rst b/Misc/NEWS.d/next/Library/2019-09-19-19-58-33.bpo-34002.KBnaVX.rst
new file mode 100644
index 000000000000..5d4c231c74c6
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2019-09-19-19-58-33.bpo-34002.KBnaVX.rst
@@ -0,0 +1,2 @@
+Improve efficiency in parts of email package by changing while-pop to a for
+loop, using isdisjoint instead of set intersections.
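For readers who want to convince themselves that the while/pop loop and the for-loop visit the same items, the two shapes can be reduced to a small standalone sketch. The helper names ``split_pop`` and ``split_for`` and the sample parameter list are hypothetical and do not appear in ``Lib/email/utils.py``; only the loop structure is taken from the diff above.

```python
def split_pop(params):
    """Old shape: copy the list, then pop from the front until it is empty."""
    params = params[:]               # copy so the caller's list is untouched
    first = [params.pop(0)]
    rest = []
    while params:
        name, value = params.pop(0)  # each pop(0) shifts the remaining items
        if name.endswith('*'):
            encoded = True
        else:
            encoded = False
        rest.append((name, value, encoded))
    return first, rest


def split_for(params):
    """New shape: index the first item and iterate over a slice of the rest."""
    first = [params[0]]
    rest = []
    for name, value in params[1:]:
        encoded = name.endswith('*')  # endswith already returns a bool
        rest.append((name, value, encoded))
    return first, rest


# Hypothetical sample input: a sequence of (param name, string value) 2-tuples.
sample = [("text/plain", ""), ("title*", "us-ascii'en'Hello"), ("charset", "utf-8")]
assert split_pop(sample) == split_for(sample)
assert len(sample) == 3  # neither shape mutates the caller's list
```

Both shapes raise ``IndexError`` on an empty list, so the rewrite does not appear to change behavior for that case either.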