[Python-checkins] Add a minor `Fraction.__hash__()` optimization (GH-15313)

Tim Peters webhook-mailer at python.org
Fri Aug 16 22:09:32 EDT 2019


https://github.com/python/cpython/commit/29bb227a0ce6d355a2b3e5d6a25872e3702ba9bb
commit: 29bb227a0ce6d355a2b3e5d6a25872e3702ba9bb
branch: master
author: Tim Peters <tim.peters at gmail.com>
committer: GitHub <noreply at github.com>
date: 2019-08-16T21:09:16-05:00
summary:

Add a minor `Fraction.__hash__()` optimization (GH-15313)

* Add a minor `Fraction.__hash__` optimization that got lost in the shuffle.

Document the optimizations.

files:
M Lib/fractions.py

diff --git a/Lib/fractions.py b/Lib/fractions.py
index c922c38e2441..2e7047a81844 100644
--- a/Lib/fractions.py
+++ b/Lib/fractions.py
@@ -564,10 +564,25 @@ def __hash__(self):
         try:
             dinv = pow(self._denominator, -1, _PyHASH_MODULUS)
         except ValueError:
-            # ValueError means there is no modular inverse
+            # ValueError means there is no modular inverse.
             hash_ = _PyHASH_INF
         else:
-            hash_ = hash(abs(self._numerator)) * dinv % _PyHASH_MODULUS
+            # The general algorithm now specifies that the absolute value of
+            # the hash is
+            #    (|N| * dinv) % P
+            # where N is self._numerator and P is _PyHASH_MODULUS.  That's
+            # optimized here in two ways:  first, for a non-negative int i,
+            # hash(i) == i % P, but the int hash implementation doesn't need
+            # to divide, and is faster than doing % P explicitly.  So we do
+            #    hash(|N| * dinv)
+            # instead.  Second, N is unbounded, so its product with dinv may
+            # be arbitrarily expensive to compute.  The final answer is the
+            # same if we use the bounded |N| % P instead, which can again
+            # be done with an int hash() call.  If 0 <= i < P, hash(i) == i,
+            # so this nested hash() call wastes a bit of time making a
+            # redundant copy when |N| < P, but can save an arbitrarily large
+            # amount of computation for large |N|.
+            hash_ = hash(hash(abs(self._numerator)) * dinv)
         result = hash_ if self._numerator >= 0 else -hash_
         return -2 if result == -1 else result
 



More information about the Python-checkins mailing list