[Python-checkins] Check result of utc_to_seconds and skip fold probe in pure Python (GH-91582)

Sat May 14 10:59:56 EDT 2022

https://github.com/python/cpython/commit/dae3e2fea39067284724a81d9845e21cfac29e5a
commit: dae3e2fea39067284724a81d9845e21cfac29e5a
branch: 3.11
author: Miss Islington (bot) <31488909+miss-islington at users.noreply.github.com>
committer: miss-islington <31488909+miss-islington at users.noreply.github.com>
date: 2022-05-14T07:59:52-07:00
summary:

Check result of utc_to_seconds and skip fold probe in pure Python (GH-91582)


The `utc_to_seconds` call can fail, here's a minimal reproducer on
Linux:

TZ=UTC python -c "from datetime import *; datetime.fromtimestamp(253402300799 + 1)"

The old behavior still raised an error in a similar way, but only
because subsequent calculations happened to fail as well. Better to fail
fast.

This also refactors the tests to split out the `fromtimestamp` and
`utcfromtimestamp` tests, and to get us closer to the actual desired
limits of the functions. As part of this, we also changed the way we
detect platforms where the same limits don't necessarily apply (e.g.
Windows).

As part of refactoring the tests to hit this condition explicitly (even
though the user-facing behvior doesn't change in any way we plan to
guarantee), I noticed that there was a difference in the places that
`datetime.utcfromtimestamp` fails in the C and pure Python versions, which
was fixed by skipping the "probe for fold" logic for UTC specifically —
since UTC doesn't have any folds or gaps, we were never going to find a
fold value anyway. This should prevent some failures in the pure python
`utcfromtimestamp` method on timestamps close to 0001-01-01.

There are two separate news entries for this because one is a
potentially user-facing change, the other is an internal code
correctness change that, if anything, changes some error messages. The
two happen to be coupled because of the test refactoring, but they are
probably best thought of as independent changes.

Fixes GH-91581
(cherry picked from commit 83c0247d47b99f4571e35ea95361436e1d2a61cd)

Co-authored-by: Paul Ganssle <1377457+pganssle at users.noreply.github.com>

files:
A Misc/NEWS.d/next/Library/2022-04-15-13-16-25.gh-issue-91581.9OGsrN.rst
A Misc/NEWS.d/next/Library/2022-05-11-14-34-09.gh-issue-91581.glkou2.rst
M Lib/datetime.py
M Lib/test/datetimetester.py
M Modules/_datetimemodule.c

diff --git a/Lib/datetime.py b/Lib/datetime.py
index afbb6fed2ecb6..00ded32cc3e3c 100644
--- a/Lib/datetime.py
+++ b/Lib/datetime.py
@@ -1754,7 +1754,7 @@ def _fromtimestamp(cls, t, utc, tz):
         y, m, d, hh, mm, ss, weekday, jday, dst = converter(t)
         ss = min(ss, 59)    # clamp out leap seconds if the platform has them
         result = cls(y, m, d, hh, mm, ss, us, tz)
-        if tz is None:
+        if tz is None and not utc:
             # As of version 2015f max fold in IANA database is
             # 23 hours at 1969-09-30 13:00:00 in Kwajalein.
             # Let's probe 24 hours in the past to detect a transition:
@@ -1775,7 +1775,7 @@ def _fromtimestamp(cls, t, utc, tz):
                 probe2 = cls(y, m, d, hh, mm, ss, us, tz)
                 if probe2 == result:
                     result._fold = 1
-        else:
+        elif tz is not None:
             result = tz.fromutc(result)
         return result
 
diff --git a/Lib/test/datetimetester.py b/Lib/test/datetimetester.py
index 0495362b3f369..8e4bcc7c9efdc 100644
--- a/Lib/test/datetimetester.py
+++ b/Lib/test/datetimetester.py
@@ -2515,45 +2515,101 @@ def test_microsecond_rounding(self):
             self.assertEqual(t.microsecond, 7812)
 
     def test_timestamp_limits(self):
-        # minimum timestamp
-        min_dt = self.theclass.min.replace(tzinfo=timezone.utc)
+        with self.subTest("minimum UTC"):
+            min_dt = self.theclass.min.replace(tzinfo=timezone.utc)
+            min_ts = min_dt.timestamp()
+
+            # This test assumes that datetime.min == 0000-01-01T00:00:00.00
+            # If that assumption changes, this value can change as well
+            self.assertEqual(min_ts, -62135596800)
+
+        with self.subTest("maximum UTC"):
+            # Zero out microseconds to avoid rounding issues
+            max_dt = self.theclass.max.replace(tzinfo=timezone.utc,
+                                               microsecond=0)
+            max_ts = max_dt.timestamp()
+
+            # This test assumes that datetime.max == 9999-12-31T23:59:59.999999
+            # If that assumption changes, this value can change as well
+            self.assertEqual(max_ts, 253402300799.0)
+
+    def test_fromtimestamp_limits(self):
+        try:
+            self.theclass.fromtimestamp(-2**32 - 1)
+        except (OSError, OverflowError):
+            self.skipTest("Test not valid on this platform")
+
+        # XXX: Replace these with datetime.{min,max}.timestamp() when we solve
+        # the issue with gh-91012
+        min_dt = self.theclass.min + timedelta(days=1)
         min_ts = min_dt.timestamp()
+
+        max_dt = self.theclass.max.replace(microsecond=0)
+        max_ts = ((self.theclass.max - timedelta(hours=23)).timestamp() +
+                  timedelta(hours=22, minutes=59, seconds=59).total_seconds())
+
+        for (test_name, ts, expected) in [
+                ("minimum", min_ts, min_dt),
+                ("maximum", max_ts, max_dt),
+        ]:
+            with self.subTest(test_name, ts=ts, expected=expected):
+                actual = self.theclass.fromtimestamp(ts)
+
+                self.assertEqual(actual, expected)
+
+        # Test error conditions
+        test_cases = [
+            ("Too small by a little", min_ts - timedelta(days=1, hours=12).total_seconds()),
+            ("Too small by a lot", min_ts - timedelta(days=400).total_seconds()),
+            ("Too big by a little", max_ts + timedelta(days=1).total_seconds()),
+            ("Too big by a lot", max_ts + timedelta(days=400).total_seconds()),
+        ]
+
+        for test_name, ts in test_cases:
+            with self.subTest(test_name, ts=ts):
+                with self.assertRaises((ValueError, OverflowError)):
+                    # converting a Python int to C time_t can raise a
+                    # OverflowError, especially on 32-bit platforms.
+                    self.theclass.fromtimestamp(ts)
+
+    def test_utcfromtimestamp_limits(self):
         try:
-            # date 0001-01-01 00:00:00+00:00: timestamp=-62135596800
-            self.assertEqual(self.theclass.fromtimestamp(min_ts, tz=timezone.utc),
-                             min_dt)
-        except (OverflowError, OSError) as exc:
-            # the date 0001-01-01 doesn't fit into 32-bit time_t,
-            # or platform doesn't support such very old date
-            self.skipTest(str(exc))
-
-        # maximum timestamp: set seconds to zero to avoid rounding issues
-        max_dt = self.theclass.max.replace(tzinfo=timezone.utc,
-                                           second=0, microsecond=0)
+            self.theclass.utcfromtimestamp(-2**32 - 1)
+        except (OSError, OverflowError):
+            self.skipTest("Test not valid on this platform")
+
+        min_dt = self.theclass.min.replace(tzinfo=timezone.utc)
+        min_ts = min_dt.timestamp()
+
+        max_dt = self.theclass.max.replace(microsecond=0, tzinfo=timezone.utc)
         max_ts = max_dt.timestamp()
-        # date 9999-12-31 23:59:00+00:00: timestamp 253402300740
-        self.assertEqual(self.theclass.fromtimestamp(max_ts, tz=timezone.utc),
-                         max_dt)
-
-        # number of seconds greater than 1 year: make sure that the new date
-        # is not valid in datetime.datetime limits
-        delta = 3600 * 24 * 400
-
-        # too small
-        ts = min_ts - delta
-        # converting a Python int to C time_t can raise a OverflowError,
-        # especially on 32-bit platforms.
-        with self.assertRaises((ValueError, OverflowError)):
-            self.theclass.fromtimestamp(ts)
-        with self.assertRaises((ValueError, OverflowError)):
-            self.theclass.utcfromtimestamp(ts)
-
-        # too big
-        ts = max_dt.timestamp() + delta
-        with self.assertRaises((ValueError, OverflowError)):
-            self.theclass.fromtimestamp(ts)
-        with self.assertRaises((ValueError, OverflowError)):
-            self.theclass.utcfromtimestamp(ts)
+
+        for (test_name, ts, expected) in [
+                ("minimum", min_ts, min_dt.replace(tzinfo=None)),
+                ("maximum", max_ts, max_dt.replace(tzinfo=None)),
+        ]:
+            with self.subTest(test_name, ts=ts, expected=expected):
+                try:
+                    actual = self.theclass.utcfromtimestamp(ts)
+                except (OSError, OverflowError) as exc:
+                    self.skipTest(str(exc))
+
+                self.assertEqual(actual, expected)
+
+        # Test error conditions
+        test_cases = [
+            ("Too small by a little", min_ts - 1),
+            ("Too small by a lot", min_ts - timedelta(days=400).total_seconds()),
+            ("Too big by a little", max_ts + 1),
+            ("Too big by a lot", max_ts + timedelta(days=400).total_seconds()),
+        ]
+
+        for test_name, ts in test_cases:
+            with self.subTest(test_name, ts=ts):
+                with self.assertRaises((ValueError, OverflowError)):
+                    # converting a Python int to C time_t can raise a
+                    # OverflowError, especially on 32-bit platforms.
+                    self.theclass.utcfromtimestamp(ts)
 
     def test_insane_fromtimestamp(self):
         # It's possible that some platform maps time_t to double,
diff --git a/Misc/NEWS.d/next/Library/2022-04-15-13-16-25.gh-issue-91581.9OGsrN.rst b/Misc/NEWS.d/next/Library/2022-04-15-13-16-25.gh-issue-91581.9OGsrN.rst
new file mode 100644
index 0000000000000..1c3008f425578
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2022-04-15-13-16-25.gh-issue-91581.9OGsrN.rst
@@ -0,0 +1,6 @@
+Remove an unhandled error case in the C implementation of calls to
+:meth:`datetime.fromtimestamp <datetime.datetime.fromtimestamp>` with no time
+zone (i.e. getting a local time from an epoch timestamp). This should have no
+user-facing effect other than giving a possibly more accurate error message
+when called with timestamps that fall on 10000-01-01 in the local time. Patch
+by Paul Ganssle.
diff --git a/Misc/NEWS.d/next/Library/2022-05-11-14-34-09.gh-issue-91581.glkou2.rst b/Misc/NEWS.d/next/Library/2022-05-11-14-34-09.gh-issue-91581.glkou2.rst
new file mode 100644
index 0000000000000..846f57844a675
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2022-05-11-14-34-09.gh-issue-91581.glkou2.rst
@@ -0,0 +1,5 @@
+:meth:`~datetime.datetime.utcfromtimestamp` no longer attempts to resolve
+``fold`` in the pure Python implementation, since the fold is never 1 in UTC.
+In addition to being slightly faster in the common case, this also prevents
+some errors when the timestamp is close to :attr:`datetime.min
+<datetime.datetime.min>`.  Patch by Paul Ganssle.
diff --git a/Modules/_datetimemodule.c b/Modules/_datetimemodule.c
index efb5278038f2f..e0bb4ee602c42 100644
--- a/Modules/_datetimemodule.c
+++ b/Modules/_datetimemodule.c
@@ -5071,6 +5071,10 @@ datetime_from_timet_and_us(PyObject *cls, TM_FUNC f, time_t timet, int us,
 
         result_seconds = utc_to_seconds(year, month, day,
                                         hour, minute, second);
+        if (result_seconds == -1 && PyErr_Occurred()) {
+            return NULL;
+        }
+
         /* Probe max_fold_seconds to detect a fold. */
         probe_seconds = local(epoch + timet - max_fold_seconds);
         if (probe_seconds == -1)