[Python-checkins] [3.8] bpo-36384: Leading zeros in IPv4 addresses are no longer tolerated (GH-25099) (GH-27801)

ambv webhook-mailer at python.org
Tue Aug 17 19:46:43 EDT 2021


https://github.com/python/cpython/commit/03dd89d62413c4a92831ed1b36e2ae8983bcb2d4
commit: 03dd89d62413c4a92831ed1b36e2ae8983bcb2d4
branch: 3.8
author: achraf-mer <51244975+achraf-mer at users.noreply.github.com>
committer: ambv <lukasz at langa.pl>
date: 2021-08-18T01:46:37+02:00
summary:

[3.8] bpo-36384: Leading zeros in IPv4 addresses are no longer tolerated (GH-25099) (GH-27801)

Reverts commit e653d4d8e820a7a004ad399530af0135b45db27a and makes
parsing even more strict. Like socket.inet_pton() any leading zero
is now treated as invalid input.

Signed-off-by: Christian Heimes <christian at python.org>

Co-authored-by: Łukasz Langa <lukasz at langa.pl>
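For illustration only, here is a minimal standalone sketch of the stricter octet check described in the commit message. The names parse_octet_strict and is_accepted are hypothetical and not part of the ipaddress API; the real logic lives in the private _parse_octet method in Lib/ipaddress.py. The empty-string and digits-only checks are assumptions about validation that the diff hunk does not show.

```python
def parse_octet_strict(octet_str: str) -> int:
    """Convert one dotted-decimal octet to an int, rejecting leading zeros.

    Hypothetical helper for illustration; not part of the ipaddress API.
    """
    # Assumed pre-checks (not shown in the diff): non-empty, ASCII digits only.
    if not octet_str:
        raise ValueError("Empty octet not permitted")
    if not (octet_str.isascii() and octet_str.isdigit()):
        raise ValueError("Only decimal digits permitted in %r" % octet_str)
    if len(octet_str) > 3:
        raise ValueError("At most 3 characters permitted in %r" % octet_str)
    # Any leading zero is an error; the single digit '0' remains valid.
    if octet_str != "0" and octet_str[0] == "0":
        raise ValueError("Leading zeros are not permitted in %r" % octet_str)
    octet_int = int(octet_str, 10)
    if octet_int > 255:
        raise ValueError("Octet %d (> 255) not permitted" % octet_int)
    return octet_int


def is_accepted(address: str) -> bool:
    """Return True if all four octets of the address pass the strict check."""
    parts = address.split(".")
    if len(parts) != 4:
        return False
    try:
        for part in parts:
            parse_octet_strict(part)
    except ValueError:
        return False
    return True


for candidate in ("192.168.0.1", "0.0.0.0", "192.168.000.001", "1.2.3.040"):
    print(candidate, is_accepted(candidate))
```

Under these assumptions the first two candidates are accepted and the last two are rejected, matching the addresses exercised in the updated test below.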

files:
A Misc/NEWS.d/next/Security/2021-03-30-16-29-51.bpo-36384.sCAmLs.rst
M Doc/library/ipaddress.rst
M Doc/whatsnew/3.8.rst
M Lib/ipaddress.py
M Lib/test/test_ipaddress.py

diff --git a/Doc/library/ipaddress.rst b/Doc/library/ipaddress.rst
index 2cdfddb02e96c4..5e21d5db2ed9c3 100644
--- a/Doc/library/ipaddress.rst
+++ b/Doc/library/ipaddress.rst
@@ -104,8 +104,7 @@ write code that handles both IP versions correctly.  Address objects are
    1. A string in decimal-dot notation, consisting of four decimal integers in
       the inclusive range 0--255, separated by dots (e.g. ``192.168.0.1``). Each
       integer represents an octet (byte) in the address. Leading zeroes are
-      tolerated only for values less than 8 (as there is no ambiguity
-      between the decimal and octal interpretations of such strings).
+      not tolerated to prevent confusion with octal notation.
    2. An integer that fits into 32 bits.
    3. An integer packed into a :class:`bytes` object of length 4 (most
       significant octet first).
@@ -117,6 +116,18 @@ write code that handles both IP versions correctly.  Address objects are
    >>> ipaddress.IPv4Address(b'\xC0\xA8\x00\x01')
    IPv4Address('192.168.0.1')
 
+
+   .. versionchanged:: 3.8
+
+      Leading zeros are tolerated, even in ambiguous cases that look like
+      octal notation.
+
+   .. versionchanged:: 3.8.12
+
+      Leading zeros are no longer tolerated and are treated as an error.
+      IPv4 address strings are now parsed as strictly as glibc's
+      :func:`~socket.inet_pton`.
+
    .. attribute:: version
 
       The appropriate version number: ``4`` for IPv4, ``6`` for IPv6.
diff --git a/Doc/whatsnew/3.8.rst b/Doc/whatsnew/3.8.rst
index 109a06e92efb7d..7a3460adb3827f 100644
--- a/Doc/whatsnew/3.8.rst
+++ b/Doc/whatsnew/3.8.rst
@@ -2307,3 +2307,19 @@ URL by the parser in :mod:`urllib.parse` preventing such attacks. The removal
 characters are controlled by a new module level variable
 ``urllib.parse._UNSAFE_URL_BYTES_TO_REMOVE``. (See :issue:`43882`)
 
+
+Notable changes in Python 3.8.12
+================================
+
+Changes in the Python API
+-------------------------
+
+Starting with Python 3.8.12 the :mod:`ipaddress` module no longer accepts
+any leading zeros in IPv4 address strings. Leading zeros are ambiguous and
+interpreted as octal notation by some libraries. For example, the legacy
+function :func:`socket.inet_aton` treats leading zeros as octal notation.
+The glibc implementation of modern :func:`~socket.inet_pton` does not accept
+any leading zeros.
+
+(Originally contributed by Christian Heimes in :issue:`36384`, and backported
+to 3.8 by Achraf Merzouki)
diff --git a/Lib/ipaddress.py b/Lib/ipaddress.py
index 28b7b6159a62ef..d351f07a5bd960 100644
--- a/Lib/ipaddress.py
+++ b/Lib/ipaddress.py
@@ -1173,6 +1173,11 @@ def _parse_octet(cls, octet_str):
         if len(octet_str) > 3:
             msg = "At most 3 characters permitted in %r"
             raise ValueError(msg % octet_str)
+        # Handle leading zeros as strictly as glibc's inet_pton()
+        # See security bug bpo-36384
+        if octet_str != '0' and octet_str[0] == '0':
+            msg = "Leading zeros are not permitted in %r"
+            raise ValueError(msg % octet_str)
         # Convert to integer (we know digits are legal)
         octet_int = int(octet_str, 10)
         if octet_int > 255:
diff --git a/Lib/test/test_ipaddress.py b/Lib/test/test_ipaddress.py
index 2f1c5b6b6fb9c8..1297b8371d8583 100644
--- a/Lib/test/test_ipaddress.py
+++ b/Lib/test/test_ipaddress.py
@@ -97,10 +97,23 @@ def pickle_test(self, addr):
 class CommonTestMixin_v4(CommonTestMixin):
 
     def test_leading_zeros(self):
-        self.assertInstancesEqual("000.000.000.000", "0.0.0.0")
-        self.assertInstancesEqual("192.168.000.001", "192.168.0.1")
-        self.assertInstancesEqual("016.016.016.016", "16.16.16.16")
-        self.assertInstancesEqual("001.000.008.016", "1.0.8.16")
+        # bpo-36384: no leading zeros to avoid ambiguity with octal notation
+        msg = r"Leading zeros are not permitted in '\d+'"
+        addresses = [
+            "000.000.000.000",
+            "192.168.000.001",
+            "016.016.016.016",
+            "192.168.000.001",
+            "001.000.008.016",
+            "01.2.3.40",
+            "1.02.3.40",
+            "1.2.03.40",
+            "1.2.3.040",
+        ]
+        for address in addresses:
+            with self.subTest(address=address):
+                with self.assertAddressError(msg):
+                    self.factory(address)
 
     def test_int(self):
         self.assertInstancesEqual(0, "0.0.0.0")
diff --git a/Misc/NEWS.d/next/Security/2021-03-30-16-29-51.bpo-36384.sCAmLs.rst b/Misc/NEWS.d/next/Security/2021-03-30-16-29-51.bpo-36384.sCAmLs.rst
new file mode 100644
index 00000000000000..f956cde948ec57
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2021-03-30-16-29-51.bpo-36384.sCAmLs.rst
@@ -0,0 +1,6 @@
+:mod:`ipaddress` module no longer accepts any leading zeros in IPv4 address
+strings. Leading zeros are ambiguous and interpreted as octal notation by
+some libraries. For example, the legacy function :func:`socket.inet_aton`
+treats leading zeros as octal notation. The glibc implementation of modern
+:func:`~socket.inet_pton` does not accept any leading zeros. For a while
+the :mod:`ipaddress` module used to accept ambiguous leading zeros.
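
To make the ambiguity described in the NEWS entry concrete, the sketch below reads the same address string twice: once with each octet taken as decimal, and once with each octet taken as octal, which is how the entry says some libraries treat a leading zero. The helper read_octets is hypothetical and only illustrates the two readings; it does not call or reproduce any real parser.

```python
def read_octets(address: str, base: int) -> str:
    """Re-render a dotted address after parsing each octet in the given base.

    Hypothetical helper for illustration only.
    """
    return ".".join(str(int(part, base)) for part in address.split("."))


ambiguous = "016.016.016.016"
print(read_octets(ambiguous, 10))  # decimal reading: 16.16.16.16
print(read_octets(ambiguous, 8))   # octal reading:   14.14.14.14
```

Because two components could disagree about which host such a string names, rejecting the input outright is the conservative choice.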


