[Python-checkins] cpython (2.7): Issue #17043: The unicode-internal decoder no longer reads past the end of

serhiy.storchaka python-checkins at python.org
Thu Feb 7 15:30:48 CET 2013


http://hg.python.org/cpython/rev/498b54e0e856
changeset:   82050:498b54e0e856
branch:      2.7
parent:      82046:f7cc6fbd7ae1
user:        Serhiy Storchaka <storchaka at gmail.com>
date:        Thu Feb 07 16:23:11 2013 +0200
summary:
  Issue #17043: The unicode-internal decoder no longer reads past the end of
the input buffer.

files:
  Misc/NEWS               |   3 +
  Objects/unicodeobject.c |  51 +++++++++++++---------------
  2 files changed, 27 insertions(+), 27 deletions(-)


diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -9,6 +9,9 @@
 Core and Builtins
 -----------------
 
+- Issue #17043: The unicode-internal decoder no longer reads past the end of
+  the input buffer.
+
 - Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder.
 
 - Issue #10156: In the interpreter's initialization phase, unicode globals
diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
--- a/Objects/unicodeobject.c
+++ b/Objects/unicodeobject.c
@@ -3376,37 +3376,34 @@
     end = s + size;
 
     while (s < end) {
+        if (end-s < Py_UNICODE_SIZE) {
+            endinpos = end-starts;
+            reason = "truncated input";
+            goto error;
+        }
         memcpy(p, s, sizeof(Py_UNICODE));
+#ifdef Py_UNICODE_WIDE
         /* We have to sanity check the raw data, otherwise doom looms for
            some malformed UCS-4 data. */
-        if (
-#ifdef Py_UNICODE_WIDE
-            *p > unimax || *p < 0 ||
+        if (*p > unimax || *p < 0) {
+            endinpos = s - starts + Py_UNICODE_SIZE;
+            reason = "illegal code point (> 0x10FFFF)";
+            goto error;
+        }
 #endif
-            end-s < Py_UNICODE_SIZE
-            )
-        {
-            startinpos = s - starts;
-            if (end-s < Py_UNICODE_SIZE) {
-                endinpos = end-starts;
-                reason = "truncated input";
-            }
-            else {
-                endinpos = s - starts + Py_UNICODE_SIZE;
-                reason = "illegal code point (> 0x10FFFF)";
-            }
-            outpos = p - PyUnicode_AS_UNICODE(v);
-            if (unicode_decode_call_errorhandler(
-                    errors, &errorHandler,
-                    "unicode_internal", reason,
-                    starts, size, &startinpos, &endinpos, &exc, &s,
-                    &v, &outpos, &p)) {
-                goto onError;
-            }
-        }
-        else {
-            p++;
-            s += Py_UNICODE_SIZE;
+        p++;
+        s += Py_UNICODE_SIZE;
+        continue;
+
+  error:
+        startinpos = s - starts;
+        outpos = p - PyUnicode_AS_UNICODE(v);
+        if (unicode_decode_call_errorhandler(
+                errors, &errorHandler,
+                "unicode_internal", reason,
+                starts, size, &startinpos, &endinpos, &exc, &s,
+                &v, &outpos, &p)) {
+            goto onError;
         }
     }
 

-- 
Repository URL: http://hg.python.org/cpython
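
The core of the change above is ordering: the remaining-length check now runs
before memcpy() instead of after it, so a trailing partial unit is never read.
The following is a minimal standalone sketch of that pattern, not the CPython
code itself. The names decode_units, decode_status, and MAX_CODE_POINT are
hypothetical, uint32_t stands in for Py_UNICODE on a wide build, and the sketch
stops at the first error, whereas the real decoder passes the error to
unicode_decode_call_errorhandler and may continue.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration; none of these exist in CPython. */
#define MAX_CODE_POINT 0x10FFFFu

typedef enum { DECODE_OK, DECODE_TRUNCATED, DECODE_ILLEGAL } decode_status;

/* Copy native-endian 32-bit units from src into out (at most out_cap units).
   *n_out receives the number of units written. *err_pos receives the byte
   offset where decoding stopped (the offending unit on error). */
decode_status decode_units(const unsigned char *src, size_t size,
                           uint32_t *out, size_t out_cap,
                           size_t *n_out, size_t *err_pos)
{
    const unsigned char *s = src;
    const unsigned char *end = src + size;
    size_t n = 0;
    decode_status status = DECODE_OK;

    while (s < end && n < out_cap) {
        uint32_t unit;

        /* Check the remaining length BEFORE copying, so memcpy never
           reads past the end of the input buffer. */
        if ((size_t)(end - s) < sizeof unit) {
            status = DECODE_TRUNCATED;
            break;
        }
        memcpy(&unit, s, sizeof unit);

        /* Only now is it safe to inspect the value. */
        if (unit > MAX_CODE_POINT) {
            status = DECODE_ILLEGAL;
            break;
        }
        out[n++] = unit;
        s += sizeof unit;
    }

    *n_out = n;
    *err_pos = (size_t)(s - src);
    return status;
}
```

With a 6-byte input holding one complete unit followed by two stray bytes, this
sketch reports DECODE_TRUNCATED at byte offset 4 without touching memory beyond
the buffer. The pre-fix ordering would have copied four bytes starting at that
offset before noticing the shortfall.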

