Re: [Python-Dev] [Python-checkins] cpython (merge 3.2 -> default): Fix out of bounds read in long_new() for empty bytes with an explicit base.
christian.heimes <python-checkins@python.org> wrote:
Fix out of bounds read in long_new() for empty bytes with an explicit base. int(b'', somebase) calls PyLong_FromString() with char* of length 1 but the function accesses the first argument at offset 1. CID 715359
files: Objects/longobject.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/Objects/longobject.c b/Objects/longobject.c --- a/Objects/longobject.c +++ b/Objects/longobject.c @@ -4285,8 +4285,8 @@ string = PyByteArray_AS_STRING(x); else string = PyBytes_AS_STRING(x); - if (strlen(string) != (size_t)size) { - /* We only see this if there's a null byte in x, + if (strlen(string) != (size_t)size || !size) { + /* We only see this if there's a null byte in x or x is empty, x is a bytes or buffer, *and* a base is given. */ PyErr_Format(PyExc_ValueError, "invalid literal for int() with base %d: %R",
This is a false positive: Assumption: string == "" Call: PyLong_FromString("", NULL, (int)base); Now: str == "" Coverity claims an invalid access at str[1]: if (str[0] == '0' && ((base == 16 && (str[1] == 'x' || str[1] == 'X')) || (base == 8 && (str[1] == 'o' || str[1] == 'O')) || (base == 2 && (str[1] == 'b' || str[1] == 'B')))) But str[1] is never accessed due to shortcut evaluation. Coverity appears to have serious problems with shortcut evaluations in many places. Stefan Krah
Am 12.09.2012 16:22, schrieb Stefan Krah:
This is a false positive:
Assumption: string == ""
Call: PyLong_FromString("", NULL, (int)base);
Now: str == ""
Coverity claims an invalid access at str[1]:
if (str[0] == '0' && ((base == 16 && (str[1] == 'x' || str[1] == 'X')) || (base == 8 && (str[1] == 'o' || str[1] == 'O')) || (base == 2 && (str[1] == 'b' || str[1] == 'B'))))
But str[1] is never accessed due to shortcut evaluation.
Coverity appears to have serious problems with shortcut evaluations in many places.
You might be right. But did you notice that there is much more code beyond the large comment block in PyLong_FromString()? There might be other code paths that push str beyond its limit. My change adds an early opt out in an error case and doesn't cause a performance degradation. I'd have no hard feeling if you'd prefer a revert but I'd keep the modification as it causes no harm.
Christian Heimes <lists@cheimes.de> wrote:
Am 12.09.2012 16:22, schrieb Stefan Krah:
This is a false positive:
You might be right. But did you notice that there is much more code beyond the large comment block in PyLong_FromString()? There might be other code paths that push str beyond its limit.
Yes, I understand. My reasoning was different: The str[1] location Coverity pointed out is a false positive. I checked other locations and they seem to be okay, too. Now, because there's so much code my first instinct would be not to touch it unless there's a proven invalid access. This is to avoid subtle behavior changes.
My change adds an early opt out in an error case and doesn't cause a performance degradation. I'd have no hard feeling if you'd prefer a revert but I'd keep the modification as it causes no harm.
As far as I can see, only the error message is affected. Previously:
int(b'', 0) Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: invalid literal for int() with base 10: ''
Now the fact that base=0 is converted to base=10 is lost:
int(b'', 0) Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: invalid literal for int() with base 0: b''
No big deal of course, but still a change. Stefan Krah
On 9/12/2012 10:22 AM, Stefan Krah wrote:
christian.heimes <python-checkins@python.org> wrote:
Fix out of bounds read in long_new() for empty bytes with an explicit base. int(b'', somebase) calls PyLong_FromString() with char* of length 1
I don't know what happens internally, but such calls raise ValueError: invalid literal for int() with base 16: '' Of course, even if int() traps such calls before calling PyLong_FromString, an extension writer could goof. Does the length 1 come from added \0? By the way, excessively long lines in checkin messages are a nuisance from reading and responding ;-). -- Terry Jan Reedy
Am 12.09.2012 18:14, schrieb Terry Reedy:
On 9/12/2012 10:22 AM, Stefan Krah wrote:
christian.heimes <python-checkins@python.org> wrote:
Fix out of bounds read in long_new() for empty bytes with an explicit base. int(b'', somebase) calls PyLong_FromString() with char* of length 1
I don't know what happens internally, but such calls raise ValueError: invalid literal for int() with base 16: '' Of course, even if int() traps such calls before calling PyLong_FromString, an extension writer could goof.
Does the length 1 come from added \0?
Coverity (a static code analyzing tool) claims that the some code paths may read beyond the end of data when an empty byte string and any base is given. Internally b'' is converted to a null terminated char array (PyBytes_AS_STRING() returns a null terminated char*). My change shortcuts the execution path and immediately raises an exception for the combination of b'' and some base.
By the way, excessively long lines in checkin messages are a nuisance from reading and responding ;-).
Sorry! In the future I'll add more line breaks. :)
participants (3)
-
Christian Heimes
-
Stefan Krah
-
Terry Reedy