About vulnerabilities in CPython native code
Hi all,

I am currently doing some research on the security of CPython. I used the open-source vulnerability analysis engine Infer (https://fbinfer.com/) to scan the native code of CPython 3.10.0.

The scan results show that there are still a number of vulnerabilities in the CPython native code, such as null dereferences, uninitialized variables, and resource/memory leaks. Moreover, I found that some of the vulnerabilities are related to the Python/C API. I enclose the vulnerability report for your reference.

Based on my analysis of the results, I tried to design a tool to automatically detect and repair vulnerabilities in CPython, and I have made this tool available. See:

https://github.com/PVMPATCH/PVMPatch

Python is my favourite programming language. I sincerely hope that I can help Python become stronger and safer. I hope this discovery will be useful to you in developing Python in the future.

Thank you for your time and consideration!

Lin
This is also at https://bugs.python.org/issue46280. Please direct comments there.

Eric
On Fri, Jan 7, 2022 at 1:59 AM lxr1210--- via Python-Dev <python-dev@python.org> wrote:
> Hi all,
>
> I am currently doing some research on the security of CPython. I used the open source vulnerability analysis engine, Infer(https://fbinfer.com/), to scan the native code of CPython 3.10.0.
>
> The scan results show that there are still a number of vulnerabilities in the CPython native code, such as Null dereference, Uninitialized variable, Resource/Memory leak, etc. Moreover, I found that some of the vulnerabilities are related to Python/C API. I enclose the vulnerability report for your reference.
Tool needs some improvements. Py_CLEAR is documented as doing nothing if given a null pointer. All of the "value not used" complaints seem to be places where something is coded in a consistent way, such as repeatedly incrementing a value and comparing it to something, so it would be a code maintenance hassle to do things differently on the last one.

I checked a few of the null dereference complaints. The tool seems concerned that PyUnicode_DATA(str) might return NULL, that assert(state != NULL) won't stop the function, and that PyThreadState_Get might return NULL (it's documented as bombing with a fatal error in such a situation, so you specifically don't have to check).

The complaint about unicodedata.c line 168 is a bit subtler, but the tool isn't able to recognize that rc is always initialized (either have_old will be set and rc is set, or if have_old isn't set, then rc will be set).

It would make validation a LOT easier if the complaints could be grouped.

ChrisA
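A minimal sketch of the grouping suggested above, assuming the findings are available as a list of records with `bug_type`, `file`, and `line` keys. Infer can write its findings as JSON, but the field names used here are an assumption to check against the actual report; `group_findings` and the sample records are hypothetical and not taken from the posted report.

```python
from collections import defaultdict


def group_findings(findings):
    """Group report entries by bug type; sort locations within each group."""
    groups = defaultdict(list)
    for entry in findings:
        groups[entry["bug_type"]].append((entry["file"], entry["line"]))
    # Sort the bug types and the (file, line) pairs for stable output.
    return {bug_type: sorted(locs) for bug_type, locs in sorted(groups.items())}


# Made-up records for illustration only; not from the actual report.
sample = [
    {"bug_type": "NULL_DEREFERENCE", "file": "b.c", "line": 40},
    {"bug_type": "DEAD_STORE", "file": "a.c", "line": 12},
    {"bug_type": "NULL_DEREFERENCE", "file": "a.c", "line": 7},
]

for bug_type, locations in group_findings(sample).items():
    print(f"{bug_type}: {len(locations)} finding(s)")
    for file, line in locations:
        print(f"  {file}:{line}")
```

Grouped this way, a whole category such as the "value not used" complaints could be reviewed or dismissed in one pass instead of entry by entry.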
On 06. 01. 22 14:22, lxr1210--- via Python-Dev wrote:
> Hi all,
>
> I am currently doing some research on the security of CPython. I used the open source vulnerability analysis engine, Infer(https://fbinfer.com/), to scan the native code of CPython 3.10.0.
>
> The scan results show that there are still a number of vulnerabilities in the CPython native code, such as Null dereference, Uninitialized variable, Resource/Memory leak, etc. Moreover, I found that some of the vulnerabilities are related to Python/C API. I enclose the vulnerability report for your reference.
The first page of these looks like false positives (except one might be a flaw in test code). But that's par for the course.

I've spent a lot of time digging through reports like these. Sometimes there's a bug worth fixing, sometimes it's even an actual vulnerability, but in my experience, most of what tools find in CPython is not actionable.

If you do find a security vulnerability, consider reporting it privately to the security team: see https://www.python.org/dev/security/
On 06/01/2022 15:21, Petr Viktorin wrote:
> Sometimes there's a bug worth fixing, sometimes it's even an actual vulnerability, but in my experience, most of what tools find in CPython is not actionable.
>
> If you do find a security vulnerability, consider reporting it privately to the security team: see https://www.python.org/dev/security/
And Python is not like JavaScript (in the browser), where code is supposed to be run in a total sandbox. Python is not supposed to be a completely memory-safe language. You can always access memory manually using `ctypes`, or, ultimately, `/proc/self/mem`. For this reason, a buffer overflow in CPython is a bug because it can cause a crash, not because it can cause a security vulnerability.
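As a minimal sketch of the `ctypes` point, assuming CPython (where `id()` happens to return the object's address, an implementation detail) and the usual layout in which a `bytes` object keeps its data at the end of its fixed-size header: the hypothetical helper `peek_bytes` reads the payload straight out of process memory, bypassing the object's normal interface.

```python
import ctypes


def peek_bytes(obj: bytes) -> bytes:
    """Read a bytes object's payload directly from memory (CPython only)."""
    # Assumption: bytes.__basicsize__ covers the header plus one trailing
    # byte, so the payload starts one byte before that offset.
    data_offset = bytes.__basicsize__ - 1
    # Assumption: id() is the object's memory address (true on CPython).
    return ctypes.string_at(id(obj) + data_offset, len(obj))


secret = b"hello"
print(peek_bytes(secret))
```

This only reads memory, so it is harmless, but the same address arithmetic works for writes, which is why running arbitrary Python code is already equivalent to full access to the process.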
Patrick Reader writes:
> And Python is not like JavaScript (in the browser), where code is supposed to be run in a total sandbox. Python is not supposed to be a completely memory-safe language. You can always access memory manually using `ctypes`, or, ultimately, `/proc/self/mem`.

True enough, but

> For this reason, a buffer overflow in CPython is a bug because it can cause a crash, not because it can cause a security vulnerability.

A crash *is* a (potential) security vulnerability. If it can be reliably triggered by user input, it's a denial of service.

Steve
On Fri, Jan 7, 2022 at 2:57 PM Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
> Patrick Reader writes:
> > And Python is not like JavaScript (in the browser), where code is supposed to be run in a total sandbox. Python is not supposed to be a completely memory-safe language. You can always access memory manually using `ctypes`, or, ultimately, `/proc/self/mem`.
>
> True enough, but
>
> > For this reason, a buffer overflow in CPython is a bug because it can cause a crash, not because it can cause a security vulnerability.
>
> A crash *is* a (potential) security vulnerability. If it can be reliably triggered by user input, it's a denial of service.

Python source code is not user input though. So there has to be a way for someone to attack a Python-based service, like attacking a web app by sending HTTP requests to it.

ChrisA
Chris Angelico writes:
> Python source code is not user input though. So there has to be a way for someone to attack a Python-based service, like attacking a web app by sending HTTP requests to it.

Not sure what your point is. Of course there has to be a vector. But as a Mailman developer, I can assure you that there are Python programs facing the web that accept HTTP requests and SMTP messages, and process the content, which could be anything an attacker wants it to be.

I can't recall any CVEs that we could trace to Python (rather than our code :-/), but Mailman can be and has been attacked. I can imagine that if there was an RCE vulnerability in Python or a C module we use, Mailman would be a top candidate for a workable exploit because of the amount of processing of user-supplied text we must do. (Don't worry about me, I sleep well anyway. Python is pretty bullet-proof IMO ;-)

Did I completely misunderstand you, or the previous posters?

Steve
On Fri, Jan 7, 2022 at 6:09 PM Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
> Chris Angelico writes:
> > Python source code is not user input though. So there has to be a way for someone to attack a Python-based service, like attacking a web app by sending HTTP requests to it.
>
> Not sure what your point is. Of course there has to be a vector. But as a Mailman developer, I can assure you that there are Python programs facing the web that accept HTTP requests and SMTP messages, and process the content, which could be anything an attacker wants it to be.
>
> I can't recall any CVEs that we could trace to Python (rather than our code :-/), but Mailman can be and has been attacked. I can imagine that if there was an RCE vulnerability in Python or a C module we use, Mailman would be a top candidate for a workable exploit because of the amount of processing of user-supplied text we must do. (Don't worry about me, I sleep well anyway. Python is pretty bullet-proof IMO ;-)
>
> Did I completely misunderstand you, or the previous posters?

Not completely, just very minorly. I'm distinguishing between attacks that can be triggered remotely, and those which require the attacker to run specific Python code. For example, using ctypes to change the value of an integer object is not an attack vector, because there's no way for an HTTP or SMTP message to cause you to do that. There are *plenty* of ways to abuse ctypes to crash CPython, and we're not afraid of them, because we don't do that kind of thing in public-facing code. :)

(If there is a way for an attacker to run arbitrary Python code (maybe by abusing a templating system), then that is its own attack vector, since anything can be done, without any sort of interpreter crash.)

My distinction here is that the source code for Mailman itself is not "user input" any more than the source code for CPython is.

ChrisA
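For concreteness, a sketch of the kind of ctypes abuse mentioned above, under loud assumptions: CPython, `id()` being the object's address, and an int whose value is stored as a single 30-bit digit in a 32-bit word directly after the fixed-size header. `overwrite_int` is a hypothetical helper written for this illustration; it refuses to touch small cached ints, since those are shared across the interpreter.

```python
import ctypes
import sys


def overwrite_int(target: int, new_value: int) -> None:
    """Overwrite the single internal digit of an int in place (CPython only)."""
    # Assumption: 30-bit digits held in 32-bit words right after the header.
    if sys.int_info.bits_per_digit != 30:
        raise RuntimeError("this sketch assumes 30-bit digits")
    # Stay clear of small cached ints and keep both values in one digit.
    if not (2**16 < target < 2**30 and 0 < new_value < 2**30):
        raise ValueError("values must be above 65536 and fit in one digit")
    # Assumption: id() is the object's memory address (true on CPython).
    digit = ctypes.c_uint32.from_address(id(target) + int.__basicsize__)
    digit.value = new_value


n = int("123456")  # built at run time, so not a shared code constant
overwrite_int(n, 654321)
print(n == 654321)
```

Nothing an HTTP or SMTP client sends can make a service execute code like this, which is the distinction being drawn: it requires the ability to run chosen Python code in the first place.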
Chris Angelico writes:
> Not completely, just very minorly. I'm distinguishing between attacks that can be triggered remotely, and those which require the attacker to run specific Python code. For example, using ctypes

OK. AFAICT that was a red herring introduced to the thread solely to support the claim "Python isn't memory-safe [anyway]" so it's not reasonable to claim a Python bug is a vulnerability. The original post didn't depend on ctypes or anything like that; it claimed there *might* be vulnerabilities in CPython's C code. If so, my claim is that they would indeed be security-relevant, regardless of what users with access to Python source might or might not be doing.

Steve
On Sun, Jan 9, 2022 at 7:35 PM Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
> Chris Angelico writes:
> > Not completely, just very minorly. I'm distinguishing between attacks that can be triggered remotely, and those which require the attacker to run specific Python code. For example, using ctypes
>
> OK. AFAICT that was a red herring introduced to the thread solely to support the claim "Python isn't memory-safe [anyway]" so it's not reasonable to claim a Python bug is a vulnerability. The original post didn't depend on ctypes or anything like that; it claimed there *might* be vulnerabilities in CPython's C code. If so, my claim is that they would indeed be security-relevant, regardless of what users with access to Python source might or might not be doing.

That's entirely possible, but I eyeballed a number of the examples cited, and they weren't (you can't use an HTTP request to trigger test code, as far as I know). For any of these to be viable issues, they would have to be triggered somehow, and in many cases, it's far from obvious how you might do that.

The problem is that this is a single monster report with a huge number of uninteresting concerns (the "value written to but never read" ones), interspersed with a number of complaints which aren't actually issues (like calling Py_CLEAR with a potentially null pointer, which is perfectly safe). It's hard to know whether there are any real issues, without spending a lot of time weeding through the nonissues.

ChrisA
participants (6)
- Chris Angelico
- Eric V. Smith
- lxr1210@mail.ustc.edu.cn
- Patrick Reader
- Petr Viktorin
- Stephen J. Turnbull