On 29/03/2019 01.02, Victor Stinner wrote:
I read the PEP quickly; I'm not sure that I understood it correctly, so here are some early questions, more about the usage of the PEP than its implementation.
This is not sandboxing, as this proposal does not attempt to prevent malicious behavior (though it enables some new options to do so). See the `Why Not A Sandbox`_ section below for further discussion.
I don't understand the overall security model well. If malicious behaviors can still occur, what is the purpose of auditing? For example, if an audit hook writes events into a local log file, the attacker can easily remove this log file, no?
An attacker may not have permission to mess with the auditing subsystem. For example, an attacker may be able to modify an application like a web server or web application, but audit loggers typically run in a different, more protected context. On modern, hardened operating systems, even root / Administrator is not all-powerful; it is typically restricted by additional policies like SELinux.
Furthermore, servers also send auditing data to remote nodes for analysis. Keep in mind that auditing is not primarily about preventing compromises. It's about detecting what, when, who, and how a system was compromised.
Verified Open Hook
Most operating systems have a mechanism to distinguish between files that can be executed and those that can not. For example, this may be an execute bit in the permissions field, a verified hash of the file contents to detect potential code tampering, or file system path restrictions. These are an important security mechanism for preventing execution of data or code that is not approved for a given environment. Currently, Python has no way to integrate with these when launching scripts or importing modules.
In my experience, it doesn't work, simply because Python has too many functions that open files indirectly or call external C libraries which open files.
I vaguely recall an exploit in my pysandbox project which abused the internal Python code that displays a traceback... to read the content of an arbitrary file on disk :-( Game over. I would never have expected that there are so many ways to read a file in Python...
The verified open hook is not about sandboxing. It's a mechanism to prevent a class of attacks like directory traversal attacks. On Linux, the open-for-import hook could refuse access to .py and .pyc files that do not have the user.python_code or root.python_code extended file attribute. This verified open hook could have prevented the compromise of wiki.python.org many years ago.
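A minimal sketch of the approval check described above, assuming a Linux system with extended attribute support. The function name and the decision to treat a missing attribute as "not approved" are illustrative choices, not part of PEP 578; an open-for-import hook would call such a check and refuse to open unapproved files.

```python
import os

# Extended attributes that mark a file as approved Python code
# (attribute names taken from the example above; Linux-only).
APPROVED_XATTRS = ("user.python_code",)

def is_approved_for_import(path):
    """Return True only if the file carries an approval xattr."""
    getxattr = getattr(os, "getxattr", None)
    if getxattr is None:
        return False  # platform without xattr support: nothing is approved
    for attr in APPROVED_XATTRS:
        try:
            getxattr(path, attr)
            return True
        except OSError:
            continue  # attribute missing, or filesystem lacks xattrs
    return False
```

An administrator would set the attribute at deploy time (e.g. `setfattr -n user.python_code -v 1 app.py`), so files dropped by an attacker would fail the check.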
Even when I restricted pysandbox to the bare minimum of the Python language (with no import), multiple exploits were found. Moreover, in the end, Python just became useless.
More generally, there is a lot of code in Python which allows arbitrary code injection :-( (Most of it is now fixed, hopefully!)
I did my best to modify as many functions as possible to implement PEP 446 "Make newly created file descriptors non-inheritable", but I know that *many* functions call open() or fopen() directly and so create inheritable file descriptors. For example, the Python ssl module takes filenames directly and OpenSSL opens the files itself. It's just one example.
You will never be able to cover all cases.
I agree. But don't draw the wrong conclusion from your statement. PEP 578 adds hooks for auditing, which in turn can be used to harden and log an application. Unlike secure sandboxing, it doesn't have to be perfect. Alex Gaynor summed this up in his blog post https://alexgaynor.net/2018/jul/20/worst-truism-in-infosec/
Having a single function which allows opening an arbitrary file without triggering an audit event would defeat the whole purpose of auditing, no? Again, maybe I didn't understand the overall purpose of the PEP well, sorry.
This case can be detected during the development and QE phases. You simply have to count the number of open syscalls and compare it to the number of open audit events.
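The event-counting half of that comparison can be sketched with `sys.addaudithook` (Python 3.8+, where the built-in `open()` raises an `"open"` audit event); the syscall side would come from a tool like strace. The hook below is a minimal illustration, not a complete QE harness.

```python
import sys

open_events = 0

def counting_hook(event, args):
    # Count every "open" audit event raised by the runtime; during QE,
    # compare this total against strace's count of open/openat syscalls.
    global open_events
    if event == "open":
        open_events += 1

sys.addaudithook(counting_hook)

# Any open() call after the hook is installed is now counted.
with open(__file__, "rb") as f:
    f.read(1)

print(open_events)
```

A large gap between the two totals would point at code paths (e.g. C libraries) that open files without raising the event.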
The important performance impact is the case where events are being raised but there are no hooks attached. This is the unavoidable case - once a developer has added audit hooks they have explicitly chosen to trade performance for functionality.
(The Linux kernel uses advanced tooling to inject hooks: there is no performance impact when no hook is used. The machine code of functions is patched to inject a hook. Impressive stuff :-))
Here I expect a small overhead. But the global overhead will be proportional to the number of hooks, no? Maybe it's not significant with the proposed list of events, but it would be more significant with 100 or 1000 events?
I'm not saying that it's a blocker issue, I'm just thinking aloud to make sure that I understood correctly :-)
The performance impact can be remedied and reduced with a simple check: if there is no audit hook installed, it's just a matter of a pointer deref + JNZ.
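A Python-level analog of that fast path, for illustration only (CPython does the equivalent in C, where the empty check compiles down to the pointer deref + conditional jump mentioned above; the names here are made up):

```python
# Registered hooks; empty by default.
_hooks = []

def add_hook(hook):
    _hooks.append(hook)

def audit(event, *args):
    # The cheap fast path: with no hooks installed, raising an event
    # costs only this one truthiness check before returning.
    if not _hooks:
        return
    for hook in _hooks:
        hook(event, args)
```

So applications that never install a hook pay essentially nothing per event, regardless of how many event sites exist.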