[Python-Dev] PEP 578: Python Runtime Audit Hooks

Fri Mar 29 06:24:41 EDT 2019

On 29/03/2019 01.02, Victor Stinner wrote:
> Hi,
> 
> I read quickly the PEP, I'm not sure that I understood it correctly,
> so here are some early questions more about the usage of the PEP, than
> its implementation.
> 
>> This is not sandboxing, as this proposal does not attempt to prevent
>> malicious behavior (though it enables some new options to do so).
>> See the `Why Not A Sandbox`_ section below for further discussion.
> 
> I don't understand well the overall security model. If malicious
> behaviors can still occur, what is the purpose of auditing? For
> example, if an audit hook writes events into a local log file, the
> attacker can easily remove this log file, no?

An attacker may not have permission to tamper with the auditing
subsystem. For example, an attacker may be able to modify an application
such as a web server or web application, but audit loggers typically run
in a different, more protected context. On modern, hardened operating
systems even root / Administrator are not all-powerful. They are
typically restricted by additional policies like SELinux.

Furthermore, servers also send auditing data to remote nodes for
analysis. Keep in mind that auditing is not primarily about preventing
compromises. It's about detecting what was compromised, when, by whom,
and how.
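
As a rough illustration of the logging side, here is a minimal sketch
that assumes the sys.addaudithook() API proposed in the PEP (available
from Python 3.8). The logger name, the ListHandler class and the set of
watched events are placeholders of mine, not part of the PEP. A real
deployment would replace the in-memory handler with one that ships
records to a protected or remote collector.

```python
import logging
import sys


class ListHandler(logging.Handler):
    """In-memory stand-in for a handler that forwards to a remote node."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(record.getMessage())


def make_audit_hook(logger, watched):
    """Return a hook that logs the watched events and ignores the rest."""
    busy = False  # guards against re-entry if logging itself raises events

    def hook(event, args):
        nonlocal busy
        if busy or event not in watched:
            return
        busy = True
        try:
            logger.info("%s %r", event, args)
        finally:
            busy = False

    return hook


audit_logger = logging.getLogger("audit-sketch")  # hypothetical logger name
audit_logger.setLevel(logging.INFO)
audit_logger.propagate = False
collector = ListHandler()
audit_logger.addHandler(collector)

# Hooks cannot be removed once added, so install exactly once at startup.
sys.addaudithook(make_audit_hook(audit_logger, frozenset({"open", "exec"})))
```

The hook only records events. It makes no attempt to block anything,
which matches the "detect, not prevent" goal described above.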

>> Verified Open Hook
>> ------------------
>>
>> Most operating systems have a mechanism to distinguish between files
>> that can be executed and those that can not. For example, this may be an
>> execute bit in the permissions field, a verified hash of the file
>> contents to detect potential code tampering, or file system path
>> restrictions. These are an important security mechanism for preventing
>> execution of data or code that is not approved for a given environment.
>> Currently, Python has no way to integrate with these when launching
>> scripts or importing modules.
> 
> In my experience, it doesn't work just because Python has too many
> functions opening files indirectly or call external C libraries which
> open files.
> 
> I vaguely recall an exploit in my pysandbox project which uses the
> internal code of Python which displays a traceback... to read the
> content of an arbitrary file on the disk :-( Game over. I would never
> expect that there are so many ways to read a file in Python...

The verified open hook is not about sandboxing. It's a mechanism to
prevent a class of attacks like directory traversal attacks. On Linux,
the open-for-import hook could refuse access to .py and .pyc files that
do not have the user.python_code or root.python_code extended file
attribute. This verified open hook could have prevented the compromise
of wiki.python.org many years ago.
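
To make the idea concrete, here is a minimal sketch of the policy
decision only. In the PEP the hook itself is installed from C; the
Python functions below (is_approved and verified_open are names I made
up) just model the check. It assumes Linux extended attributes via
os.getxattr() and treats a missing or unsupported attribute as "not
approved".

```python
import os

# Attribute names taken from the paragraph above.
APPROVED_ATTRS = ("user.python_code", "root.python_code")


def is_approved(target):
    """Return True if the file (path or file descriptor) carries an approval attribute."""
    if not hasattr(os, "getxattr"):  # os.getxattr() is only available on Linux
        return False
    for name in APPROVED_ATTRS:
        try:
            os.getxattr(target, name)
            return True
        except OSError:
            continue  # attribute missing or not supported by this filesystem
    return False


def verified_open(path):
    """Open a code file for reading, refusing files that are not marked."""
    fh = open(path, "rb")
    # Check the already opened descriptor so the file cannot be swapped
    # between the check and the open.
    if not is_approved(fh.fileno()):
        fh.close()
        raise PermissionError(f"{path!r} is not marked as approved Python code")
    return fh
```

A file dropped into an upload directory by an attacker would not carry
the attribute, so an import of it would fail instead of executing.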

> Even when I restricted pysandbox to the bare minimum of the Python
> language (with no import), multiple exploits have been found.
> Moreover, at the end, Python just became useful.
> 
> More generally, there are a lot of codes in Python which allow
> arbitrary code injection :-( (Most of them are now fixed, hopefully!)
> 
> I did my best to modify as much functions as possible to implement the
> PEP 446 "Make newly created file descriptors non-inheritable", but I
> know that *many* functions call directly open() or fopen() and so
> create inheritable file descriptors. For example, the Python ssl
> module takes directly filenames and OpenSSL open directly files. It's
> just one example.
> 
> You will never be able to cover all cases.

I agree, but don't draw the wrong conclusion from that statement. PEP 578
adds hooks for auditing, which in turn can be used to harden and log
an application. Unlike secure sandboxing, it doesn't have to be perfect.
Alex Gaynor summed this up in his blog post
https://alexgaynor.net/2018/jul/20/worst-truism-in-infosec/

> Having a single function which allows to open an arbitrary file
> without triggering an audit event would defeat the whole purpose of
> auditing, no? Again, maybe I didn't understand well the overall
> purpose of the PEP, sorry.

This case can be detected during the development and QE phases. You
simply have to count the number of open syscalls and compare it to the
number of open auditing events.
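
The audit-event half of that comparison could look like the sketch
below. It assumes sys.addaudithook() and the "open" event name from the
PEP; OpenCounter is a hypothetical helper. The syscall half would come
from an external tool such as strace and is not shown.

```python
import sys
from contextlib import contextmanager


class OpenCounter:
    """Counts "open" audit events while counting is enabled."""

    def __init__(self):
        self.enabled = False
        self.count = 0

    def hook(self, event, args):
        if self.enabled and event == "open":
            self.count += 1

    @contextmanager
    def counting(self):
        """Reset the counter and count events for the duration of the block."""
        self.count = 0
        self.enabled = True
        try:
            yield self
        finally:
            self.enabled = False


counter = OpenCounter()
# Hooks are permanent; the enabled flag keeps this one idle outside tests.
sys.addaudithook(counter.hook)
```

In a test run you would wrap the code under test in
"with counter.counting() as c:" and compare c.count with the number of
open/openat syscalls seen for the same code. A syscall count that is
higher than the event count points at a code path that opens files
without raising an audit event.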

>> The important performance impact is the case where events are being
>> raised but there are no hooks attached. This is the unavoidable case -
>> once a developer has added audit hooks they have explicitly chosen to
>> trade performance for functionality.
> 
> (The Linux kernel uses advance tooling to inject hooks: it has no
> impact on performances when no hook is used. Machine code of functions
> is patched to inject a hook. Impressive stuff :-))
> 
> Here I expect a small overhead. But the global overhead will be
> proportional to the number of hooks, no? Maybe it's not significant
> with the proposed list of events, but it will be more significant with
> 100 or 1000 events?
> 
> I'm not saying that it's a blocker issue, I'm just thinking aloud to
> make sure that I understood correctly :-)

The performance impact can be reduced to almost nothing with a simple
check. If there is no audit hook installed, the cost is just a pointer
deref + JNZ.

Christian