PEP: Hide implementation details in the C API
Hi,

This is the first draft of a big (?) project to prepare CPython to be able to "modernize" its implementation. The proposed changes should make it possible to make CPython more efficient in the future. The optimizations themselves are out of the scope of the PEP, but some examples are listed to explain why these changes are needed.

For the background, see also my talk at the previous Python Language Summit at PyCon US, Portland OR:

"Keeping Python competitive"
https://lwn.net/Articles/723752/#723949

"Python performance", slides (PDF):
https://github.com/haypo/conf/raw/master/2017-PyconUS/summit.pdf

Since this is really the first draft, I didn't assign a PEP number to it yet. I prefer to wait for a first round of feedback.

Victor

PEP: xxx
Title: Hide implementation details in the C API
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>,
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 31-May-2017

Abstract
========

Modify the C API to remove implementation details. Add an opt-in option to compile C extensions to get the old full API with implementation details.

The modified C API makes it easier to experiment with new optimizations:

* Indirect Reference Counting
* Remove Reference Counting, New Garbage Collector
* Remove the GIL
* Tagged pointers

Reference counting may be emulated in a future implementation for backward compatibility.

Rationale
=========

History of CPython forks
------------------------

Over the last 10 years, CPython was forked multiple times to attempt different CPython enhancements:

* Unladen Swallow: add a JIT compiler based on LLVM
* Pyston: add a JIT compiler based on LLVM (CPython 2.7 fork)
* Pyjion: add a JIT compiler based on Microsoft CLR
* Gilectomy: remove the Global Interpreter Lock nicknamed the "GIL"
* etc.

Sadly, none of these projects has been merged back into CPython. Unladen Swallow lost its funding from Google, Pyston lost its funding from Dropbox, and Pyjion is developed in the limited spare time of two Microsoft employees.

One hard technical issue which prevented these projects from really unleashing their power is the C API of CPython.

Many old technical choices of CPython are hardcoded in this API:

* reference counting
* garbage collector
* C structures like PyObject which contain headers for reference counting and the garbage collector
* specific memory allocators
* etc.

PyPy
----

PyPy uses more efficient structures and a more efficient garbage collector without reference counting. Thanks to that (but also many other optimizations), PyPy succeeds in running Python code up to 5x faster than CPython.

Plan made of multiple small steps
=================================

Step 1: split Include/ into subdirectories
------------------------------------------

Split the ``Include/`` directory of CPython:

* ``python`` API: ``Include/Python.h`` remains the default C API
* ``core`` API: ``Include/core/Python.h`` is a new C API designed for building Python
* ``stable`` API: ``Include/stable/Python.h`` is the stable ABI

Expect declarations to be duplicated on purpose: ``#include`` should not be used to include files from a different API, to prevent mistakes. In the past, too many functions were exposed *by mistake*, especially symbols exported to the stable ABI by mistake.

At this point, ``Include/Python.h`` is not changed at all: zero risk of backward incompatibility.

The ``core`` API is the most complete API, exposing *all* implementation details and using macros for best performance.

XXX should we abandon the stable ABI?
Never really used by anyone.

Step 2: Add an opt-in API option to tools building packages
-----------------------------------------------------------

Modify Python packaging tools (distutils, setuptools, flit, pip, etc.) to add an opt-in option to choose the API: ``python``, ``core`` or ``stable``.

For example, profilers like ``vmprof`` need the ``core`` API to get full access to implementation details.

XXX handle backward compatibility for packaging tools.

Step 3: first pass of implementation detail removal
---------------------------------------------------

Modify the ``python`` API:

* Add a new ``API`` subdirectory in the Python source code which will "implement" the Python C API
* Replace macros with functions. The implementation of new functions will be written in the ``API/`` directory. For example, Py_INCREF() becomes the function ``void Py_INCREF(PyObject *op)`` and its implementation will be written in the ``API`` directory.
* Slowly remove more and more implementation details from this API.

Modifications of this API should be driven by tests of popular third party packages like:

* Django with database drivers
* numpy
* scipy
* Pillow
* lxml
* etc.

Compilation errors on these extensions are expected. This step should help to draw a line for the backward incompatible change.

Goal: remove a few implementation details but don't break numpy and lxml.

Step 4
------

Switch the default API to the new restricted ``python`` API.

Help third party projects to patch their code: don't break the "Python world".

Step 5
------

Continue Step 3: remove even more implementation details.

Long-term goal to complete this PEP: remove *all* implementation details, remove all structures and macros.

Alternative: keep core as the default API
=========================================

A smoother transition would be to not touch the existing API but work on a new API which would only be used as an opt-in option.

A similar plan is used by Gilectomy: an opt-in option to get best performance.

There would be at least two Python binaries per Python version: the default compatible version, and a new faster but incompatible version.

Idea: implementation of the C API supporting old Python versions?
=================================================================

Open questions.

Q: Would it be possible to design an external library which would work on Python 2.7, Python 3.4-3.6, and the future Python 3.7?

Q: Should such a library be linked to libpythonX.Y? Or even to a pythonX.Y binary which wasn't built with a shared library?

Q: Would it be easy to use it? How would it be downloaded and installed to build extensions?

Collaboration with PyPy, IronPython, Jython and MicroPython
===========================================================

XXX to be done

Enhancements becoming possible thanks to a new C API
====================================================

Indirect Reference Counting
---------------------------

* Replace ``Py_ssize_t ob_refcnt;`` (integer) with ``Py_ssize_t *ob_refcnt;`` (pointer to an integer).
* Same change for GC headers?
* Store all reference counters in a separate memory block (or maybe multiple memory blocks)

Expected advantage: smaller memory footprint when using fork() on UNIX, which is implemented with copy-on-write on physical memory pages.

See also `Dismissing Python Garbage Collection at Instagram <https://engineering.instagram.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172>`_.
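A minimal sketch of the idea, purely illustrative (the struct layout and function body are assumptions, not a proposed design)::

    /* PyObject with an indirect reference counter: the counter lives in
       a separate memory block, the object only stores a pointer to it. */
    typedef struct _object {
        Py_ssize_t *ob_refcnt;        /* was: Py_ssize_t ob_refcnt; */
        struct _typeobject *ob_type;
    } PyObject;

    /* With Py_INCREF() as a function rather than a macro, extensions no
       longer compile in any knowledge of where the counter is stored. */
    void Py_INCREF(PyObject *op)
    {
        (*op->ob_refcnt)++;
    }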
Remove Reference Counting, New Garbage Collector
------------------------------------------------

If the new C API hides all implementation details well, it becomes possible to change fundamental features like how CPython tracks the lifetime of an object.

* Remove ``Py_ssize_t ob_refcnt;`` from the PyObject structure
* Replace the current XXX garbage collector with a new tracing garbage collector
* Use new macros to define a variable storing an object and to set the value of an object
* Reimplement Py_INCREF() and Py_DECREF() on top of that using a hash table: object => reference counter.

XXX PyPy is only partially successful on that project, cpyext remains very slow.

XXX Would it require an opt-in option to really limit backward compatibility?

Remove the GIL
--------------

* Don't remove the GIL, but replace the GIL with smaller locks
* Builtin mutable types: list, set, dict
* Modules
* Classes
* etc.

Backward compatibility:

* Keep the GIL

Tagged pointers
---------------

https://en.wikipedia.org/wiki/Tagged_pointer

A common optimization, especially used for "small integers".

The current C API doesn't allow implementing tagged pointers.

Tagged pointers are used in MicroPython to reduce the memory footprint.

Note: ARM64 recently extended its address space to 48 bits, causing an issue in LuaJIT: `47 bit address space restriction on ARM64 <https://github.com/LuaJIT/LuaJIT/issues/49>`_.

Misc ideas
----------

* Software Transactional Memory? See `PyPy STM <http://doc.pypy.org/en/latest/stm.html>`_

Idea: Multiple Python binaries
==============================

Instead of a single ``python3.7``, providing two or more binaries, as PyPy does, would make it easier to experiment with changes without breaking backward compatibility.

For example, ``python3.7`` would remain the default binary with reference counting and the current garbage collector, whereas ``fastpython3.7`` would not use reference counting and would use a new garbage collector.

It would allow breaking backward compatibility more quickly and would make it even more explicit that only prepared C extensions will be compatible with the new ``fastpython3.7``.

cffi
====

XXX

Long term goal: "more cffi, less libpython".

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
On 11 July 2017 at 20:19, Victor Stinner <victor.stinner@gmail.com> wrote:
Hi,
This is the first draft of a big (?) project to prepare CPython to be able to "modernize" its implementation. The proposed changes should make it possible to make CPython more efficient in the future. The optimizations themselves are out of the scope of the PEP, but some examples are listed to explain why these changes are needed.
Please don't use the word "needed" for speed increases, as we can just as well diagnose the problem with the status quo as "The transition from writing and publishing pure Python modules to writing and publishing pre-compiled accelerated modules in Cython, C, C++, Rust, or Go is too hard, so folks mistakenly think they need to rewrite their whole application in something else, rather than just selectively replacing key pieces of it".

After all, we already *have* an example of a breakout success for writing Python applications that rival C and FORTRAN applications for raw speed: Cython. By contrast, most of the alternatives that have attempted to make Python faster without forcing users to give up on some of the language's dynamism in the process have been plagued by compatibility challenges and found themselves needing to ask for the language's runtime semantics to be relaxed in one way or another.

That said, trying to improve how we manage the distinction between the public API and the interpreter's internal APIs is still an admirable goal, and it would be *great* to have CPython natively provide the public API that cffi relies on (so that other projects could also effectively target it), so my main request is just to rein in the dramatic rhetoric and start by exploring how many of the benefits you'd like can be obtained *without* a hard compatibility break.

While the broad C API is one of CPython's greatest strengths that enabled the Python ecosystem to become the powerhouse that it is, it is *also* a pain to maintain consistently, *and* it poses problems for some technical experiments various folks would like to carry out. Those kinds of use cases are more than enough to justify changes to the way we manage our public header files - you don't need to dress it up in "sky is falling" rhetoric founded in the fear of other programming languages.

Yes, Python is a nice language to program in, and it's great that we can get jobs where we can get paid to program in it. That doesn't mean we have to treat the fact that we aren't always going to be the best choice for everything as an existential threat :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 11 July 2017 at 11:19, Victor Stinner <victor.stinner@gmail.com> wrote:
XXX should we abandon the stable ABI? Never really used by anyone.
Please don't. On Windows, embedding Python is a pain because a new version of Python requires a recompile (which isn't ideal for apps that just want to optionally allow Python scripting, for example).

Also, the recent availability of the embedded distribution on Windows has opened up some opportunities and I've been using the stable ABI there.

It's not the end of the world if we lose it, but I'd rather see it retained (or better still, enhanced).

Paul
Step 3: first pass of implementation detail removal ---------------------------------------------------
Modify the ``python`` API:
* Add a new ``API`` subdirectory in the Python source code which will "implement" the Python C API
* Replace macros with functions. The implementation of new functions will be written in the ``API/`` directory. For example, Py_INCREF() becomes the function ``void Py_INCREF(PyObject *op)`` and its implementation will be written in the ``API`` directory.
* Slowly remove more and more implementation details from this API.
When I discussed this issue with Serhiy Storchaka, he didn't see the purpose of the API directory.

I started to implement the PEP in my "capi2" fork of CPython:
https://github.com/haypo/cpython/tree/capi2

See https://github.com/haypo/cpython/tree/capi2/API for examples of C code to "implement the C API".

Just one example: the macro

    #define PyUnicode_IS_READY(op) (((PyASCIIObject*)op)->state.ready)

is replaced with a function:

    int PyUnicode_IS_READY(const PyObject *op)
    { return ((PyASCIIObject*)op)->state.ready; }

So the header file doesn't have to expose the PyASCIIObject, PyCompactUnicodeObject and PyUnicodeObject structures. I was already able to remove the PyUnicodeObject structure without breaking the C extensions of the stdlib.

I don't want to pollute Objects/unicodeobject.c with such "wrapper" functions. In the future, the implementation of API/ can evolve a lot.

Victor
On Tue, Jul 11, 2017 at 4:19 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
Step 1: split Include/ into subdirectories ------------------------------------------
Split the ``Include/`` directory of CPython:
* ``python`` API: ``Include/Python.h`` remains the default C API
* ``core`` API: ``Include/core/Python.h`` is a new C API designed for building Python
* ``stable`` API: ``Include/stable/Python.h`` is the stable ABI
Expect declarations to be duplicated on purpose: ``#include`` should not be used to include files from a different API, to prevent mistakes. In the past, too many functions were exposed *by mistake*, especially symbols exported to the stable ABI by mistake.
At this point, ``Include/Python.h`` is not changed at all: zero risk of backward incompatibility.
The ``core`` API is the most complete API, exposing *all* implementation details and using macros for best performance.
FWIW, this is similar to something I've done while working on gathering up the CPython global runtime state. [1] I needed to share some internal details across compilation modules. Nick suggested a separate Include/internal directory for header files containing "private" API. There is a _Python.h file there that starts with:

    #ifndef Py_BUILD_CORE
    #error "Internal headers are not available externally."
    #endif

In Include/Python.h, Include/internal/_Python.h gets included if Py_BUILD_CORE is defined.

This approach makes a strict boundary that keeps internal details out of the public API. That way we don't accidentally leak private API. It sounds similar to this part of your proposal (adding the "core" API).

-eric

[1] http://bugs.python.org/issue30860
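A minimal sketch of the conditional include described above (the exact file contents are an assumption based on this description, not the actual code):

    /* Include/Python.h -- illustrative fragment */
    #ifdef Py_BUILD_CORE
    #include "internal/_Python.h"   /* private headers, core builds only */
    #endif
    /* ... the public API follows ... */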
This is a great idea. The suggestions in your first draft would help clean up some of the uglier corners of the PyCXX code.

I'd suggest that you might want to add at least one PyCXX-based extension to your testing. PyCXX aims to expose all of the C API as C++ classes. (I'm missing the class variable support, as you discussed on this list a while ago.)

I'd be happy to support your API in PyCXX.

Barry
Commenting more on specific technical details rather than just tone this time :) On 11 July 2017 at 20:19, Victor Stinner <victor.stinner@gmail.com> wrote:
PEP: xxx Title: Hide implementation details in the C API Version: $Revision$ Last-Modified: $Date$ Author: Victor Stinner <victor.stinner@gmail.com>, Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 31-May-2017
Abstract ========
Modify the C API to remove implementation details. Add an opt-in option to compile C extensions to get the old full API with implementation details.
The modified C API makes it easier to experiment with new optimizations:
* Indirect Reference Counting
* Remove Reference Counting, New Garbage Collector
* Remove the GIL
* Tagged pointers
Reference counting may be emulated in a future implementation for backward compatibility.
I don't believe this is the best rationale to use for the PEP, as we (or at least I) have emphatically promised *not* to do another Python 3 style compatibility break, and we know from PyPy's decade of challenges that a lot of Python's users care even more about CPython C API/ABI compatibility than they do about the core data model.

It also has the downside of not really being true, since *other implementations* are happily experimenting with alternative approaches, and projects like PyMetabiosis attempt to use CPython itself as an adapter between other runtimes and the full C API for those extension modules that need it.

What is unequivocally true though is that in the current C API:

1. We're not sure which APIs other projects (including extension module generators and helper libraries like Cython, Boost, PyCXX, SWIG, cffi, etc) are *actually* relying on.
2. It's easy for us to accidentally expand the public C API without thinking about it, since Py_BUILD_CORE guards are opt-in and Py_LIMITED_API guards are opt-out
3. We haven't structured our header files in a way that makes it obvious at a glance which API we're modifying (internal API, public API, stable ABI)
Rationale =========
History of CPython forks ------------------------
Over the last 10 years, CPython was forked multiple times to attempt different CPython enhancements:
* Unladen Swallow: add a JIT compiler based on LLVM
* Pyston: add a JIT compiler based on LLVM (CPython 2.7 fork)
* Pyjion: add a JIT compiler based on Microsoft CLR
* Gilectomy: remove the Global Interpreter Lock nicknamed the "GIL"
* etc.
Sadly, none of these projects has been merged back into CPython. Unladen Swallow lost its funding from Google, Pyston lost its funding from Dropbox, and Pyjion is developed in the limited spare time of two Microsoft employees.
One hard technical issue which prevented these projects from really unleashing their power is the C API of CPython.
This is a somewhat misleadingly one-sided presentation of Python's history, as the broad access to CPython internals offered by the C API is precisely what *enabled* the scientific Python stack (including NumPy, SciPy, Pandas, scikit-learn, Cython, Numba, PyCUDA, etc) to develop largely independently of CPython itself.

So for folks that are willing to embrace the use of Cython (and extension modules in general), many of CPython's runtime limitations (like the GIL and the overheads of working with boxed values) can already be avoided by pushing particular sections of code closer to C semantics than they are to traditional Python semantics.

We've also been working to bring the runtime semantics of extension modules ever closer to those of pure Python modules, to the point where Python 3.7 is likely to be able to run an extension module as __main__ (see https://www.python.org/dev/peps/pep-0547/ for details)
Many old technical choices of CPython are hardcoded in this API:
* reference counting
* garbage collector
* C structures like PyObject which contain headers for reference counting and the garbage collector
* specific memory allocators
* etc.
PyPy ----
PyPy uses more efficient structures and a more efficient garbage collector without reference counting. Thanks to that (but also many other optimizations), PyPy succeeds in running Python code up to 5x faster than CPython.
This framing makes it look a bit like you're saying "It's hard for PyPy to correctly emulate these aspects of CPython, so we should eliminate them as a barrier to adoption for PyPy by breaking them for currently happy CPython users as well". I don't think that's really a framing you want to run with in the near term, as it's going to start a needless fight, when there's plenty of unambiguously beneficial work that could be done before anyone starts contemplating any kind of API compatibility break :)

In particular, better segmenting our APIs into "solely for CPython's internal use", "ABI is specific to a CPython version", "API is portable across Python implementations", "ABI is portable across CPython versions (and maybe even Python implementations)" allows tooling developers and extension module authors to make more informed decisions about how closely they want to couple their work to CPython specifically.

And then *after* we've done that API clarification work, *then* we can ask the question about what the default behaviour of "#include <Python.h>" should be, and perhaps introduce an opt-in Py_CPYTHON_API flag to request access to the full traditional C API for extension modules and embedding applications that actually need it. (While that's still a compatibility break, it's one that can be trivially resolved by putting an unconditional "#define Py_CPYTHON_API" before the Python header inclusion for projects that find they were actually relying on CPython specifics.)
Plan made of multiple small steps =================================
Step 1: split Include/ into subdirectories ------------------------------------------
Split the ``Include/`` directory of CPython:
* ``python`` API: ``Include/Python.h`` remains the default C API
* ``core`` API: ``Include/core/Python.h`` is a new C API designed for building Python
* ``stable`` API: ``Include/stable/Python.h`` is the stable ABI
Expect declarations to be duplicated on purpose: ``#include`` should not be used to include files from a different API, to prevent mistakes. In the past, too many functions were exposed *by mistake*, especially symbols exported to the stable ABI by mistake.
At this point, ``Include/Python.h`` is not changed at all: zero risk of backward incompatibility.
The ``core`` API is the most complete API, exposing *all* implementation details and using macros for best performance.
This part I like, although as Eric noted, we can avoid making wholesale changes to the headers of our implementation files by putting a Py_BUILD_CORE guard around the inclusion of a "Include/core/_CPython.h" header from "Include/Python.h"
XXX should we abandon the stable ABI? Never really used by anyone.
It's also not available in Python 2.7, so anyone straddling the 2/3 boundary isn't currently able to rely on it. As folks become more willing to drop Python 2.7 support, expending the effort to start targeting the stable ABI becomes more attractive (especially for extension module creation tools like Cython, cffi, and SWIG), since the stable ABI usage can *replace* the code that uses the traditional CPython API.
Step 2: Add an opt-in API option to tools building packages -----------------------------------------------------------
Modify Python packaging tools (distutils, setuptools, flit, pip, etc.) to add an opt-in option to choose the API: ``python``, ``core`` or ``stable``.
For example, profilers like ``vmprof`` need the ``core`` API to get full access to implementation details.
XXX handle backward compatibility for packaging tools.
For handcoded extensions, defining which API to use would be part of the C/C++ code. For generated extensions, it would be an option passed to Cython, cffi, etc. Packaging frontends shouldn't need to explicitly support it any more than they explicitly support the stable ABI today.
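To illustrate, this mirrors how opting in to the stable ABI already works today; the version value below is just an example:

    /* In the extension's C code, before including any Python header: */
    #define Py_LIMITED_API 0x03050000   /* opt in to the stable ABI (3.5+) */
    #include <Python.h>

A new ``python``/``core``/``stable`` selector could presumably be another such define, so build frontends would never need to know about it.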
Step 3: first pass of implementation detail removal ---------------------------------------------------
Modify the ``python`` API:
* Add a new ``API`` subdirectory in the Python source code which will "implement" the Python C API
* Replace macros with functions. The implementation of new functions will be written in the ``API/`` directory. For example, Py_INCREF() becomes the function ``void Py_INCREF(PyObject *op)`` and its implementation will be written in the ``API`` directory.
* Slowly remove more and more implementation details from this API.
I'd suggest doing this slightly differently by ensuring that the APIs are defined as strict supersets of each other as follows:

1. CPython internal APIs (Py_BUILD_CORE)
2. CPython C API (status quo, currently no qualifier)
3. Portable Python API (new, starts as equivalent to stable ABI)
4. Stable Python ABI (Py_LIMITED_API)

The two new qualifiers would then be:

    #define Py_CPYTHON_API
    #define Py_PORTABLE_API

And Include/Python.h would end up looking something like this:

    [Common configuration includes would still go here]

    #ifdef Py_BUILD_CORE
    #include "core/_CPython.h"
    #else
    #ifdef Py_LIMITED_API
    #include "stable/Python.h"
    #else
    #ifdef Py_PORTABLE_API
    #include "portable/Python.h"
    #else
    #define Py_CPYTHON_API
    #include "cpython/Python.h"
    #endif
    #endif
    #endif

At some future date, the default could then potentially switch to being the portable API for the current Python version, with folks having to opt in to using either the full CPython API or the portable API for an older version.

To avoid having to duplicate prototype definitions, and to ensure that C compilers complain when we inadvertently redefine a symbol differently from the way a more restricted API defines it, each API superset would start by including the next narrower API. So we'd have this:

Include/stable/Python.h:

    [No special preamble, as it's the lowest common denominator API]

Include/portable/Python.h:

    #define Py_LIMITED_API Py_PORTABLE_API
    #include "../stable/Python.h"
    #undef Py_LIMITED_API
    [Any desired API additions and overrides]

Include/cpython/Python.h:

    #include "../patchlevel.h"
    #define Py_PORTABLE_API PY_VERSION_HEX
    #include "../portable/Python.h"
    #undef Py_PORTABLE_API
    [Include the rest of the current public C API]

Include/core/_CPython.h:

    #ifndef Py_BUILD_CORE
    #error "Internal headers are only available when building CPython"
    #endif
    #include "../cpython/Python.h"
    [Include the rest of the internal C API]

And at least initially, the subdirectories would be mostly empty - instead, we'd have the following setup:

1. Unported headers would remain directly in "Include/" and be included from "Include/Python.h"
2. Ported headers would have their contents split between core, cpython, and stable based on their #ifdef chains
3. When porting, the more expansive APIs would use "#undef" as needed when overriding a symbol deliberately

And then, once all the APIs had been clearly categorised in a way that C compilers can better help us manage, the folks that were interested in this could start building key extension modules (such as NumPy and lxml) using "Py_PORTABLE_API=0x03070000", and *adding* to the portable API on an explicitly needs-driven basis.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 7/11/2017 11:30 PM, Nick Coghlan wrote:
Commenting more on specific technical details rather than just tone this time :)
On 11 July 2017 at 20:19, Victor Stinner <victor.stinner@gmail.com> wrote:
Reference counting may be emulated in a future implementation for backward compatibility.
One heavy user's experience with garbage collection:
https://engineering.instagram.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172

"Instagram can run 10% more efficiently. ... Yes, you heard it right! By disabling GC [and relying only on ref counting], we can reduce the memory footprint and improve the CPU LLC cache hit ratio"

It turned out that gc.disable() was inadequate because imported libraries could turn it on, and one did.
I don't believe this is the best rationale to use for the PEP, as we (or at least I) have emphatically promised *not* to do another Python 3 style compatibility break, and we know from PyPy's decade of challenges that a lot of Python's users care even more about CPython C API/ABI compatibility than they do the core data model.
[snip most] -- Terry Jan Reedy
On 11 Jul 2017, at 12:19, Victor Stinner <victor.stinner@gmail.com> wrote:
Hi,
This is the first draft of a big (?) project to prepare CPython to be able to "modernize" its implementation. The proposed changes should make it possible to make CPython more efficient in the future. The optimizations themselves are out of the scope of the PEP, but some examples are listed to explain why these changes are needed.
I’m not sure if hiding implementation details will help a lot w.r.t. making CPython more efficient, but cleaning up the public API would avoid accidentally depending on non-public information (and is sound engineering anyway). That said, a lot of care should be taken to avoid breaking existing extensions as the ease of writing extensions is one of the strong points of CPython.
Plan made of multiple small steps =================================
Step 1: split Include/ into subdirectories ------------------------------------------
Split the ``Include/`` directory of CPython:
* ``python`` API: ``Include/Python.h`` remains the default C API
* ``core`` API: ``Include/core/Python.h`` is a new C API designed for building Python
* ``stable`` API: ``Include/stable/Python.h`` is the stable ABI
Looks good in principle. It is currently too easy to accidentally add to the stable ABI by forgetting to add ‘#if’ guards around a non-stable API.
Expect declarations to be duplicated on purpose: ``#include`` should not be used to include files from a different API, to prevent mistakes. In the past, too many functions were exposed *by mistake*, especially symbols exported to the stable ABI by mistake.
Not sure about this, shouldn’t it be possible to have ``python`` include ``core`` and ``core`` include ``stable``? This would avoid having to update multiple header files when adding new definitions.
At this point, ``Include/Python.h`` is not changed at all: zero risk of backward incompatibility.
The ``core`` API is the most complete API, exposing *all* implementation details and using macros for best performance.
XXX should we abandon the stable ABI? Never really used by anyone.
Assuming that’s true, has anyone looked into why it is barely used? If I had to guess, it’s due to inertia.
Step 3: first pass of implementation detail removal ---------------------------------------------------
Modify the ``python`` API:
* Add a new ``API`` subdirectory in the Python source code which will "implement" the Python C API
* Replace macros with functions. The implementation of new functions will be written in the ``API/`` directory. For example, Py_INCREF() becomes the function ``void Py_INCREF(PyObject *op)`` and its implementation will be written in the ``API`` directory.
In this particular case (Py_INCREF/DECREF) making them functions isn’t really useful and is likely to be harmful for performance. It is not useful because these macros manipulate state in a struct that must be public because that struct is included into the structs for custom objects (PyObject_HEAD). Having them as macros also doesn’t preclude moving to indirect reference counts. Moving to anything that isn’t reference counts likely needs changes to the API (but not necessarily, see PyPy’s cpyext).
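A concrete sketch of the constraint being described here (the custom fields are made up for illustration):

    #include <Python.h>

    /* PyObject_HEAD pastes a full PyObject at the start of every custom
       object type, so PyObject's layout -- including ob_refcnt -- is baked
       into compiled extensions. */
    typedef struct {
        PyObject_HEAD
        double value;   /* illustrative extension-specific field */
    } MyObject;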
* Slowly remove more and more implementation details from this API.
Modifications of this API should be driven by tests of popular third party packages like:
* Django with database drivers
* numpy
* scipy
* Pillow
* lxml
* etc.
Compilation errors on these extensions are expected. This step should help to draw a line for the backward incompatible change.
This could also help to find places where the documented API is not sufficient. One of the places where I poke directly into implementation details is a C-level subclass of str (PyUnicode_Type). I’d prefer not doing that, but AFAIK there is no other way to be string-like to the C API other than by being a subclass of str. BTW. The reason I need to subclass str: in PyObjC I use a subclass of str to represent Objective-C strings (NSString/NSMutableString), and I need to keep track of the original value; mostly because there are some Objective-C APIs that use object identity. The worst part is that fully initialising the PyUnicodeObject fields often isn’t necessary as a lot of Objective-C strings aren’t used as strings in Python code.
Enhancements becoming possible thanks to a new C API ====================================================
Indirect Reference Counting ---------------------------
* Replace ``Py_ssize_t ob_refcnt;`` (integer) with ``Py_ssize_t *ob_refcnt;`` (pointer to an integer).
* Same change for GC headers?
* Store all reference counters in a separate memory block (or maybe multiple memory blocks)
This could be done right now with a minimal change to the API: just make the ob_refcnt and ob_type fields of the PyObject struct private by renaming them; in Py3 the documented way to access these fields is through function macros, and these could be changed to do indirect refcounting instead.
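A sketch of that minimal change, assuming the fields are renamed and the existing accessor macros are kept (the private names here are hypothetical):

    /* Illustrative only: extensions can no longer name the fields
       directly, but the documented accessors keep working. */
    typedef struct _object {
        Py_ssize_t *_ob_refcnt_ptr;    /* hypothetical private name */
        struct _typeobject *_ob_type;  /* hypothetical private name */
    } PyObject;

    /* Only the macro bodies change, not the API: */
    #define Py_REFCNT(ob) (*((PyObject *)(ob))->_ob_refcnt_ptr)
    #define Py_TYPE(ob)   (((PyObject *)(ob))->_ob_type)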
Tagged pointers ---------------
https://en.wikipedia.org/wiki/Tagged_pointer
Common optimization, especially used for "small integers".
The current C API doesn't allow implementing tagged pointers.
Why not? Thanks to Py_TYPE and Py_INCREF/Py_DECREF it should be possible to use tagged pointers without major changes to the API (also: see above).
Tagged pointers are used in MicroPython to reduce the memory footprint.
Note: ARM64 recently extended its address space to 48 bits, causing an issue in LuaJIT: `47 bit address space restriction on ARM64 <https://github.com/LuaJIT/LuaJIT/issues/49>`_.
That shouldn’t be a problem when only using the least significant bits as tag bits (those bits that are known to be zero in untagged pointers due to alignment).
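For reference, the classic low-bit tagging scheme being alluded to here (a generic sketch, not anything CPython implements):

    #include <stdint.h>

    /* Object pointers are at least 2-byte aligned, so bit 0 is always
       zero for a real pointer and can mark a tagged small integer. */
    #define TAG_INT 0x1u

    static inline int is_tagged_int(void *p) {
        return ((uintptr_t)p & TAG_INT) != 0;
    }

    static inline void *tag_int(intptr_t value) {
        return (void *)(((uintptr_t)value << 1) | TAG_INT);
    }

    static inline intptr_t untag_int(void *p) {
        /* shift back, discarding the tag bit (assumes the usual
           two's-complement arithmetic shift behaviour) */
        return (intptr_t)(uintptr_t)p >> 1;
    }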
Idea: Multiple Python binaries ==============================
Instead of a single ``python3.7``, providing two or more binaries, as PyPy does, would make it easier to experiment with changes without breaking backward compatibility.
For example, ``python3.7`` would remain the default binary with reference counting and the current garbage collector, whereas ``fastpython3.7`` would not use reference counting and a new garbage collector.
It would allow breaking backward compatibility more quickly and make it even more explicit that only prepared C extensions will be compatible with the new ``fastpython3.7``.
The cost is having to maintain both indefinitely.

Ronald
On Wed, 12 Jul 2017 at 01:25 Ronald Oussoren <ronaldoussoren@mac.com> wrote:
On 11 Jul 2017, at 12:19, Victor Stinner <victor.stinner@gmail.com> wrote:
Hi,
This is the first draft of a big (?) project to prepare CPython to be able to "modernize" its implementation. The proposed changes should make it possible to make CPython more efficient in the future. The optimizations themselves are out of the scope of the PEP, but some examples are listed to explain why these changes are needed.
I’m not sure if hiding implementation details will help a lot w.r.t. making CPython more efficient, but cleaning up the public API would avoid accidentally depending on non-public information (and is sound engineering anyway). That said, a lot of care should be taken to avoid breaking existing extensions as the ease of writing extensions is one of the strong points of CPython.
I also think the motivation doesn't have to be performance but simply cleaning up how we expose our C APIs to users as shown by the fact we have messed up the stable API by making it opt-out instead of opt-in.
Plan made of multiple small steps =================================
Step 1: split Include/ into subdirectories ------------------------------------------
Split the ``Include/`` directory of CPython:
* ``python`` API: ``Include/Python.h`` remains the default C API
* ``core`` API: ``Include/core/Python.h`` is a new C API designed for building Python
* ``stable`` API: ``Include/stable/Python.h`` is the stable ABI
Looks good in principle. It is currently too easy to accidentally add to the stable ABI by forgetting to add ‘#if’ guards around a non-stable API.
Expect declarations to be duplicated on purpose: ``#include`` should not be used to include files from a different API, to prevent mistakes. In the past, too many functions were exposed *by mistake*, especially symbols exported to the stable ABI by mistake.
Not sure about this, shouldn’t it be possible to have ``python`` include ``core`` and ``core`` include ``stable``? This would avoid having to update multiple header files when adding new definitions.
Yeah, that's also what I initially thought. Use a cascading hierarchy so that people know they should put anything as high up as possible to minimize its exposure. [SNIP]
Step 3: first pass of implementation detail removal ---------------------------------------------------
Modify the ``python`` API:
* Add a new ``API`` subdirectory in the Python source code which will "implement" the Python C API
* Replace macros with functions. The implementation of new functions will be written in the ``API/`` directory. For example, Py_INCREF() becomes the function ``void Py_INCREF(PyObject *op)`` and its implementation will be written in the ``API`` directory.
In this particular case (Py_INCREF/DECREF) making them functions isn’t really useful and is likely to be harmful for performance. It is not useful because these macros manipulate state in a struct that must be public because that struct is included into the structs for custom objects (PyObject_HEAD). Having them as macros also doesn’t preclude moving to indirect reference counts. Moving to anything that isn’t reference counts likely needs changes to the API (but not necessarily, see PyPy’s cpyext).
I think Victor has long-term plans to try and hide the struct details at a higher-level and so that would make macros a bad thing. But ignoring the specific Py_INCREF/DECREF example, switching to functions does buy us the ability to actually change the function implementations between Python versions compared to having to worry about what a macro used to do (which is a possibility with the stable ABI).
* Slowly remove more and more implementation details from this API.
Modifications of this API should be driven by tests of popular third party packages like:
* Django with database drivers
* numpy
* scipy
* Pillow
* lxml
* etc.
Compilation errors on these extensions are expected. This step should help to draw a line for the backward incompatible change.
This could also help to find places where the documented API is not sufficient. One of the places where I poke directly into implementation details is a C-level subclass of str (PyUnicode_Type). I’d prefer not doing that, but AFAIK there is no other way to be string-like to the C API other than by being a subclass of str.
Yeah, this would allow us to very clearly know what should or should not be documented (I would say the same for the stdlib but we all know old code didn't hide things with a leading underscore consistently).
BTW. The reason I need to subclass str: in PyObjC I use a subclass of str to represent Objective-C strings (NSString/NSMutableString), and I need to keep track of the original value; mostly because there are some Objective-C APIs that use object identity. The worst part is that fully initialising the PyUnicodeObject fields often isn’t necessary as a lot of Objective-C strings aren’t used as strings in Python code.
Enhancements becoming possible thanks to a new C API ====================================================
Indirect Reference Counting ---------------------------
* Replace ``Py_ssize_t ob_refcnt;`` (integer) with ``Py_ssize_t *ob_refcnt;`` (pointer to an integer).
* Same change for GC headers?
* Store all reference counters in a separate memory block (or maybe multiple memory blocks)
This could be done right now with a minimal change to the API: just make the ob_refcnt and ob_type fields of the PyObject struct private by renaming them; in Py3 the documented way to access these fields is through function macros, and these could be changed to do indirect refcounting instead.
I think this is why Victor wants functions, because even if you change the names the macros will be locked into their implementations if you try to write code that supports multiple versions and so you can't change it per-version of Python. -Brett
2017-07-12 20:51 GMT+02:00 Brett Cannon <brett@python.org>:
I also think the motivation doesn't have to be performance but simply cleaning up how we expose our C APIs to users as shown by the fact we have messed up the stable API by making it opt-out instead of opt-in.
It's hard to sell a "cleanup" to users with no carrot :-) Does anyone remember trying to sell the "Python 3 cleanup"? :-)
Yeah, that's also what I initially thought. Use a cascading hierarchy so that people know they should put anything as high up as possible to minimize its exposure.
Yeah, maybe we can do that. I have to run my own experiment to make sure that #include doesn't leak symbols by mistake and that it's still possible to use optimized macros or functions in builtin modules.
I think Victor has long-term plans to try and hide the struct details at a higher-level and so that would make macros a bad thing. But ignoring the specific Py_INCREF/DECREF example, switching to functions does buy us the ability to actually change the function implementations between Python versions compared to having to worry about what a macro used to do (which is a possibility with the stable ABI).
I think that my PEP is currently badly written :-)

In fact, the idea is just to make the stable ABI usable :-) Instead of hiding structures *and* removing macros, my idea is just to hide structures but still provide macros... as functions. Basically, it will be the same API, but usable on more implementations of Python.

Victor
On 13 July 2017 at 21:46, Victor Stinner <victor.stinner@gmail.com> wrote:
2017-07-12 20:51 GMT+02:00 Brett Cannon <brett@python.org>:
I think Victor has long-term plans to try and hide the struct details at a higher-level and so that would make macros a bad thing. But ignoring the specific Py_INCREF/DECREF example, switching to functions does buy us the ability to actually change the function implementations between Python versions compared to having to worry about what a macro used to do (which is a possibility with the stable ABI).
I think that my PEP is currently badly written :-)
In fact, the idea is just to make the stable ABI usable :-) Instead of hiding structures *and* removing macros, my idea is just to hide structures but still provide macros... as functions. Basically, it will be the same API, but usable on more implementations of Python.
As far as I know, this isn't really why folks find the stable ABI hard to switch to. Rather, I believe it's because switching to the stable ABI means completely changing how you define classes to be closer to the way you define them from Python code.

That's why I like the idea of defining a "portable" API that *doesn't* adhere to the "no public structs" rule - if we can restore support for static class declarations (which requires exposing all the static method structs as well as the object header structs, although perhaps with obfuscated field names to avoid any dependency on the details of CPython's reference counting model), I think such an API would have dramatically lower barriers to adoption than the stable ABI does.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
2017-07-13 15:21 GMT+02:00 Nick Coghlan <ncoghlan@gmail.com>:
As far as I know, this isn't really why folks find the stable ABI hard to switch to. Rather, I believe it's because switching to the stable ABI means completely changing how you define classes to be closer to the way you define them from Python code.
That's why I like the idea of defining a "portable" API that *doesn't* adhere to the "no public structs" rule - if we can restore support for static class declarations (which requires exposing all the static method structs as well as the object header structs, although perhaps with obfuscated field names to avoid any dependency on the details of CPython's reference counting model), I think such an API would have dramatically lower barriers to adoption than the stable ABI does.
I am not aware of this issue. Can you give an example of a missing feature in the stable ABI? Or maybe an example of a class definition in C which cannot be implemented with the stable ABI?

Victor
On 14 July 2017 at 01:35, Victor Stinner <victor.stinner@gmail.com> wrote:
2017-07-13 15:21 GMT+02:00 Nick Coghlan <ncoghlan@gmail.com>:
As far as I know, this isn't really why folks find the stable ABI hard to switch to. Rather, I believe it's because switching to the stable ABI means completely changing how you define classes to be closer to the way you define them from Python code.
That's why I like the idea of defining a "portable" API that *doesn't* adhere to the "no public structs" rule - if we can restore support for static class declarations (which requires exposing all the static method structs as well as the object header structs, although perhaps with obfuscated field names to avoid any dependency on the details of CPython's reference counting model), I think such an API would have dramatically lower barriers to adoption than the stable ABI does.
I am not aware of this issue. Can you give an example of a missing feature in the stable ABI? Or maybe an example of a class definition in C which cannot be implemented with the stable ABI?
Pretty much all the type definitions in CPython except the ones in https://github.com/python/cpython/blob/master/Modules/xxlimited.c will fail on the stable ABI :)

It's not that they *can't* be ported to the stable ABI, it's that they *haven't* been, and there isn't currently any kind of code generator to automate the conversion process.

For the standard library, the lack of motivation comes from the fact that we recompile for every version anyway, so there's nothing specific to be gained from switching to compiling optional extension modules under the stable ABI instead of the default CPython API.

For third party projects, the problem is that they need to continue using static type declarations if they want to support Python 2.7, so using static type declarations for both Py2 and Py3 is a more attractive option than defining their types differently depending on the version.

As folks start dropping Python 2.7 support, *then* the stable ABI starts to become a more attractive option, as it should let them significantly reduce the number of wheels they publish to PyPI *without* having to maintain two different ways of defining types (assuming we redefine the stable ABI compatibility tags to let people specify a minimum required version that's higher than 3.2).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
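To make the contrast concrete, a rough sketch of the two styles (error handling and most slots omitted; the module and type names are made up):

    #include <Python.h>

    /* Traditional static declaration: relies on the full PyTypeObject
       layout, which the stable ABI does not expose. */
    static PyTypeObject Example_Type = {
        PyVarObject_HEAD_INIT(NULL, 0)
        "mymod.Example",            /* tp_name */
        sizeof(PyObject),           /* tp_basicsize */
        /* ... dozens more slots ... */
    };

    /* Stable ABI style: describe the type as data, build it at runtime. */
    static PyType_Slot example_slots[] = {
        /* {Py_tp_init, ...}, {Py_tp_methods, ...}, etc. */
        {0, NULL}
    };

    static PyType_Spec example_spec = {
        "mymod.Example",            /* name */
        sizeof(PyObject),           /* basicsize */
        0,                          /* itemsize */
        Py_TPFLAGS_DEFAULT,         /* flags */
        example_slots
    };

    /* In the module init function:
       PyObject *tp = PyType_FromSpec(&example_spec); */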
On 13 Jul 2017, at 13:46, Victor Stinner <victor.stinner@gmail.com> wrote:
2017-07-12 20:51 GMT+02:00 Brett Cannon <brett@python.org>:
I also think the motivation doesn't have to be performance but simply cleaning up how we expose our C APIs to users as shown by the fact we have messed up the stable API by making it opt-out instead of opt-in.
It's hard to sell a "cleanup" to users with no carrot :-) Does anyone remember trying to sell the "Python 3 cleanup"? :-)
But then there should actually be a carrot ;-). Just declaring the contents of object definitions private in the documentation could also help, especially when adding preprocessor guards to enable access to those definitions. Consenting adults, etc…
Yeah, that's also what I initially thought. Use a cascading hierarchy so that people know they should put anything as high up as possible to minimize its exposure.
Yeah, maybe we can do that.
I have to run my own experiment to make sure that #include doesn't leak symbols by mistake and that it's still possible to use optimized macros or functions in builtin modules.
Avoiding symbol leaks with a cascading hierarchy should be easy enough; some care may be needed to be able to override definitions in the “more private” headers, especially when making current macros available as functions in the more public headers. Although removing macros like PyTuple_GET_ITEM from the most public layer altogether could also be considered.
I think Victor has long-term plans to try and hide the struct details at a higher-level and so that would make macros a bad thing. But ignoring the specific Py_INCREF/DECREF example, switching to functions does buy us the ability to actually change the function implementations between Python versions compared to having to worry about what a macro used to do (which is a possibility with the stable ABI).
I think that my PEP is currently badly written :-)
In fact, the idea is just to make the stable ABI usable :-) Instead of hiding structures *and* removing macros, my idea is just to hide structures but still provide macros... as functions. Basically, it will be the same API, but usable on more implementations of Python.
It might be better to push users towards tools like ctypes and cffi; the latter especially is tuned to work with both CPython and PyPy and appears to be gaining momentum. That won’t work for everything, but could work for a large subset of extensions.

Ronald
On 12 Jul 2017, at 20:51, Brett Cannon <brett@python.org> wrote:
On Wed, 12 Jul 2017 at 01:25 Ronald Oussoren <ronaldoussoren@mac.com> wrote:
On 11 Jul 2017, at 12:19, Victor Stinner <victor.stinner@gmail.com> wrote:
Hi,
This is the first draft of a big (?) project to prepare CPython to be able to "modernize" its implementation. The proposed changes should make it possible to make CPython more efficient in the future. The optimizations themselves are out of the scope of the PEP, but some examples are listed to explain why these changes are needed.
I’m not sure if hiding implementation details will help a lot w.r.t. making CPython more efficient, but cleaning up the public API would avoid accidentally depending on non-public information (and is sound engineering anyway). That said, a lot of care should be taken to avoid breaking existing extensions as the ease of writing extensions is one of the strong points of CPython.
I also think the motivation doesn't have to be performance but simply cleaning up how we expose our C APIs to users as shown by the fact we have messed up the stable API by making it opt-out instead of opt-in.
I agree with this. […]
Step 3: first pass of implementation detail removal ---------------------------------------------------
Modify the ``python`` API:
* Add a new ``API`` subdirectory in the Python source code which will "implement" the Python C API
* Replace macros with functions. The implementation of new functions will be written in the ``API/`` directory. For example, Py_INCREF() becomes the function ``void Py_INCREF(PyObject *op)`` and its implementation will be written in the ``API`` directory.
In this particular case (Py_INCREF/DECREF) making them functions isn’t really useful and is likely to be harmful for performance. It is not useful because these macros manipulate state in a struct that must be public because that struct is included into the structs for custom objects (PyObject_HEAD). Having them as macros also doesn’t preclude moving to indirect reference counts. Moving to anything that isn’t reference counts likely needs changes to the API (but not necessarily, see PyPy’s cpyext).
I think Victor has long-term plans to try and hide the struct details at a higher-level and so that would make macros a bad thing. But ignoring the specific Py_INCREF/DECREF example, switching to functions does buy us the ability to actually change the function implementations between Python versions compared to having to worry about what a macro used to do (which is a possibility with the stable ABI).
I don’t understand. Moving to functions instead of macros for some things doesn’t really help with keeping the public API stable (for the non-stable ABI). Avoiding macros does help with keeping more of the object internals hidden, and possibly makes them easier to change within a major release, but doesn’t help (or hinder) changing the implementation of an API. AFAIK there is no API stability guarantee for the details of the struct definitions for object representation, which is why it was possible to change the dict representation for CPython 3.6, and the str representation earlier. I wouldn’t mind having to explicitly opt in to getting access to those internals, but removing them from public headers altogether does have a cost.
* Slowly remove more and more implementation details from this API.
Modifications of this API should be driven by tests of popular third party packages like:
* Django with database drivers
* numpy
* scipy
* Pillow
* lxml
* etc.
Compilation errors on these extensions are expected. This step should help to draw a line for the backward incompatible change.
This could also help to find places where the documented API is not sufficient. One of the places where I poke directly into implementation details is a C-level subclass of str (PyUnicode_Type). I’d prefer not doing that, but AFAIK there is no other way to be string-like to the C API other than by being a subclass of str.
Yeah, this would allow us to very clearly know what should or should not be documented (I would say the same for the stdlib but we all know old code didn't hide things with a leading underscore consistently).
I tried to write about how this could help to evolve the API by exposing documented APIs or features for things where extensions currently directly peek and poke into implementation details. Moving away from private stuff is a lot easier when there are sanctioned alternatives :-)
BTW. The reason I need to subclass str: in PyObjC I use a subclass of str to represent Objective-C strings (NSString/NSMutableString), and I need to keep track of the original value; mostly because there are some Objective-C APIs that use object identity. The worst part is that fully initialising the PyUnicodeObject fields often isn’t necessary as a lot of Objective-C strings aren’t used as strings in Python code.
Enhancements becoming possible thanks to a new C API ====================================================
Indirect Reference Counting ---------------------------
* Replace ``Py_ssize_t ob_refcnt;`` (integer) with ``Py_ssize_t *ob_refcnt;`` (pointer to an integer).
* Same change for GC headers?
* Store all reference counters in a separate memory block (or maybe multiple memory blocks)
This could be done right now with a minimal change to the API: just make the ob_refcnt and ob_type fields of the PyObject struct private by renaming them; in Py3 the documented way to access these fields is through function macros, and these could be changed to do indirect refcounting instead.
I think this is why Victor wants functions, because even if you change the names the macros will be locked into their implementations if you try to write code that supports multiple versions and so you can't change it per-version of Python.
I really don’t understand. The macros are part of the code for a version of Python and can be changed when necessary between Python versions; the only advantage of functions is that it’s easier to tweak the implementation in patch releases.

BTW. As I mentioned before, the PyObject struct is one that cannot be made private without major changes because that struct is included in all extension object definitions by way of PyObject_HEAD. But anyway, that’s just a particular example and doesn’t mean we cannot hide any implementation details.

Ronald

P.S. I’ve surfaced because I’m at EuroPython, and experience teaches that I’ll likely submerge again afterwards even if I’d prefer not to do so :-(
On Thu, 13 Jul 2017 at 09:12 Ronald Oussoren <ronaldoussoren@mac.com> wrote:
On 12 Jul 2017, at 20:51, Brett Cannon <brett@python.org> wrote:
On Wed, 12 Jul 2017 at 01:25 Ronald Oussoren <ronaldoussoren@mac.com> wrote:
On 11 Jul 2017, at 12:19, Victor Stinner <victor.stinner@gmail.com> wrote:
[SNIP]
Step 3: first pass of implementation detail removal ---------------------------------------------------
Modify the ``python`` API:
* Add a new ``API`` subdirectory in the Python source code which will "implement" the Python C API
* Replace macros with functions. The implementation of new functions will be written in the ``API/`` directory. For example, Py_INCREF() becomes the function ``void Py_INCREF(PyObject *op)`` and its implementation will be written in the ``API`` directory.
In this particular case (Py_INCREF/DECREF) making them functions isn’t really useful and is likely to be harmful for performance. It is not useful because these macros manipulate state in a struct that must be public, because that struct is included into the structs for custom objects (PyObject_HEAD). Having them as macros also doesn’t preclude moving to indirect reference counts. Moving to anything that isn’t reference counts likely needs changes to the API (but not necessarily, see PyPy’s cpyext).
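For readers less familiar with the C API, this is the pattern Ronald is referring to: every extension type starts with ``PyObject_HEAD``, so the size and field offsets of ``PyObject`` are compiled into every extension module (``MyObject`` is an illustrative name)::

    #include <Python.h>

    typedef struct {
        PyObject_HEAD      /* expands to a full PyObject, ob_refcnt included */
        double value;      /* extension-specific fields follow the header */
    } MyObject;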
I think Victor has long-term plans to try to hide the struct details at a higher level, and that would make macros a bad thing. But ignoring the specific Py_INCREF/DECREF example, switching to functions does buy us the ability to actually change the function implementations between Python versions, compared to having to worry about what a macro used to do (which is a possibility with the stable ABI).
I don’t understand. Moving to functions instead of macros for some things doesn’t really help with keeping the public API stable (for the non-stable ABI).
Sorry, I didn't specify which ABI/API I was talking about; my point was from the stable ABI. I think this is quickly showing how naming is going to play into this, since e.g. we say "stable ABI" but call it "Py_LIMITED_API" in the code, which is rather confusing. Just to make sure I'm not missing anything, it seems we have a few levels here:

1. The stable A**B**I which is compatible across versions
2. A stable A**P**I which hides enough details that if we change a struct your code won't require an update, just a recompile
3. An API that exposes CPython-specific details such as structs and other details that might not be entirely portable to e.g. PyPy easily, but that we try not to break
4. An internal API that we use for implementing the interpreter but don't expect anyone else to use, so we can break it between feature releases (although if e.g. Cython chooses to use it they can)

(There's also an API local to a single file, but since that is never exported to the linker it doesn't come into play here.)

So, a portable API/ABI, a stable API, a CPython API, and then an internal/core/interpreter API. Correct?
On 14 July 2017 at 02:29, Brett Cannon <brett@python.org> wrote:
On Thu, 13 Jul 2017 at 09:12 Ronald Oussoren <ronaldoussoren@mac.com> wrote:
I don’t understand. Moving to functions instead of macros for some things doesn’t really help with keeping the public API stable (for the non-stable ABI).
Sorry, I didn't specify which ABI/API I was talking about; my point was from the stable ABI.
I think this is quickly showing how naming is going to play into this since e.g. we say "stable ABI" but call it "Py_LIMITED_API" in the code which is rather confusing.
I honestly think we should just change that symbol to Py_STABLE_ABI (with Py_LIMITED_API retained as a backwards compatibility feature). Yes, Py_LIMITED_API is technically more correct, but it's confusing in practice, while "Py_STABLE_ABI" matches the user's intent: "make sure my binaries only depend on the stable ABI".
Just to make sure I'm not missing anything, it seems we have a few levels here:
1. The stable A**B**I which is compatible across versions
2. A stable A**P**I which hides enough details that if we change a struct your code won't require an update, just a recompile
I don't think we want to promise that the portable API will be completely backwards compatible over time - unlike the stable ABI, it should be subject to Python's normal deprecation policy (i.e. if an API emits a deprecation warning in X.Y, we may remove it entirely in X.Y+1).

Instead, I think the key promises of the portable API should be:

1. It only exposes interfaces that are genuinely portable across at least CPython and PyPy
2. It adheres as closely to the stable ABI as it can, with additions made *solely* to support the building of existing popular extension modules (e.g. by adding back static type declaration support)
3. An API that exposes CPython-specific details such as structs and other details that might not be entirely portable to e.g. PyPy easily, but that we try not to break
4. An internal API that we use for implementing the interpreter but don't expect anyone else to use, so we can break it between feature releases (although if e.g. Cython chooses to use it they can)
(There's also an API local to a single file, but since that is never exported to the linker it doesn't come into play here.)
So, a portable API/ABI, a stable API, a CPython API, and then an internal/core/interpreter API. Correct?
Not quite:

- stable ABI (strict extension module compatibility policy)
- portable API (no ABI stability guarantees, normal deprecation policy)
- public CPython API (no cross-implementation portability guarantees)
- internal-only CPython core API (arbitrary changes, no deprecation warnings)

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Victor Stinner schrieb am 11.07.2017 um 12:19:
Split the ``Include/`` directory of CPython:
* ``python`` API: ``Include/Python.h`` remains the default C API
* ``core`` API: ``Include/core/Python.h`` is a new C API designed for building Python
* ``stable`` API: ``Include/stable/Python.h`` is the stable ABI

[...]

Step 3: first pass of implementation detail removal
---------------------------------------------------
Modify the ``python`` API:
* Add a new ``API`` subdirectory in the Python source code which will "implement" the Python C API
* Replace macros with functions. The implementation of new functions will be written in the ``API/`` directory. For example, Py_INCREF() becomes the function ``void Py_INCREF(PyObject *op)`` and its implementation will be written in the ``API`` directory.
* Slowly remove more and more implementation details from this API.
From a Cython perspective, it's (not great but) ok if these "implementation details" were moved somewhere else, but it would be a problem if they became entirely unavailable for external modules. Cython uses some of the internals for performance reasons, and we adapt it to changes of these internals whenever necessary.
The question then arises whether this proposal fulfills its intended purpose if Cython-based tools like NumPy or lxml continue to use internal implementation details in their Cython-generated C code. Specifically because that code is generated, I find it acceptable that it actively exploits non-portable details, because it already takes care of adapting to different Python platforms anyway. Cython has incorporated support for CPython, PyPy and Pyston that way; adding others is probably not difficult, and optimising for a specific one (usually CPython) is also easy.

The general rule of thumb in Cython core development is that it's ok to exploit internals as long as there is a generic fallback through some C-API operations which can be used in other Python implementations. I'd be happy if that continued to be supported by CPython in the future. Exposing CPython internals is a good thing! :)

Stefan
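A small sketch of that rule of thumb in C terms; the ``USE_CPYTHON_INTERNALS`` guard is hypothetical, but the two call forms are real CPython APIs (the macro skips error checking, the function does not)::

    #include <Python.h>

    static PyObject *
    get_first_item(PyObject *list)
    {
    #ifdef USE_CPYTHON_INTERNALS
        /* Fast path: direct macro access to the list's internal array. */
        PyObject *item = PyList_GET_ITEM(list, 0);
        Py_INCREF(item);
        return item;
    #else
        /* Generic fallback: error-checked public C-API call. */
        PyObject *item = PyList_GetItem(list, 0);  /* borrowed reference */
        Py_XINCREF(item);
        return item;
    #endif
    }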
On 13 July 2017 at 08:23, Stefan Behnel <stefan_ml@behnel.de> wrote:
The general rule of thumb in Cython core development is that it's ok to exploit internals as long as there is a generic fallback through some C-API operations which can be used in other Python implementations. I'd be happy if that continued to be supported by CPython in the future. Exposing CPython internals is a good thing! :)
+1

This is my major motivation for suggesting "Include/cpython/" as the directory for the header files that define a supported API that is specific to CPython - it helps make it clear to other implementations that it's OK to go beyond the portable Python C API, but such API extensions should be clearly flagged as implementation specific so that consumers can make an informed decision as to which level they want to target.

I do want to revise my naming suggestions slightly though: I think it would make sense for the internal APIs (the ones already guarded by Py_BUILD_CORE) to be under "Include/_core/", where the leading underscore helps to emphasise "if you are not working on CPython itself, you should not be going anywhere near these header files".

I think the other key point to clarify will be API versioning, since that will flow through to things like the C ABI compatibility tags in the binary wheel format. Currently [1], that looks like:

    cp35m    # Specifically built for CPython 3.5 with PyMalloc
    cp35dm   # Debugging enabled
    cp3_10m  # Disambiguation uses underscores
    pp18     # It's the implementation version, not the Python version

There's currently only one tag for the stable ABI:

    abi3     # Built for the stable ABI as of Python 3.2

So I think the existing Py_LIMITED_API/stable ABI is the right place for the strict "No public structs!" policy that completely decouples extension modules from CPython internals. We'll just need to refine the definition of the compatibility tags so that folks can properly indicate the minimum required version of that API:

    abi3     # Py_LIMITED_API=0x03020000
    abi32    # Py_LIMITED_API=0x03020000
    abi33    # Py_LIMITED_API=0x03030000
    abi34    # Py_LIMITED_API=0x03040000
    abi35    # Py_LIMITED_API=0x03050000
    etc...

Where Py_PORTABLE_API would come in is that it could be less strict on the "no public structs" rule (allowing some structs to be exposed as needed to enable building key projects like NumPy and lxml), and instead represent an API that offered source code and extension module portability across Python implementations, rather than strict ABI stability across versions:

    api37    # Py_PORTABLE_API=0x03070000
    api38    # Py_PORTABLE_API=0x03080000
    api39    # Py_PORTABLE_API=0x03090000
    api3_10  # Py_PORTABLE_API=0x030A0000
    etc...

Cheers, Nick.

[1] https://www.python.org/dev/peps/pep-0425/#details

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
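For reference, this is how an extension opts into the stable ABI today, and how the suggested ``Py_PORTABLE_API`` opt-in might look by analogy (the latter define is hypothetical, not part of CPython)::

    /* Stable ABI: define Py_LIMITED_API before including Python.h,
     * set to the oldest Python version that must be supported. */
    #define Py_LIMITED_API 0x03020000   /* stable ABI as of Python 3.2 */
    #include <Python.h>

    /* Hypothetical portable-API opt-in, per the suggestion above: */
    /* #define Py_PORTABLE_API 0x03070000 */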
2017-07-13 0:23 GMT+02:00 Stefan Behnel <stefan_ml@behnel.de>:
From a Cython perspective, it's (not great but) ok if these "implementation details" were moved somewhere else, but it would be a problem if they became entirely unavailable for external modules. Cython uses some of the internals for performance reasons, and we adapt it to changes of these internals whenever necessary.
I don't want to break the Python world, or my project will just fail. For me, it's ok if Cython or even numpy use the full CPython C API with structures and macros to get the best performance. But we need something like PEP 399 for C extensions: https://www.python.org/dev/peps/pep-0399/

The best would be if C extensions had two compilation modes:

* "Optimize for CPython" (with implementation details)
* "Use the smaller portable C API" (no implementation details)

For example, use the new private Python 3.6 _PyObject_FastCall() if available, but fall back on PyObject_Call() (or another similar function) otherwise; see the sketch below. Once we are able to compile in the two "modes", it becomes possible to run benchmarks and decide if it's worth it.

For extensions written with Cython, I expect that Cython will take care of that. The problem is more for C code written manually. The best would be to limit "optimized code" as much as possible and mostly write "portable" code.

Victor
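A minimal sketch of that fallback pattern, assuming CPython 3.6's private ``_PyObject_FastCall()``; the helper name is illustrative::

    #include <Python.h>

    /* Call func(arg): use the CPython-specific fastcall when available,
     * otherwise the portable PyObject_Call() with a temporary tuple. */
    static PyObject *
    call_one_arg(PyObject *func, PyObject *arg)
    {
    #if PY_VERSION_HEX >= 0x03060000 && !defined(Py_LIMITED_API)
        PyObject *args[1] = {arg};
        return _PyObject_FastCall(func, args, 1);  /* no tuple allocation */
    #else
        PyObject *tuple = PyTuple_Pack(1, arg);
        if (tuple == NULL)
            return NULL;
        PyObject *result = PyObject_Call(func, tuple, NULL);
        Py_DECREF(tuple);
        return result;
    #endif
    }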