[…]On 10 Apr 2020, at 19:20, Victor Stinner <vstinner@python.org> wrote:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++
PEP xxx: Modify the C API to hide implementation details
++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Abstract
========
* Hide implementation details from the C API to be able to `optimize
CPython`_ and make PyPy more efficient.
* The expectation is that `most C extensions don't rely directly on
CPython internals`_ and so will remain compatible.
* Continue to support old unmodified C extensions by continuing to
provide the fully compatible "regular" CPython runtime.
* Provide a `new optimized CPython runtime`_ using the same CPython code
base: faster but can only import C extensions which don't use
implementation details. Since both CPython runtimes share the same
code base, features implemented in CPython will be available in both
runtimes.
* `Stable ABI`_: Only build a C extension once and use it on multiple
Python runtimes and different versions of the same runtime.
* Better advertise alternative Python runtimes and communicate more
clearly about the differences between the Python language and the Python
implementation (especially CPython).
Note: Cython and cffi should be preferred to write new C extensions.
This PEP is about existing C extensions which cannot be rewritten with
Cython.
API and ABI incompatible changes
--------------------------------
* Make structures opaque: move them to the internal C API.
* Remove functions from the public C API which are tied to CPython
internals. Maybe begin by marking these functions as private (rename
``PyXXX`` to ``_PyXXX``) or move them to the internal C API.
* Ban statically allocated types (by making ``PyTypeObject`` opaque):
enforce usage of ``PyType_FromSpec()``.
Examples of issues opened to make structures opaque:
* ``PyGC_Head``: https://bugs.python.org/issue40241
* ``PyObject``: https://bugs.python.org/issue39573
* ``PyTypeObject``: https://bugs.python.org/issue40170
* ``PyThreadState``: https://bugs.python.org/issue39573
Another example is the ``Py_REFCNT()`` and ``Py_TYPE()`` macros, which can
currently be used as l-values to modify an object's reference count or type.
Python 3.9 has new ``Py_SET_REFCNT()`` and ``Py_SET_TYPE()`` macros
which should be used instead. The ``Py_REFCNT()`` and ``Py_TYPE()`` macros
should be converted to static inline functions to prevent their use as
l-values.
**Backward compatibility:** backward incompatible on purpose. Break the
limited C API and the stable ABI, with the assumption that `Most C
extensions don't rely directly on CPython internals`_ and so will remain
compatible.
CPython specific behavior
=========================
Some C functions and some Python functions have a behavior which is
closely tied to the current CPython implementation.
is operator
-----------
The "x is y" operator is closely tied to how CPython allocates objects
and to ``PyObject*``.
For example, CPython uses singletons for numbers in [-5; 256] range::

    >>> x=1; (x + 1) is 2
    True
    >>> x=1000; (x + 1) is 1001
    False

The Python 3.8 compiler now emits a ``SyntaxWarning`` when the right operand
of the ``is`` or ``is not`` operator is a literal (ex: integer or
string), but doesn't warn if it is the ``None``, ``True``, ``False`` or
``Ellipsis`` singleton (`bpo-34850
<https://bugs.python.org/issue34850>`_). Example::

    >>> x=1
    >>> x is 1
    <stdin>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
    True

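
A minimal sketch of the portable alternative: compare values with ``==``
and keep ``is`` for singletons such as ``None``. The helper names
``same_value`` and ``is_missing`` are hypothetical and only serve the
illustration. The integers are built at run time so the result doesn't
depend on which objects a given runtime happens to cache.

```python
def same_value(a, b):
    """Compare by value; the result does not depend on object identity."""
    return a == b


def is_missing(value):
    """Identity test against the None singleton, where 'is' is appropriate."""
    return value is None


x = 1000
y = int("1001")  # built at run time rather than written as a literal

print(same_value(x + 1, y))
print(is_missing(None), is_missing(0))
```

Code written this way behaves the same whether or not the runtime
allocates a separate object for each integer.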
Use Cases
=========
Optimize CPython
----------------
The new optimized runtime can implement new optimizations since it only
supports C extension modules which don't access Python internals.
Tagged pointers
...............
Use `tagged pointers <https://en.wikipedia.org/wiki/Tagged_pointer>`_:
avoid ``PyObject`` for small objects (ex: small integers, short Latin-1
strings, None and True/False singletons) by storing the content directly in
the pointer, with a tag for the object type.
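
The encoding can be sketched in Python by modelling a pointer as a 64-bit
word. This is an illustration under stated assumptions, not the scheme
CPython or PyPy would use: the word size, the choice of the low bit as the
"small integer" tag, and all function names are hypothetical. It also
assumes real object pointers are aligned, so their low bit is always 0.

```python
WORD_BITS = 64                      # assumed machine word size
WORD_MASK = (1 << WORD_BITS) - 1
INT_TAG = 0b1                       # hypothetical tag: low bit set = small integer
PAYLOAD_BITS = WORD_BITS - 1        # bits left for the integer value
MIN_SMALL = -(1 << (PAYLOAD_BITS - 1))
MAX_SMALL = (1 << (PAYLOAD_BITS - 1)) - 1


def tag_small_int(value):
    """Pack a small integer into a word: payload in the high bits, tag in bit 0."""
    if not MIN_SMALL <= value <= MAX_SMALL:
        raise OverflowError("value needs a heap-allocated object")
    return ((value << 1) | INT_TAG) & WORD_MASK


def is_tagged_int(word):
    """Aligned pointers have a zero low bit, so a set low bit marks an integer."""
    return (word & 1) == INT_TAG


def untag_small_int(word):
    """Recover the integer, sign-extending the payload."""
    payload = word >> 1
    if payload & (1 << (PAYLOAD_BITS - 1)):   # sign bit of the payload is set
        payload -= 1 << PAYLOAD_BITS
    return payload


word = tag_small_int(-5)
print(is_tagged_int(word), untag_small_int(word))
```

With such a scheme no memory is allocated for the integer, which is why C
extensions must not assume that every ``PyObject*`` can be dereferenced.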