On 10 Apr 2020, at 19:20, Victor Stinner <vstinner@python.org> wrote:

[…]


++++++++++++++++++++++++++++++++++++++++++++++++++++++++
PEP xxx: Modify the C API to hide implementation details
++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Abstract
========

* Hide implementation details from the C API to be able to `optimize
 CPython`_ and make PyPy more efficient.
* The expectation is that `most C extensions don't rely directly on
 CPython internals`_ and so will remain compatible.
* Continue to support old unmodified C extensions by continuing to
 provide the fully compatible "regular" CPython runtime.
* Provide a `new optimized CPython runtime`_ using the same CPython code
 base: faster but can only import C extensions which don't use
 implementation details. Since both CPython runtimes share the same
 code base, features implemented in CPython will be available in both
 runtimes.
* `Stable ABI`_: Only build a C extension once and use it on multiple
 Python runtimes and different versions of the same runtime.
* Better advertise alternative Python runtimes and better communicate on
 the differences between the Python language and the Python
 implementation (especially CPython).

Note: Cython and cffi should be preferred for writing new C extensions.

I’m too old… I still prefer the CPython ABI over the other two, mostly because that’s what I know best but also to reduce dependencies.

This PEP is about existing C extensions which cannot be rewritten with
Cython.

I’m not sure what this PEP proposes beyond “let’s make the stable ABI the default API” and providing a mechanism to get access to the current API. I guess the proposal also expands the scope of the stable ABI: some internals that are currently exposed in the stable ABI would no longer be.

I’m not opposed to this as long as it is still possible to use the current API, possibly with clean-ups and correctness fixes. As you write, the CPython API has some features that make writing correct code harder, in particular the concept of borrowed references. There are still good reasons to want to be as close to the metal as possible, both to get maximal performance and to accomplish things that aren’t possible using the stable ABI.

[…]

API and ABI incompatible changes
--------------------------------

* Make structures opaque: move them to the internal C API.
* Remove functions from the public C API which are tied to CPython
 internals. Maybe begin by marking these functions as private (rename
 ``PyXXX`` to ``_PyXXX``) or move them to the internal C API.
* Ban statically allocated types (by making ``PyTypeObject`` opaque):
 enforce usage of ``PyType_FromSpec()``.

Examples of issues to make structures opaque:

* ``PyGC_Head``: https://bugs.python.org/issue40241
* ``PyObject``: https://bugs.python.org/issue39573
* ``PyTypeObject``: https://bugs.python.org/issue40170
* ``PyThreadState``: https://bugs.python.org/issue39573

Another example is the ``Py_REFCNT()`` and ``Py_TYPE()`` macros, which can
currently be used as l-values to modify an object's reference count or type.
Python 3.9 has new ``Py_SET_REFCNT()`` and ``Py_SET_TYPE()`` macros
which should be used instead. The ``Py_REFCNT()`` and ``Py_TYPE()`` macros
should be converted to static inline functions to prevent their usage as
l-values.
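
The difference between the macro form and the static inline form can be sketched in plain C. This is only an illustration under stated assumptions: ``mock_object``, ``MOCK_REFCNT_MACRO``, ``mock_refcnt`` and ``mock_set_refcnt`` are hypothetical stand-ins, not the real CPython definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for an object header; NOT the real PyObject. */
typedef struct {
    ptrdiff_t refcnt;
} mock_object;

/* Macro form: expands to a struct member, so it can be assigned to. */
#define MOCK_REFCNT_MACRO(ob) (((mock_object *)(ob))->refcnt)

/* Function form: returns a copy of the value, so the call is not an l-value. */
static inline ptrdiff_t mock_refcnt(const mock_object *ob)
{
    return ob->refcnt;
}

/* Explicit setter, playing the role that Py_SET_REFCNT() plays in the text. */
static inline void mock_set_refcnt(mock_object *ob, ptrdiff_t refcnt)
{
    ob->refcnt = refcnt;
}

static ptrdiff_t demo_refcnt(void)
{
    mock_object ob = { 1 };
    MOCK_REFCNT_MACRO(&ob) = 2;   /* compiles: the macro is an l-value */
    /* mock_refcnt(&ob) = 3; */   /* would be rejected by the compiler */
    mock_set_refcnt(&ob, 3);      /* the supported way to modify the field */
    return mock_refcnt(&ob);
}
```

With the function form, every write has to go through the setter, which leaves the runtime free to change how the field is stored.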

**Backward compatibility:** backward incompatible on purpose. Break the
limited C API and the stable ABI, with the assumption that `Most C
extensions don't rely directly on CPython internals`_ and so will remain
compatible.

This is definitely backward incompatible in a way that affects all extensions defining types without using PyType_Spec, due to having PyObject and PyTypeObject in the list. I wonder how large a percentage of existing extensions is affected by this.

Making “PyObject” opaque will also affect the stable ABI, because even types defined using the PyType_Spec API embed a “PyObject” value in the structure defining the instance layout. It is easy enough to change this in a way that preserves source-code compatibility, but I’m not sure it is possible to avoid breaking the stable ABI.
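
A rough sketch of why the embedded header matters for the ABI, using simplified mock structures (``header_v1``, ``header_v2``, ``point_v1`` and ``point_v2`` are hypothetical names, not CPython types). The extension source is the same in both cases, but the field offset that ends up compiled into the binary depends on the size of the header:

```c
#include <stddef.h>

/* Hypothetical object headers; simplified stand-ins, NOT the real PyObject. */
typedef struct { ptrdiff_t refcnt; void *type; } header_v1;
typedef struct { ptrdiff_t refcnt; void *type; void *extra; } header_v2;

/* Instance layout of an extension type compiled against header_v1. */
typedef struct { header_v1 head; int value; } point_v1;

/* The same source recompiled after the runtime changed its header. */
typedef struct { header_v2 head; int value; } point_v2;

/* Returns 1 when the offset of "value" differs between the two builds. */
static int layout_changed(void)
{
    return offsetof(point_v1, value) != offsetof(point_v2, value);
}
```

A binary built against the first layout would read ``value`` at the wrong offset under the second, which is the kind of breakage meant above.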

BTW, this will require growing the PyType_Spec ABI a little: there are features you cannot implement using that API, for example the buffer protocol.

[…]


CPython specific behavior
=========================

Some C functions and some Python functions have a behavior which is
closely tied to the current CPython implementation.

is operator
-----------

The "x is y" operator is closely tied to how CPython allocates objects
and to ``PyObject*``.

For example, CPython uses singletons for numbers in the [-5; 256] range::

    >>> x=1; (x + 1) is 2
    True
    >>> x=1000; (x + 1) is 1001
    False

The Python 3.8 compiler now emits a ``SyntaxWarning`` when the right operand
of the ``is`` and ``is not`` operators is a literal (ex: integer or
string), but doesn't warn if it is the ``None``, ``True``, ``False`` or
``Ellipsis`` singleton (`bpo-34850
<https://bugs.python.org/issue34850>`_). Example::

    >>> x=1
    >>> x is 1
    <stdin>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
    True

That’s not really something for the C API, and code relying on the small integer cache is IMHO buggy as it is (even without considering different Python implementations). Is this a problem for alternative implementations? My gut feeling is that it shouldn’t be: an implementation using tagged pointers for smallish integers could just behave as if all smallish integers are singletons.


[…]


Use Cases
=========

Optimize CPython
----------------

The new optimized runtime can implement new optimizations since it only
supports C extension modules which don't access Python internals.

Tagged pointers
...............

`Tagged pointer <https://en.wikipedia.org/wiki/Tagged_pointer>`_.

Avoid ``PyObject`` for small objects (ex: small integers, short Latin-1
strings, None and True/False singletons): store the content directly in
the pointer, with a tag for the object type.

Isn’t that already possible with the current API contract (when ignoring the stable ABI)? You’re already supposed to use accessor macros to access and modify attributes in the PyObject structure, and those can be modified to do something else for tagged pointers. Anyone not using the accessor macros would have to adjust, but that’s something that can happen regardless (even if we’re more and more careful not to introduce unnecessary breaking changes).
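
To make the idea concrete, here is a minimal sketch of accessors that hide a tagged-pointer representation. Everything in it is an assumption for illustration: the tag scheme (lowest bit set means “small integer stored in the word”) and the names ``obj_ref``, ``boxed_int``, ``int_value`` and ``incref`` are hypothetical and are not CPython API.

```c
#include <stdint.h>

/* Hypothetical heap-allocated integer; NOT a CPython structure. */
typedef struct { intptr_t refcnt; long value; } boxed_int;

/* An object reference is a machine word: a real pointer or a tagged value. */
typedef uintptr_t obj_ref;

/* Assumption: heap objects are at least 2-byte aligned, so bit 0 is free. */
#define TAG_SMALL_INT ((uintptr_t)1)

static inline int is_small_int(obj_ref ref)
{
    return (ref & TAG_SMALL_INT) != 0;
}

static inline obj_ref make_small_int(intptr_t v)
{
    return ((uintptr_t)v << 1) | TAG_SMALL_INT;
}

static inline obj_ref make_boxed(boxed_int *p)
{
    return (uintptr_t)p;
}

/* Accessor: callers never look inside the reference themselves. */
static inline long int_value(obj_ref ref)
{
    if (is_small_int(ref)) {
        /* Assumes an arithmetic right shift for negative values, which is
           implementation-defined in C but common on mainstream compilers. */
        return (long)((intptr_t)ref >> 1);
    }
    return ((boxed_int *)ref)->value;
}

/* Reference counting becomes a no-op for tagged values. */
static inline void incref(obj_ref ref)
{
    if (!is_small_int(ref)) {
        ((boxed_int *)ref)->refcnt++;
    }
}

/* Identity: two tagged references to the same small integer are equal
   words, so they behave as if the value were a singleton. */
static inline int same_object(obj_ref a, obj_ref b)
{
    return a == b;
}
```

Code that only goes through ``int_value()`` and ``incref()`` keeps working whichever representation is used; code that dereferences the reference directly is what would have to adjust.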

[…]


Ronald


Twitter / micro.blog: @ronaldoussoren
Blog: https://blog.ronaldoussoren.net/