[Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

Victor Stinner vstinner at redhat.com
Tue Nov 13 04:31:35 EST 2018


On Tue, Nov 13, 2018 at 08:13, Gregory P. Smith <greg at krypto.org> wrote:
> When things have only ever been macros (Py_INCREF, etc) the name can be reused if there has never been a function of that name in an old C API.  But beware of reuse for anything where the semantics change to avoid misunderstandings about behavior from people familiar with the old API or googling API names to look up behavior.

My plan is to keep an existing function only if it has no flaw. If it
has a flaw, it should be removed and maybe replaced with a new
function (or we suggest a replacement using existing APIs). I don't
want a function to behave differently depending on whether it's used
through the "old" or the "new" API. My plan reuses the same code base:
I don't want to put the whole body of a function inside a
"#ifdef NEWCAPI".
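
A minimal sketch of how that could look, assuming the opt-in switch
only controls which declarations are visible. The macro name NEWCAPI
comes from the paragraph above; api_sum() and api_item_address() are
hypothetical and are not real CPython functions.

```c
#include <assert.h>
#include <stddef.h>

/* Opt-in switch. The macro name is taken from the message; the functions
   below are hypothetical and exist only for this sketch. */
#define NEWCAPI

/* No flaw: declared for both APIs, with a single shared body. */
int api_sum(const int *items, size_t count);

#ifndef NEWCAPI
/* Flawed (hands out a pointer into internal storage): the declaration is
   hidden from code that opts in, but the body below is not touched. */
int *api_item_address(int *items, size_t index);
#endif

/* In a real layout the declarations above would sit in a header and the
   definitions below in the library's own .c files. */
int api_sum(const int *items, size_t count)
{
    int total = 0;
    assert(items != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        total += items[i];
    }
    return total;
}

/* Still compiled unchanged, so code using the old API keeps working. */
int *api_item_address(int *items, size_t index)
{
    return &items[index];
}
```

The #ifdef wraps one declaration, never a function body, so both APIs
share the same implementation.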


> I suspect optimizing for ease of transition from code written to the existing C API to the new API by keeping names the same is the wrong thing to optimize for.

Not all functions in the current C API are bad. Many functions are
just fine. For example, PyObject_GetAttr() returns a strong reference.
I don't see anything wrong with this API. Only a small portion of the
C API is "bad".
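
To illustrate what "returns a strong reference" means, here is a toy
reference-counting model. It is not CPython code: every name in it is
hypothetical, and the object is never actually freed. The point is only
that the caller's reference stays valid after the container drops its
own.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not CPython: every name here is hypothetical. */
typedef struct {
    int refcount;   /* number of strong references */
    int value;
} toy_object;

typedef struct {
    toy_object *attr;   /* the holder owns one strong reference */
} toy_holder;

static void toy_incref(toy_object *o) { o->refcount++; }

static void toy_decref(toy_object *o)
{
    assert(o->refcount > 0);
    o->refcount--;      /* a real runtime would free the object at 0 */
}

/* Strong reference: the caller gets its own count and must release it. */
toy_object *toy_getattr(toy_holder *h)
{
    if (h->attr == NULL) {
        return NULL;
    }
    toy_incref(h->attr);
    return h->attr;
}

/* The holder drops its reference; the caller's strong reference survives. */
void toy_clearattr(toy_holder *h)
{
    if (h->attr != NULL) {
        toy_decref(h->attr);
        h->attr = NULL;
    }
}

/* The caller gives its strong reference back. */
void toy_release(toy_object *o) { toy_decref(o); }
```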


> Using entirely new names may actually be a good thing as it makes it immediately clear which way a given piece of code is written. It'd also be good for PyObject* the old C API thing to be a different type from PythonHandle* (a new API thing whose name I just made up) such that they could not be passed around and exchanged for one another without a compiler complaint.  Code written using both APIs should not be allowed to transit objects directly between different APIs.

On Windows, a HANDLE is effectively just an opaque integer, not a
pointer you can follow (even though the type is declared as a pointer).
If a handle is a real pointer, some developers may want to dereference
it, whereas it must really be a dummy integer. Consider tagged
pointers: you don't want to dereference a tagged pointer. But no, I
don't plan to replace "PyObject*". Again, I want to reduce the number
of changes. If the PyObject structure is not exposed, I don't think
that it's an issue to keep the "PyObject*" type.

Example:
---
#include <stddef.h>

typedef struct _object PyObject;

PyObject* dummy(void)
{
    return (PyObject *)NULL;
}

int main()
{
    PyObject *obj = dummy();
    return obj->ob_type;
}
---

This program is valid, except for the single line which attempts to
dereference PyObject*:

x.c: In function 'main':
x.c:13:15: error: dereferencing pointer to incomplete type 'PyObject
{aka struct _object}'
     return obj->ob_type;
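
For contrast with the opaque pointer above, the integer-handle
alternative mentioned earlier could be sketched as follows. This is
only an illustration under assumed names (handle_t, object_new(),
object_get_value(), object_close()); it is neither a CPython nor a
Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: this is not a CPython or Windows API. */
typedef uintptr_t handle_t;     /* an integer: nothing to dereference */

#define MAX_OBJECTS 16

/* The real structure stays private to the runtime. */
struct object {
    int in_use;
    int value;
};

static struct object table[MAX_OBJECTS];

/* Create an object and return its handle; 0 means failure. */
handle_t object_new(int value)
{
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (!table[i].in_use) {
            table[i].in_use = 1;
            table[i].value = value;
            return (handle_t)(i + 1);   /* slot + 1, so 0 stays invalid */
        }
    }
    return 0;
}

/* All access goes through functions; -1 signals an invalid handle. */
int object_get_value(handle_t h, int *out)
{
    assert(out != NULL);
    if (h == 0 || h > MAX_OBJECTS || !table[h - 1].in_use) {
        return -1;
    }
    *out = table[h - 1].value;
    return 0;
}

/* Release the slot; later lookups with this handle fail cleanly. */
void object_close(handle_t h)
{
    if (h != 0 && h <= MAX_OBJECTS) {
        table[h - 1].in_use = 0;
    }
}
```

A stale handle is rejected by the lookup, whereas a stale pointer would
be dereferenced silently.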

If I could restart from scratch, I would design the C API differently.
For example, I'm not sure that I would use "global variables" (the
Python thread state) to store the current exception. I would use
something similar to Rust's error handling:
https://doc.rust-lang.org/book/first-edition/error-handling.html
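
A rough C sketch of that style, where the error travels with the return
value instead of being stored in per-thread state. The int_result type
and the checked_div() and int_result_unwrap() functions are hypothetical
names invented for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical result type: success or failure is part of the value. */
typedef enum { RESULT_OK, RESULT_ERR } result_tag;

typedef struct {
    result_tag tag;
    int value;          /* meaningful only when tag == RESULT_OK */
    const char *error;  /* meaningful only when tag == RESULT_ERR */
} int_result;

/* Divide, reporting failure through the result, not a global. */
int_result checked_div(int num, int den)
{
    int_result r = { RESULT_OK, 0, NULL };
    if (den == 0) {
        r.tag = RESULT_ERR;
        r.error = "division by zero";
    }
    else if (num == INT_MIN && den == -1) {
        r.tag = RESULT_ERR;
        r.error = "integer overflow";
    }
    else {
        r.value = num / den;
    }
    return r;
}

/* Like Rust's unwrap(): only valid on a successful result. */
int int_result_unwrap(int_result r)
{
    assert(r.tag == RESULT_OK);
    return r.value;
}
```

The caller has to look at the tag before using the value, so an error
cannot be left pending by accident.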

But that's not my plan. My plan is not to build a bright new world. My
plan is to take a "small step" towards a better API, to make PyPy more
efficient and to make it possible to write a new, more optimized
CPython.

I also plan to *iterate* on the API rather than freeze it. We simply
cannot jump to the perfect API at once. We need small steps, and we
need to make sure that we don't break too many C extensions at each
milestone. Maybe the new API should be versioned, like the Android NDK
for example.
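
One way such versioning might look, assuming an extension selects the
API level it was written against through a macro. NEWCAPI_VERSION and
the three functions are hypothetical; nothing here is an existing
CPython or NDK interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: an extension would define the level it targets before
   including the headers; this sketch defaults to level 2. */
#ifndef NEWCAPI_VERSION
#  define NEWCAPI_VERSION 2
#endif

/* Level 1: available since the first milestone. */
int api_length(const char *s);

#if NEWCAPI_VERSION < 2
/* Removed at level 2; extensions still targeting level 1 can see it. */
char api_unsafe_first_char(const char *s);
#endif

#if NEWCAPI_VERSION >= 2
/* Added at level 2 as the replacement: failure is reported explicitly. */
int api_first_char(const char *s, char *out);
#endif

int api_length(const char *s)
{
    int n = 0;
    assert(s != NULL);
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Returns 0 and stores the first character, or -1 for an empty string. */
int api_first_char(const char *s, char *out)
{
    if (s == NULL || out == NULL || s[0] == '\0') {
        return -1;
    }
    *out = s[0];
    return 0;
}
```

An extension that is not ready for the next milestone keeps its old
level and continues to build, which is what allows iterating in small
steps.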

Victor

