Salut Victor!
Victor Stinner wrote on 30.07.2018 at 02:36:
2018-07-29 23:40 GMT+02:00 Stefan Behnel:
From Cython's POV, exposing internals is a good thing that helps make extension modules faster.
I'm fine with Cython wanting to pay the burden of following C API changes to get best performance. But I would only allow Cython (and cffi) to use it, not all C extensions ;-)
Technically, I plan to keep the full API, which gives access to C structures and all the low-level stuff, available for specific use cases like Cython and debug tools. But at the end of my roadmap, it will be an opt-in option rather than the default.
Hopefully Cython exists to hide the ugly C API ;-)
I'll try my best. :)
*Hiding* internals would already break code, so I don't see the advantage over just *changing* the internals instead, but continuing to expose those new internals.
My main motivation to change the C API is to allow changing CPython. For example, I would like to experiment with a specialized implementation of the list type which would store small integers as a C array of int8_t, int16_t or int32_t to be more space efficient (I'm not sure that it would be faster, and I'm not really interested in having SIMD-like operations in the stdlib). Currently, PySequence_Fast_ITEMS() and exposing the PyListObject structure prevent experimenting with such an optimization, because PyListObject.ob_item leaks PyObject**. To be honest, I'm not sure that this specific optimization is worth it, but I like to give this example since it's easy to explain.
That specific feature already exists. It's called "array.array". And no, it's not going to be faster but slower in most cases, since it would actually require creating a new object on item access, which PyList can avoid. Cython users stumble over this problem all the time, when they type their code with C data types without noticing that the values are actually only used in an object context most of the time and end up getting boxed into Python int objects all over the place. And then they ask on the Cython mailing list why their code became slower although they statically typed everything.
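To make the boxing cost concrete, here's a sketch (not real CPython code, names invented) of what item access looks like for today's PyList versus a hypothetical int32_t-backed list:

    #include <Python.h>
    #include <stdint.h>

    /* PyList today: the object already exists, so handing it out is one INCREF. */
    static PyObject *
    plain_list_getitem(PyObject *list, Py_ssize_t i)
    {
        PyObject *item = PyList_GET_ITEM(list, i);  /* borrowed reference */
        Py_INCREF(item);                            /* no allocation needed */
        return item;
    }

    /* A hypothetical int32_t-backed list has to box the value into a new
     * Python int object on every single access. */
    static PyObject *
    int32_list_getitem(const int32_t *items, Py_ssize_t i)
    {
        return PyLong_FromLong(items[i]);  /* allocates, except for cached small ints */
    }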
The only cases where such a list/array will be faster is with Python's builtins, e.g. sum() or any(), once they get specialised for the array types, and potentially with third party extensions that make use of the denser memory layout.
But you can also already test and benchmark that by implementing the buffer interface for these builtins. Then, people can use "array.array" instead of PyList and get a speed gain. For lists of integers or floats, that is – which are actually best represented in NumPy arrays as soon as they get large and thus performance becomes interesting. But one of the reasons why the "array.array" type is little known is probably exactly the problem that its usage is inefficient compared to PyList in most cases.
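Just to sketch the idea of such a specialisation (this is not actual CPython code, only the shape of it): a builtin like sum() could detect an object that exposes a contiguous buffer of C ints, e.g. array.array('i'), and then read the dense memory directly instead of boxing every element.

    #include <Python.h>
    #include <string.h>

    static PyObject *
    sum_int_buffer(PyObject *obj)
    {
        Py_buffer view;
        if (PyObject_GetBuffer(obj, &view, PyBUF_FORMAT | PyBUF_C_CONTIGUOUS) < 0)
            return NULL;
        if (view.format == NULL || strcmp(view.format, "i") != 0) {
            PyBuffer_Release(&view);
            PyErr_SetString(PyExc_TypeError, "expected a buffer of C ints");
            return NULL;
        }
        long long total = 0;
        const int *data = (const int *)view.buf;
        Py_ssize_t n = view.len / (Py_ssize_t)sizeof(int);
        for (Py_ssize_t i = 0; i < n; i++)
            total += data[i];                  /* no Python object per item */
        PyBuffer_Release(&view);
        return PyLong_FromLongLong(total);     /* box the result only once */
    }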
I would like to *remove* PyDict_GetItem(),
I just checked how we use it in Cython and it's followed by an INCREF in almost all cases. Same for *WithError(). One (important) exception: the function argument parsing code, which collects borrowed references to what the argument tuple/dict contains, before unboxing them into the signature target types.
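The pattern in question looks roughly like this (a simplified sketch, not Cython's literal generated code, and the variable names are made up):

    PyObject *value = PyDict_GetItemWithError(kwds, key);  /* borrowed; NULL without exception if missing */
    if (value == NULL) {
        if (PyErr_Occurred())
            goto bad;
        /* key not found: fall back or raise */
    }
    else {
        Py_INCREF(value);  /* immediately turned into an owned reference */
        /* ... use value, eventually Py_DECREF(value) ... */
    }

So switching this to a lookup function that returns a strong reference directly would be a trivial change on our side.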
but maybe we can provide a 3rd party C library which would reimplement PyDict_GetItem() on top of the new PyDict_GetItemRef() function which returns a strong reference.
You can't really emulate borrowed references based on owned references. PyPy's cpyext tried that using weakrefs before, and it was both horribly slow and inherently unsafe. The main problem is that borrowed references do not have a lifetime, so whatever you do, you can never be sure when you can consider them invalidated. That doesn't mean you can't just pass a pointer to some owned reference around, but it does mean that you can't really do anything interesting with an object anymore (e.g. push it around in memory, or discard its object representation and only keep an unboxed internal representation) as soon as there is a borrowed reference to it.
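To make the problem concrete, here is roughly what such a shim would have to do. (Just a sketch: "PyDict_GetItemRef" is the proposed strong-reference lookup from above, not an existing CPython function, and the shim name is made up.)

    /* Returns a "borrowed" result, like PyDict_GetItem() does today. */
    static PyObject *
    compat_PyDict_GetItem(PyObject *dict, PyObject *key)
    {
        PyObject *value = PyDict_GetItemRef(dict, key);  /* owned reference, or NULL */
        Py_XDECREF(value);
        /* At this point, nothing guarantees that the dict still owns the object:
         * another thread, a __del__ or a concurrent dict mutation may have dropped
         * the last reference, so the caller can end up with a dangling pointer. */
        return value;
    }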
BTW, Cython has various places where it avoids asking for a (fastest in CPython) borrowed reference when it compiles in PyPy, and uses an owned reference instead. Can be done.
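Schematically, that looks something like this (simplified; CYTHON_COMPILING_IN_PYPY is the guard macro in the generated code, if I remember the name correctly):

    #if CYTHON_COMPILING_IN_PYPY
        PyObject *item = PySequence_GetItem(list, i);  /* new (owned) reference */
        if (item == NULL)
            goto bad;
        /* ... use item ... */
        Py_DECREF(item);
    #else
        PyObject *item = PyList_GET_ITEM(list, i);     /* borrowed, no refcounting at all */
        /* ... use item while the list keeps it alive ... */
    #endif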
From the point of view of Red Hat, a Linux vendor, having to support multiple Python versions is a pain, especially for QA testing. Currently, the compromise is to only provide one Python version per OS release. For example, Fedora 28 only supports Python 3.6 even though Python 3.7 has been released during Fedora 28's lifetime. For Fedora, in practice, that's fine, since there is a release every 6 months. Ubuntu LTS is supported for 5 years, so having an old Python version can be more annoying. And then there is RHEL, which is supported for 10 years (up to 15 years for extended support). On that scale, the Python release schedule doesn't fit well with RHEL support.
By "supported Python version", I not only mean the /usr/bin/pythonX.Y binary, but also packages for dozens of Python modules. Fedora 28 provides Python binaries for various Python versions (2.7, 2.7, 3.4, 3.5, 3.6, 3.7 if I recall correctly), but it has only python3-* modules for Python 3.6.
Supporting 2 Python versions, like 3.6 and 3.7, means doubling the size of the repository, but also doubling the tests for the QA team (each time a new package version is released, usually for bugfixes). What if you want to support 3 Python versions in parallel, if not more?
Thing is, I don't really see this change for Cython implemented modules. In order to pick up new features in newer CPython versions and to cater for C-API changes at C compile time, they'll always want to be built specifically against the CPython at hand.
... in the meanwhile, macOS is stuck at Python 2.7 :-) macOS users: how much do you like Python 2.7 in 2018?
This is one issue.
But a different issue. Eventually, there will be changes to the C-API that will strongly suggest or require a rebuild for the C code to adapt.
Another issue is the Python binary compiled in debug mode, known as python-dbg (or python-debug or python-debuginfo). Right now, it's mostly useless since Linux distributions don't provide two flavors of Python modules (release and debug modes): you have to manually recompile all the C extensions used by your application in debug mode. Good luck with installing build dependencies and handling compilation errors. Because of that, nobody uses the debug build, even though it's super useful for debugging C extensions. As a consequence, we (Python upstream, but also Linux vendors) get bug reports where a C extension crashed and we are unable to debug it (oh, gc.collect() crashed on an invalid object, deal with that!).
Granted. This could be improved by providing a "debug light" version of CPython that keeps the C-API unchanged but enables various assertions internally. The macros then wouldn't be quite as helpful for extensions anymore, but at least CPython itself could catch misuses of its C-API.
On a somewhat related note: Cython has a "refnanny" to check its own internal reference counting. It keeps track of all references and refcounting operations inside of a given function, and then reports any mismatches at function exit. Requires generating code into each function, so not something that you could easily deploy and switch on on user side, but certainly a nifty feature for extension authors.
Moreover, right now, it's unclear if the C API is designed for CPython internals or to be used by third-party code,
Agreed. It's a wild mix of both.
For the "debug light" idea, some macros could be changed to enable checks only for CPython core builds, but not when the header files are used by third party code.
if it should check all arguments or not. Some functions check a few arguments, some others don't. For the functions which do check arguments: you get a slowdown, even if your full application uses the C API properly. It's like running a kind of debug build in production. Would you deploy a C program compiled with assertions in production once you have checked that your application is bug-free? Why should we have to pay the price of this "debug mode" in a Python compiled in "release mode"?
Well, it's probably not that bad. These "assertions" wouldn't hurt much in PGO builds, where the C compiler knows from the profile that they never strike.
So, from my POV, I'd vote for
- allowing C-API changes in each X.Y release
What kind of changes do you want to make?
You already proposed some. There is a deprecation process for CPython. I think it would apply nicely also to the C-API.
- requiring a new binary wheel (or rebuild) for each X.Y release
It doesn't solve the issue of being stuck to one Python version per OS release.
I think that problem is solved by (mini-)conda and conda-forge.
Sorry for saying it that bluntly, but it's a distribution issue. Let distributors deal with it.
- providing a compatibility layer for "removed" C-API functionality
Above, I proposed to require a *library* for that.
Yeah, it would probably have to be a library to provide ABI backwards compatibility, not just compile time adaptation.
But you would only be able to use such a library with a Python runtime which remains fully compatible with Python 3.7. No specialized list for you in that case! That's the price of backward compatibility.
Fine with me, although I don't think this has to be entirely black or white (or green or pink, if you prefer). Even if a library is used to provide certain removed operations, an extension module could still be allowed to use certain newer features from CPython 4.x, say. Not something to waste time on, though. Extensions can be expected to be either updated completely or not at all.
This is also where I would like to allow having multiple Python "runtimes" per Python version:
- CPython compiled in release mode with backward compatibility: "python3"
- CPython compiled in debug mode: "python3-dbg"
- experimental CPython, maybe faster: "experimental_python3", for example with a specialized list type and therefore incompatible with PyDict_GetItem() and borrowed references
Weren't you the one who said that distributors already hate having to support a debug binary at all? :o)
- maybe add a warning to the docs of exposed internals that these are more likely to change than other parts of the C-API
Yes, we have to work on the C API documentation of CPython. Right now, I'm more at the first step on my roadmap:
"Step 1: Identify Bad C API and list functions that should be modified or even removed"
A next step would be to start documenting in the CPython documentation which APIs are "bad". Maybe start by adding something like a "provisional deprecation warning", but only in the documentation. Or a real deprecation, but only in the doc, if we manage to agree on APIs that should go away.
I agree that borrowed references are sometimes a bit annoying, but I also think that you would be surprised at what actually turns out to be a bad API (and what doesn't) once we have a different implementation for certain things in CPython, as long as the main goal is performance.
I'd also suggest to make Cython, pybind11 and cffi (maybe a few more) the preferred and official ways to extend and integrate with CPython, to keep those three up to date with all C-API changes, and to make it as easy as possible for users to build their code with them against new CPython releases.
I'm now really worried about new C extensions which already use "modern" solutions like Cython and cffi.
I hope you meant "*not* really worried". :)
My concern is the very long tail of C extensions which call the C API directly. I'm sure that we can enhance the C API somehow without breaking this long tail.
If you want a more radical proposal, I'd deprecate the C-API documentation, push people into not caring about the C-API themselves, and then concentrate on keeping the major code integration tools out there compatible and fast with whatever CPython can provide as "exposed internals".
Honestly, at this point, I'm open to any idea! But I'm not OK with "breaking the world". This plan is not going to work. Even though PyPy has been promoting cffi for years, the C API remains very popular and commonly used.
I'm not sure that deprecating the API or the documentation would help. In 2018, ten years after Python 3.0 was released, we are still discussing how to migrate old code bases away from Python 2, even though there are many tools doing "most" of the migration. I'm not even aware of tools to rewrite a C extension using Cython or cffi. If such a tool exists, why would anyone take the risk of a regression, since C extensions are currently working perfectly on CPython?
My problem is to find a solution to change the C API without forcing C extension authors to change their code "too much", maybe using new compatibility layers.
I agree that that seems the way to go.
Stefan