[capi-sig] Re: No C API if it's not doable in Python
On 2018-09-07 14:47, Victor Stinner wrote:
Inside CPython, in the core and builtin modules, micro-optimizations must be used: abusing borrowed references, macros, direct access to all fields of C structures, etc.
But here I'm talking about the public C API used by third party extensions.
Making a distinction between "inside CPython" and "third party extensions" is a bad idea. Such a distinction would be problematic in two ways:
- Complexity: we should not have two different C APIs. There should be just one, usable internally and externally.
- Performance: if an optimization is important for CPython, then it's also important for third-party extensions. We don't want third-party code to be inherently slower than CPython itself. (This reminds me of PEP 580, regarding several internal optimizations which aren't available to third-party classes.)
Jeroen.
On Fri, Sep 7, 2018 at 10:25 PM Jeroen Demeyer <J.Demeyer@ugent.be> wrote:
On 2018-09-07 14:47, Victor Stinner wrote:
Inside CPython, in the core and builtin modules, micro-optimizations must be used: abusing borrowed references, macros, direct access to all fields of C structures, etc.
But here I'm talking about the public C API used by third party extensions.
Making a distinction between "inside CPython" and "third party extensions" is a bad idea. Such a distinction would be problematic in two ways:
I can't agree.
- Complexity: we should not have two different C APIs. There should be just one, usable internally and externally.
For most of these private APIs, we don't have "two different APIs". We have just one private API.
Internal APIs (functions starting with _Py / macros starting with _PY) are just private functions in most cases. They are used only for calls across compilation units. If we make all internal APIs public, we can't keep backward compatibility.
For example, the gc module calls "_PyTuple_MaybeUntrack": https://github.com/python/cpython/blob/886483e2b9bbabf60ab769683269b873381dd...
It depends heavily on implementation details of the GC and of tuple. Making it public does not make sense.
- Performance: if an optimization is important for CPython, then it's also important for third-party extensions. We don't want third-party code to be inherently slower than CPython itself. (this reminds me of PEP 580, regarding several internal optimizations which aren't available to third-party classes)
Which extension type is as important as the builtin str, tuple, or int?
For example, the dict implementation has special casing for str keys. That's because (1) dict is heavily used for namespaces, and (2) the keys of a namespace are very likely to be str.
In some cases, special casing for builtin types is necessary.
In the case of FASTCALL, it is private because it is under development and no one thought it was stable enough to keep backward compatible in future versions.
I agree that we should make FASTCALL accessible from third parties in 3.8.
But I think we should have an "evolution era" for some important new APIs. Calling such an API from builtin modules is important for learning about it before making it public, just as FASTCALL evolved between 3.6 and 3.7.
Regards,
-- INADA Naoki <songofacandy@gmail.com>
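The str-key special casing mentioned above can be illustrated with a toy model. This is only a sketch of the general idea (pick a cheaper comparison when every key is known to be a string), not CPython's dict code. All names here (Key, Entry, table_lookup, and so on) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: these types are hypothetical and are NOT CPython's. */
typedef enum { KEY_STR, KEY_INT } KeyKind;

typedef struct {
    KeyKind kind;
    const char *str;   /* valid when kind == KEY_STR */
    long num;          /* valid when kind == KEY_INT */
} Key;

typedef struct {
    Key key;
    int value;
} Entry;

/* Generic comparison: has to dispatch on the key kind on every call. */
static int key_equal_generic(const Key *a, const Key *b)
{
    if (a->kind != b->kind)
        return 0;
    if (a->kind == KEY_STR)
        return strcmp(a->str, b->str) == 0;
    return a->num == b->num;
}

/* Specialized comparison: the caller guarantees both keys are strings.
   Pointer identity is tried first, which is cheap for shared strings. */
static int key_equal_str(const Key *a, const Key *b)
{
    assert(a->kind == KEY_STR && b->kind == KEY_STR);
    return a->str == b->str || strcmp(a->str, b->str) == 0;
}

/* Linear-scan lookup; returns a pointer to the value, or NULL if absent.
   all_str_keys tells the lookup that every stored key is a string. */
const int *table_lookup(const Entry *entries, size_t n,
                        int all_str_keys, const Key *key)
{
    int use_fast = all_str_keys && key->kind == KEY_STR;
    for (size_t i = 0; i < n; i++) {
        int eq = use_fast ? key_equal_str(&entries[i].key, key)
                          : key_equal_generic(&entries[i].key, key);
        if (eq)
            return &entries[i].value;
    }
    return NULL;
}
```

The point of the sketch is that the fast path depends on a fact about the container's contents, which is the kind of implementation detail that is hard to promise as a stable public API.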
INADA Naoki wrote on 07.09.2018 at 18:28:
For most of these private APIs, we don't have "two different APIs". We have just one private API.
Internal APIs (functions starting with _Py / macros starting with _PY) are just private functions in most cases.
Which is ok, right? It's obvious from their name that they are not public and thus subject to change at any time (well, probably just with every minor release, though, which should also be an acceptable restriction for CPython).
- Performance: if an optimization is important for CPython, then it's also important for third-party extensions. We don't want third-party code to be inherently slower than CPython itself. (this reminds me of PEP 580, regarding several internal optimizations which aren't available to third-party classes)
Which extension type is as important as the builtin str, tuple, or int?
For example, the dict implementation has special casing for str keys. That's because (1) dict is heavily used for namespaces, and (2) the keys of a namespace are very likely to be str.
And I would *love* to get access to that. Currently, it's a very hidden implementation detail that's impossible for extension modules to exploit. It's really annoying to have this optimisation (and, actually, the whole dict implementation) right in the code on the other side, but not be able to make direct use of it in Cython, e.g. for faster dict creation or iteration.
But I think we should have an "evolution era" for some important new APIs. Calling such an API from builtin modules is important for learning about it before making it public, just as FASTCALL evolved between 3.6 and 3.7.
What we do in Cython when exploiting some CPython internal feature is to a) implement the code that uses it twice to have a safe (and usually slower) fallback for PyPy and friends, and b) hide it behind a feature macro like "CYTHON_USE_UNICODE_INTERNALS" that users can simply set to "0" in their setup.py (or even CFLAGS) if it ever gets in the way for them.
Stefan
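The pattern described above (two implementations, selected by a feature macro that users can switch off) might look roughly like the following. This is a minimal sketch under stated assumptions, not Cython's generated code: the macro MYEXT_USE_INTERNALS, the ToyString type, and both functions are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature macro, modeled on the idea behind
   CYTHON_USE_UNICODE_INTERNALS. Building with -DMYEXT_USE_INTERNALS=0
   forces the portable path. */
#ifndef MYEXT_USE_INTERNALS
#define MYEXT_USE_INTERNALS 1
#endif

/* Toy string object, NOT a real CPython structure. */
typedef struct {
    size_t length;      /* cached length: stands in for an "internal" field */
    const char *data;   /* NUL-terminated contents */
} ToyString;

/* Portable accessor: stands in for a call through a public API. */
size_t toy_string_length_public(const ToyString *s)
{
    return strlen(s->data);
}

/* Same result on both paths; only the implementation differs. */
size_t myext_length(const ToyString *s)
{
    assert(s != NULL);
#if MYEXT_USE_INTERNALS
    return s->length;                     /* fast path: read the field directly */
#else
    return toy_string_length_public(s);   /* safe (and usually slower) fallback */
#endif
}
```

Because both paths are kept behind one function, the rest of the code does not change when the macro is set to 0, which is what makes the switch cheap for users.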
I added FASTCALL to Python 3.6. I chose to make it private on purpose: we didn't have enough feedback to know if the whole idea of FASTCALL was a good idea or not. It took one or two years to clean up the code and the exact API. METH_FASTCALL changed deeply in Python 3.7: it doesn't include keyword arguments anymore.
About Cython: the fact that it was private didn't prevent Cython from using METH_FASTCALL since Python 3.6, and so Cython was broken in Python 3.7. Well, it's a deliberate choice of Cython, and I'm fine with that. I'm also interested in helping to keep Cython up to date :-)
Victor
capi-sig mailing list -- capi-sig@python.org To unsubscribe send an email to capi-sig-leave@python.org
Victor Stinner wrote on 07.09.2018 at 19:08:
I added FASTCALL to Python 3.6. I chose to make it private on purpose: we didn't have enough feedback to know if the whole idea of FASTCALL was a good idea or not. It took one or two years to clean up the code and the exact API. METH_FASTCALL changed deeply in Python 3.7: it doesn't include keyword arguments anymore.
About Cython: the fact that it was private didn't prevent Cython from using METH_FASTCALL since Python 3.6, and so Cython was broken in Python 3.7. Well, it's a deliberate choice of Cython, and I'm fine with that.
It wasn't quite as deliberate as that. We have our own copy of PyObject_Call() which we use internally, and thus the new flag *had* to be supported there. We're not actually using FASTCALL for our own generated functions yet (which would be the deliberate choice).
I'm also interested in helping to keep Cython up to date :-)
So am I. :)
Stefan
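For readers unfamiliar with what a FASTCALL-style convention changes, here is a toy model of the general idea: positional arguments are handed to the callee as a C array plus a count, instead of being packed into a freshly allocated container first. It is a sketch only. ToyTuple and all function names are hypothetical, and nothing here uses or reproduces CPython's actual types or signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for an argument tuple, NOT a CPython type. */
typedef struct {
    size_t size;
    long *items;
} ToyTuple;

/* Container-style callee: arguments arrive packed in a ToyTuple. */
static long sum_varargs(const ToyTuple *args)
{
    long total = 0;
    for (size_t i = 0; i < args->size; i++)
        total += args->items[i];
    return total;
}

/* Fastcall-style callee: arguments arrive as a C array plus a count. */
static long sum_fastcall(const long *args, size_t nargs)
{
    long total = 0;
    for (size_t i = 0; i < nargs; i++)
        total += args[i];
    return total;
}

/* Caller for the container convention: it must allocate, fill, and
   free a temporary container. Returns 0 on success, -1 on failure. */
int call_with_tuple(const long *stack, size_t nargs, long *result)
{
    ToyTuple t;
    assert(result != NULL);
    t.size = nargs;
    t.items = malloc((nargs ? nargs : 1) * sizeof *t.items);
    if (t.items == NULL)
        return -1;
    for (size_t i = 0; i < nargs; i++)
        t.items[i] = stack[i];
    *result = sum_varargs(&t);
    free(t.items);
    return 0;
}

/* Caller for the fastcall convention: it passes its own array through
   unchanged, so there is no temporary allocation. */
int call_fast(const long *stack, size_t nargs, long *result)
{
    assert(result != NULL);
    *result = sum_fastcall(stack, nargs);
    return 0;
}
```

Both callers compute the same result. The difference is the temporary allocation, and the sketch also shows why changing what the callee receives (for example, whether keyword arguments are part of the signature) breaks every caller written against the earlier shape.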
On Fri, 7 Sep 2018 at 10:12 Stefan Behnel <python_capi@behnel.de> wrote:
INADA Naoki wrote on 07.09.2018 at 18:28:
For most of these private APIs, we don't have "two different APIs". We have just one private API.
Internal APIs (functions starting with _Py / macros starting with _PY) are just private functions in most cases.
Which is ok, right? It's obvious from their name that they are not public and thus subject to change at any time (well, probably just with every minor release, though, which should also be an acceptable restriction for CPython).
- Performance: if an optimization is important for CPython, then it's also important for third-party extensions. We don't want third-party code to be inherently slower than CPython itself. (this reminds me of PEP 580, regarding several internal optimizations which aren't available to third-party classes)
Which extension type is as important as the builtin str, tuple, or int?
For example, the dict implementation has special casing for str keys. That's because (1) dict is heavily used for namespaces, and (2) the keys of a namespace are very likely to be str.
And I would *love* to get access to that. Currently, it's a very hidden implementation detail that's impossible for extension modules to exploit. It's really annoying to have this optimisation (and, actually, the whole dict implementation) right in the code on the other side, but not be able to make direct use of it in Cython, e.g. for faster dict creation or iteration.
If Victor's re-organization occurs then it should be possible to open this up. But unless we can start defining clear separations between stable, public, and private then we can't expose this without a serious risk of people messing up and accidentally depending on this (hence why Jeroen's comment that "there should be just one" C API won't get you what you want ;) . But if we do get this then we could discuss making the private API more of a private/Cython API and expose more internals that you would like to have access to.
-Brett
On Sep 7, 2018, at 11:15, Brett Cannon <brett@python.org> wrote:
If Victor's re-organization occurs then it should be possible to open this up. But unless we can start defining clear separations between stable, public, and private then we can't expose this without a serious risk of people messing up and accidentally depending on this (hence why Jeroen's comment that "there should be just one" C API won't get you what you want ;) . But if we do get this then we could discuss making the private API more of a private/Cython API and expose more internals that you would like to have access to.
Regardless, we should do a better job of documenting the private API. When you develop CPython, it’s really handy to have the docs rather than having to dive into the code to remember the semantics of a particular private API.
-Barry
Participants (6): Barry Warsaw, Brett Cannon, INADA Naoki, Jeroen Demeyer, Stefan Behnel, Victor Stinner