Is PyAPI_FUNC() needed for private APIs?

Hi, all.
Some functions in CPython are private. These APIs are called only from the python executable or DLL. They are not called even from extension modules in the stdlib.
In this case, I want to keep _PyObject_GetMethod private. https://github.com/python/cpython/pull/14015#discussion_r293351734
As far as I know, PyAPI_FUNC is used to expose a function to users of the DLL on Windows. So I think that when a private function is not intended to be called from outside the DLL, it should not use PyAPI_FUNC. Am I correct?
Currently, many private APIs use `PyAPI_FUNC()`. Is there any downside to having many unnecessary exported functions? (e.g. a slower calling convention is used, link-time optimization is hindered, LoadLibrary gets slower, etc.)
Regards,

Hi,
On 13/06/2019 at 15:05, Inada Naoki wrote:
Some functions in CPython are private. These APIs are called only from the python executable or DLL. They are not called even from extension modules in the stdlib.
Python 3.8 now has a better separation between "private" and "internal" APIs:
* private are "_Py" symbols which are exported in the python DLL: they should not be used, but can be used technically
* internal are symbols defined in internal header files (Include/internal/). Some symbols use PyAPI_FUNC()/PyAPI_DATA(), some only use "extern".
I'm in favor of moving towards "extern" for new internal APIs. I'm trying to keep PyAPI_FUNC/PyAPI_DATA to export symbols in DLL for things which might be useful for 3rd party tools like debuggers or profilers.
Currently, many private APIs use `PyAPI_FUNC()`.
Well, that's mostly for historical reasons :-)
Is there any downside to having many unnecessary exported functions?
The main risk is that people start to use them, expect these APIs to be there forever, and might be surprised when their code fails with a newer Python.
My plan for http://pythoncapi.readthedocs.io/ is to *reduce* the size of the C API and hide as many implementation details as possible. Exporting a new function that doesn't make sense outside Python internals goes against this plan.
--
IMHO private functions (e.g. those in Include/cpython/ headers) must continue to use PyAPI_FUNC/PyAPI_DATA.
I converted some *private* functions to internal functions, but I didn't always replace PyAPI_FUNC with extern.
For private functions, the contract is that they can change whenever: there is no warranty (and they must not be used ;-)).
Victor

On 13Jun2019 0643, Jeroen Demeyer wrote:
On 2019-06-13 15:36, Victor Stinner wrote:
The main risk is that people start to use it
If people use it, it should be taken as a sign that the function is useful and deserves to be public API.
If it's useful, then someone can write a justification (possibly a PEP) and we can agree to add it.
More likely it's just convenient. The cost of that convenience is that we can never optimise internals because they are now public API. Better off never to have leaked the details.
Cheers, Steve

On 2019-06-13 17:11, Steve Dower wrote:
The cost of that convenience is that we can never optimise internals because they are now public API.
I think that the opposite is true actually: the reason that people access internals is because there is no public API doing what they want. Having more public API should *reduce* the need for accessing internals.
For example, _PyObject_GetMethod is not public API but it's useful functionality. So Cython is forced to reinvent _PyObject_GetMethod (i.e. copy verbatim that function from the CPython sources), which requires accessing internals.

On Fri, Jun 14, 2019 at 12:29 AM Jeroen Demeyer J.Demeyer@ugent.be wrote:
I think that the opposite is true actually: the reason that people access internals is because there is no public API doing what they want. Having more public API should *reduce* the need for accessing internals.
For example, _PyObject_GetMethod is not public API but it's useful functionality. So Cython is forced to reinvent _PyObject_GetMethod (i.e. copy verbatim that function from the CPython sources), which requires accessing internals.
This became off topic... but.
In the case of _PyObject_GetMethod, I agree that it means we don't provide *some* useful API. But it doesn't mean _PyObject_GetMethod is the missing useful API.
We don't provide a method-calling API which uses the same optimization as LOAD_METHOD. It might look like this:
    /* methname is Unicode, nargs > 0, and args[0] is self. */
    PyObject_VectorCallMethod(PyObject *methname, PyObject **args, Py_ssize_t nargs, PyObject *kwds)
(Would you try adding this? Or may I?)
Anyway, do you think _PyObject_GetMethod is useful even after we provide PyObject_VectorCallMethod?
I'd like to put _PyObject_GetMethod in private/ (with PyAPI_FUNC) in 3.9, for users like Cython.
If you want to make it public in 3.9, please create a new thread. Let's discuss how it is useful, and whether the current name and signature are good enough to make it public.
Regards,

On 2019-06-13 18:03, Inada Naoki wrote:
We don't provide a method-calling API which uses the same optimization as LOAD_METHOD. It might look like this:
    /* methname is Unicode, nargs > 0, and args[0] is self. */
    PyObject_VectorCallMethod(PyObject *methname, PyObject **args, Py_ssize_t nargs, PyObject *kwds)
I agree that this would be useful. Minor nitpick: we spell "Vectorcall" with a lower-case "c".
There should also be a _Py_Identifier variant _PyObject_VectorcallMethodId
The implementation should be like vectorcall_method from Objects/typeobject.c except that _PyObject_GetMethod should be used instead of lookup_method() (the difference is that the code for special methods like __add__ only looks at the attributes of the type, not the instance).
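To make the lookup difference concrete, here is a toy sketch in plain C. It does not use the CPython API: every `toy_*` name is hypothetical, and the return convention (1 = unbound method from the type, 0 = attribute stored on the instance, -1 = not found) is an assumption of this sketch, only loosely modelled on the idea behind _PyObject_GetMethod. The point is that the caller learns whether it must pass self explicitly, so no temporary bound-method object is needed.

```c
#include <stddef.h>
#include <string.h>

/* Toy object model. All names are hypothetical, not CPython's. */
typedef struct toy_object toy_object;
typedef int (*toy_unbound_fn)(toy_object *self, int arg); /* needs self */
typedef int (*toy_plain_fn)(int arg);                     /* no self    */

typedef struct {
    const char *name;
    toy_unbound_fn func;
} toy_method;

typedef struct {
    const toy_method *methods;
    size_t n_methods;
} toy_type;

struct toy_object {
    const toy_type *type;
    const char *attr_name;   /* at most one instance attribute, for brevity */
    toy_plain_fn attr_func;  /* callable stored on the instance itself */
    int value;
};

typedef struct {
    toy_unbound_fn unbound;  /* valid when the lookup returned 1 */
    toy_plain_fn plain;      /* valid when the lookup returned 0 */
} toy_lookup;

/* Instance attributes shadow type methods, as they do for ordinary
 * (non-special) method lookup. */
int toy_get_method(const toy_object *obj, const char *name, toy_lookup *out)
{
    if (obj->attr_name != NULL && strcmp(obj->attr_name, name) == 0) {
        out->plain = obj->attr_func;
        return 0;
    }
    for (size_t i = 0; i < obj->type->n_methods; i++) {
        if (strcmp(obj->type->methods[i].name, name) == 0) {
            out->unbound = obj->type->methods[i].func;
            return 1;
        }
    }
    return -1;
}

/* Call obj.name(arg) without creating a bound-method object. */
int toy_call_method(toy_object *self, const char *name, int arg, int *result)
{
    toy_lookup found;
    int kind = toy_get_method(self, name, &found);
    if (kind < 0) {
        return -1;
    }
    *result = (kind == 1) ? found.unbound(self, arg) : found.plain(arg);
    return 0;
}

/* Two sample callables used to exercise the sketch. */
int toy_add_value(toy_object *self, int arg) { return self->value + arg; }
int toy_negate(int arg) { return -arg; }
```

A lookup that only consulted the type (as described above for special methods like __add__) would skip the first `if` in `toy_get_method` and never see the instance attribute.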
(Would you try adding this? Or may I?)
Of course you may. Just let me know if you're working on it.

On 13Jun2019 0816, Jeroen Demeyer wrote:
On 2019-06-13 17:11, Steve Dower wrote:
The cost of that convenience is that we can never optimise internals because they are now public API.
I think that the opposite is true actually: the reason that people access internals is because there is no public API doing what they want. Having more public API should *reduce* the need for accessing internals.
Right, but we need to know what API that is. We can't just make everything public by default.
For example, _PyObject_GetMethod is not public API but it's useful functionality. So Cython is forced to reinvent _PyObject_GetMethod (i.e. copy verbatim that function from the CPython sources), which requires accessing internals.
What's wrong with using PyObject_GetAttr() and then doing PyType_HasFeature(Py_TYPE(descr), Py_TPFLAGS_METHOD_DESCRIPTOR) on the result?
More importantly, why do you need to know that it's a method descriptor and not just a callable object that can be accessed via an attribute? And is this something that's generally needed, or is Cython just special (I expect Cython to be special in many situations, which is why we have a tiered API and expect Cython-generated code to be regenerated for different CPython versions).
Once we start discussing actual API needs, we can get to real designs. Simply making everything public by default does not improve any designs.
Cheers, Steve

On Fri., 14 Jun. 2019, 2:05 am Steve Dower, steve.dower@python.org wrote:
On 13Jun2019 0816, Jeroen Demeyer wrote:
On 2019-06-13 17:11, Steve Dower wrote:
The cost of that convenience is that we can never optimise internals because they are now public API.
I think that the opposite is true actually: the reason that people access internals is because there is no public API doing what they want. Having more public API should *reduce* the need for accessing internals.
Right, but we need to know what API that is. We can't just make everything public by default.
For example, _PyObject_GetMethod is not public API but it's useful functionality. So Cython is forced to reinvent _PyObject_GetMethod (i.e. copy verbatim that function from the CPython sources), which requires accessing internals.
What's wrong with using PyObject_GetAttr() and then doing PyType_HasFeature(Py_TYPE(descr), Py_TPFLAGS_METHOD_DESCRIPTOR) on the result?
More importantly, why do you need to know that it's a method descriptor and not just a callable object that can be accessed via an attribute? And is this something that's generally needed, or is Cython just special (I expect Cython to be special in many situations, which is why we have a tiered API and expect Cython-generated code to be regenerated for different CPython versions).
We don't expect most Cython code to be regenerated for different versions - we only expect it to be recompiled, as with any other extension.
Hence Jeroen's point: if something is useful enough for Cython to want to use it, it makes sense to provide a public API for it that hides any internal implementation details that may not remain stable across releases.
Cheers, Nick.

On Sun, Jun 30, 2019 at 12:26 AM Nick Coghlan ncoghlan@gmail.com wrote:
Hence Jeroen's point: if something is useful enough for Cython to want to use it, it makes sense to provide a public API for it that hides any internal implementation details that may not remain stable across releases.
I only wanted to discuss when PyAPI_FUNC() is needed, not which functions should be public.
But FYI, we have already moved _PyObject_GetMethod to the private cpython API.
We don't expect most Cython code to be regenerated for different versions - we only expect it to be recompiled, as with any other extension.
We keep some unstable APIs non-public to avoid breaking packages. But it seems Cython chooses performance over stable source code. It seems "regenerate the source code when it breaks" is Cython's policy. It is out of our control.
For example, the FastCall APIs were not public because we didn't think they were stable yet, and they could change in future versions. But Cython used them. Luckily, not much has broken. But that is just luck.
I hope Cython provides an option to produce more stable source code, for projects that distribute generated C source code instead of a binary wheel or Cython source code.
Regards,

On 02/07/2019 at 06:35, Inada Naoki wrote:
I only wanted to discuss when PyAPI_FUNC() is needed, not which functions should be public.
On Unix, PyAPI_FUNC() or extern is basically the same currently:
    #define PyAPI_FUNC(RTYPE) RTYPE
    #define PyAPI_DATA(RTYPE) extern RTYPE
On Windows, PyAPI_FUNC() exports the symbol from the Python DLL ("dllexport"), whereas by default symbols are not exported (they cannot be used outside the Python binary). Without PyAPI_FUNC(), a symbol cannot be used from outside the DLL on Windows. Macros when building Python:
    #define PyAPI_FUNC(RTYPE) __declspec(dllexport) RTYPE
    #define PyAPI_DATA(RTYPE) extern __declspec(dllexport) RTYPE
Macros when using Python headers:
    #define PyAPI_FUNC(RTYPE) __declspec(dllimport) RTYPE
    #define PyAPI_DATA(RTYPE) extern __declspec(dllimport) RTYPE
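As a rough illustration of how these cases fit together in one header, here is a minimal sketch. It is not the actual CPython header: `DEMO_API_FUNC`, `DEMO_BUILD_CORE` and the `demo_*` functions are hypothetical names invented for this example, and the build-time switch is simply defined at the top so the snippet compiles on its own.

```c
/* Hypothetical stand-in for PyAPI_FUNC; none of these names come from
 * CPython. DEMO_BUILD_CORE pretends we are compiling the library itself. */
#define DEMO_BUILD_CORE 1

#if defined(_WIN32)
#  if defined(DEMO_BUILD_CORE)
     /* Building the DLL: export the symbol. */
#    define DEMO_API_FUNC(RTYPE) __declspec(dllexport) RTYPE
#  else
     /* Using the DLL from other code: import the symbol. */
#    define DEMO_API_FUNC(RTYPE) __declspec(dllimport) RTYPE
#  endif
#else
   /* Unix: the macro adds nothing to the declaration. */
#  define DEMO_API_FUNC(RTYPE) RTYPE
#endif

/* Exported: callable from outside the DLL on Windows. */
DEMO_API_FUNC(int) demo_exported_add(int a, int b);

/* Internal: plain extern, so not exported from a Windows DLL. */
extern int demo_internal_twice(int x);

int demo_internal_twice(int x)
{
    return 2 * x;
}

DEMO_API_FUNC(int) demo_exported_add(int a, int b)
{
    return a + b;
}
```

On Unix both declarations end up equivalent, which matches the "basically the same" remark above; only the Windows branch makes a difference.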
Victor

On Thu, Jun 13, 2019 at 10:37 PM Victor Stinner vstinner@redhat.com wrote:
Python 3.8 now has a better separation between "private" and "internal" APIs:
- private are "_Py" symbols which are exported in the python DLL: they should not be used, but can be used technically
- internal are symbols defined in internal header files (Include/internal/). Some symbols use PyAPI_FUNC()/PyAPI_DATA(), some only use "extern".
Thank you for clarifying. I was always confused about it.
I'm in favor of moving towards "extern" for new internal APIs. I'm trying to keep PyAPI_FUNC/PyAPI_DATA to export symbols in DLL for things which might be useful for 3rd party tools like debuggers or profilers.
Hmm, do debuggers or profilers need this? I thought only loaders and DLL explorers use this.
Currently, many private APIs use `PyAPI_FUNC()`.
Well, that's mostly for historical reasons :-)
OK, I see.
Is there any downside to having many unnecessary exported functions?
The main risk is that people start to use them, expect these APIs to be there forever, and might be surprised when their code fails with a newer Python.
I know, and I agree with both you and Jeroen.
But I am concerned about performance, stack memory usage, and binary size. From some quick googling, I found some answers.
https://docs.microsoft.com/en-us/cpp/cpp/dllexport-dllimport?view=vs-2019
It seems dllexport doesn't affect the calling convention.
https://clang.llvm.org/docs/LTOVisibility.html https://devblogs.microsoft.com/oldnewthing/?p=2123
It seems dllexport affects the linker. At least, the linker cannot remove a dllexport-ed function even if the function is not called anywhere in the DLL.
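A small sketch of what that means in practice, assuming a typical optimizing toolchain. The names `DEMO_EXPORT`, `demo_square` and `demo_sum_of_squares` are hypothetical, and the GCC/Clang visibility attribute is used here only as the closest non-Windows analogue, not as what CPython does.

```c
/* Hypothetical names throughout; not taken from CPython. */
#if defined(_WIN32)
#  define DEMO_EXPORT __declspec(dllexport)
#elif defined(__GNUC__)
#  define DEMO_EXPORT __attribute__((visibility("default")))
#else
#  define DEMO_EXPORT
#endif

/* Internal linkage: the compiler is free to inline this into its callers
 * and emit no standalone copy at all. */
static int demo_square(int x)
{
    return x * x;
}

/* Exported: the symbol has to stay in the binary even if nothing inside
 * the library ever calls it, because outside code might. */
DEMO_EXPORT int demo_sum_of_squares(int a, int b)
{
    return demo_square(a) + demo_square(b);
}
```

So each unnecessary export is at least a function body and an export-table entry that the toolchain has to keep.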
Regards,

On Thu, 13 Jun 2019 22:05:01 +0900 Inada Naoki songofacandy@gmail.com wrote:
Currently, many private APIs use `PyAPI_FUNC()`.
It's easier to use PyAPI_FUNC() everywhere than forget it in some places and then bother Windows users.
Private APIs are sometimes used in third-party modules if they want access to higher-performance or specific APIs.
Regards
Antoine.
Participants (6): Antoine Pitrou, Inada Naoki, Jeroen Demeyer, Nick Coghlan, Steve Dower, Victor Stinner