PyObject_CallOneArg(), ...TwoArgs()
Hello all,
Victor Stinner recently added the function PyObject_CallNoArgs() for calling an object without any arguments, see https://docs.python.org/3.9/c-api/object.html#c.PyObject_CallNoArgs
The next obvious question is: should we have PyObject_CallOneArg(), PyObject_CallTwoArgs()? Of course, this cannot continue forever, so I suggest adding the 1 and 2 argument variants which would cover most use cases. Cython has something like that and it becomes very natural to use these functions.
The main advantage is that we can implement these variants much more efficiently than the existing PyObject_CallFunction() or PyObject_CallFunctionObjArgs().
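As a rough illustration of why a fixed-arity helper can be cheap, here is a minimal sketch in plain C. It does not use the CPython headers: `Obj`, `toy_func`, `toy_fastcall()`, `toy_call_one_arg()` and `toy_identity()` are hypothetical stand-ins, with `toy_fastcall()` playing the role of an array-plus-count call. The only point is that a one-argument wrapper needs no format string and no scan of a variable argument list.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT CPython types. */
typedef struct Obj { long value; } Obj;
typedef Obj *(*toy_func)(Obj *const *args, size_t nargs);

/* Array-plus-count call: the callee is told how many arguments it gets. */
static Obj *toy_fastcall(toy_func func, Obj *const *args, size_t nargs)
{
    assert(func != NULL);
    return func(args, nargs);
}

/* Hypothetical one-argument helper: a one-element array on the C stack. */
static inline Obj *toy_call_one_arg(toy_func func, Obj *arg)
{
    Obj *args[1] = {arg};
    return toy_fastcall(func, args, 1);
}

/* Example callee: returns its single argument, or NULL on a wrong count. */
static Obj *toy_identity(Obj *const *args, size_t nargs)
{
    return nargs == 1 ? args[0] : NULL;
}
```

Whether the real equivalent is measurably faster inside CPython is a separate question that only a benchmark can answer.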
I don't think that it's a good idea. It would require too many functions for all combinations. Like passing an argument as keyword, handling methods, etc. Why only 2? Why not up to 5? :-)
IMHO calling a function without any arguments is common enough to justify PyObject_CallNoArgs(). And it's a single function.
If you want efficient function calls, use Cython which should use the most efficient available API ;-)
Victor
On Tue, 18 Jun 2019 at 15:26, Jeroen Demeyer <J.Demeyer@ugent.be> wrote:
Hello all,
Victor Stinner recently added the function PyObject_CallNoArgs() for calling an object without any arguments, see https://docs.python.org/3.9/c-api/object.html#c.PyObject_CallNoArgs
The next obvious question is: should we have PyObject_CallOneArg(), PyObject_CallTwoArgs()? Of course, this cannot continue forever, so I suggest adding the 1 and 2 argument variants which would cover most use cases. Cython has something like that and it becomes very natural to use these functions.
The main advantage is that we can implement these variants much more efficiently than the existing PyObject_CallFunction() or PyObject_CallFunctionObjArgs().
capi-sig mailing list -- capi-sig@python.org To unsubscribe send an email to capi-sig-leave@python.org
-- Night gathers, and now my watch begins. It shall not end until my death.
On 2019-06-18 15:42, Victor Stinner wrote:
I don't think that it's a good idea. It would require too many functions for all combinations. Like passing an argument as keyword, handling methods, etc. Why only 2? Why not up to 5? :-)
By that argument, we shouldn't have added PyObject_CallNoArgs() either. I think that supporting OneArgs and TwoArgs is a reasonable compromise.
And yes, I also want to do the same thing for methods (PyObject_CallMethodObjNoArgs, ...).
If you want efficient function calls, use Cython which should use the most efficient available API ;-)
I'm talking about code internal to CPython, where using Cython is unfortunately not an option.
On Tue, Jun 18, 2019 at 7:19 AM Jeroen Demeyer <J.Demeyer@ugent.be> wrote:
On 2019-06-18 15:42, Victor Stinner wrote:
I don't think that it's a good idea. It would require too many functions for all combinations. Like passing an argument as keyword, handling methods, etc. Why only 2? Why not up to 5? :-)
By that argument, we shouldn't have added PyObject_CallNoArgs() either. I think that supporting OneArgs and TwoArgs is a reasonable compromise.
I think Victor's point is why 1 positional argument and two positional arguments? Why not 1 keyword argument? Or 1 positional and 1 keyword argument?
How hard would it be to analyze the cpython repo and see how common the various ways of calling code are? That way we can pragmatically just see what would be the most useful and not guess (my gut backs up Jeroen, although I would name it PyObject_OnePositionalArg() or something to be more explicit).
-Brett
And yes, I also want to do the same thing for methods (PyObject_CallMethodObjNoArgs, ...).
If you want efficient function calls, use Cython which should use the most efficient available API ;-)
I'm talking about code internal to CPython, where using Cython is unfortunately not an option.
While its API is not perfect (need to declare a local array of arguments), _PyObject_FastCall() is efficient for an arbitrary number of positional arguments.
Victor
On Wed, Jun 19, 2019 at 6:47 AM Victor Stinner <vstinner@redhat.com> wrote:
While its API is not perfect (need to declare a local array of arguments), _PyObject_FastCall() is efficient for an arbitrary number of positional arguments.
_PyObject_CallVaArgs(..., NULL) with variable number of positional arguments, null terminated?
My experience with mixing Python and C is that I never want to use keyword arguments in C code calling Python. At this low level I don't think it is imposing on the programmer to know exactly what args they need to use. Or maybe it's just the C mindset.
I can imagine a method/function implemented in C that needs to pass through keyword arguments to another method or function, but that would be as a single dict.
--
cheers,
Hugh Fisher
On Wed, 19 Jun 2019 at 00:11, Hugh Fisher <hugo.fisher@gmail.com> wrote:
_PyObject_CallVaArgs(..., NULL) with variable number of positional arguments, null terminated?
Such a function already exists: PyObject_CallFunctionObjArgs().
PyObject_CallNoArgs() was justified by a lower usage of the C stack memory. PyObject_CallFunctionObjArgs() uses more stack than PyObject_CallNoArgs(), since the number of arguments is unknown: you have to scan "vargs" for NULL terminator.
PyObject_CallNoArgs() doesn't have this issue: it always has zero arguments. Neither does _PyObject_FastCall(): the number of arguments is passed... as an argument ;-)
Example:
PyObject* args[] = {arg1, arg2};
res = _PyObject_FastCall(func, args, Py_ARRAY_LENGTH(args));
There are many ways to write the same code:
PyObject* args[2] = {arg1, arg2};
res = _PyObject_FastCall(func, args, 2);
Victor
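The difference between the two calling styles can be sketched in plain C without the CPython headers. Everything below is a hypothetical toy model (`Obj`, `toy_sum()`, `toy_call_objargs()`, `TOY_MAX_ARGS`), not CPython code: the NULL-terminated variant has to walk the variable arguments to discover the count and copy them into a buffer, while the array variant receives the count directly.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Toy stand-in for illustration only; this is NOT a CPython type. */
typedef struct Obj { long value; } Obj;

#define TOY_MAX_ARGS 8  /* arbitrary buffer size chosen for this sketch */

/* Array-plus-count callee: the number of arguments is known up front. */
static long toy_sum(Obj *const *args, size_t nargs)
{
    long total = 0;
    for (size_t i = 0; i < nargs; i++) {
        total += args[i]->value;
    }
    return total;
}

/* NULL-terminated style: scan the va_list until the terminator is found,
 * copying each argument into a stack buffer before making the call. */
static long toy_call_objargs(Obj *first, ...)
{
    Obj *stack[TOY_MAX_ARGS];
    size_t nargs = 0;
    va_list va;

    va_start(va, first);
    for (Obj *a = first; a != NULL && nargs < TOY_MAX_ARGS;
         a = va_arg(va, Obj *)) {
        stack[nargs++] = a;
    }
    va_end(va);
    return toy_sum(stack, nargs);
}
```

In this toy, a caller writes `toy_call_objargs(&a, &b, (Obj *)NULL)` for the first style and `toy_sum(args, 2)` for the second.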
On 2019-06-19 00:59, Victor Stinner wrote:
Example:
PyObject* args[] = {arg1, arg2};
res = _PyObject_FastCall(func, args, Py_ARRAY_LENGTH(args));
There are many ways to write the same code:
PyObject* args[2] = {arg1, arg2};
res = _PyObject_FastCall(func, args, 2);
Right, but it's slightly inconvenient because you need to define the array separately. The most common way in the CPython sources to call a function with known positional arguments is PyObject_CallFunctionObjArgs(), which is less efficient. Part of the reason why PyObject_CallFunctionObjArgs() is used is simply because it's easier to use than _PyObject_FastCall().
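One way to picture the convenience gap is a macro that builds the argument array in place with a C99 compound literal. This is only a sketch under assumptions: nobody in this thread proposed it, and `Obj`, `toy_fastcall()` and `TOY_CALL()` are hypothetical names in a toy model that does not use the CPython headers.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for illustration only; this is NOT a CPython type. */
typedef struct Obj { long value; } Obj;

/* Array-plus-count callee, summing the argument values. */
static long toy_fastcall(Obj *const *args, size_t nargs)
{
    long total = 0;
    assert(args != NULL || nargs == 0);
    for (size_t i = 0; i < nargs; i++) {
        total += args[i]->value;
    }
    return total;
}

/* Hypothetical convenience macro: the compound literal creates the array
 * at the call site, and sizeof computes the count at compile time.
 * The arguments inside sizeof are not evaluated. Requires at least one
 * argument. */
#define TOY_CALL(...) \
    toy_fastcall((Obj *const[]){__VA_ARGS__}, \
                 sizeof((Obj *const[]){__VA_ARGS__}) / sizeof(Obj *))
```

With this, `TOY_CALL(&a, &b)` reads like a varargs call but still passes an explicit count, so no NULL terminator or scan is needed.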
On 2019-06-18 21:11, Brett Cannon wrote:
How hard would it be to analyze the cpython repo and see how common the various ways of calling code are?
I did a quick count of calls being done using PyObject_CallFunction(), PyObject_CallFunctionObjArgs() and _PyObject_FastCall():
- There are 94 calls with 1 positional argument.
- There are 20 calls with 2 positional arguments.
- There are 11 calls with 3 positional arguments.
- There are 4 calls with 4 or more positional arguments.
Personally, I think that this is sufficient to justify the addition of the extra convenience functions for 1, 2 and maybe even 3 positional arguments.
On Wed, 19 Jun 2019 at 10:39, Jeroen Demeyer <J.Demeyer@ugent.be> wrote:
I did a quick count of calls being done using PyObject_CallFunction(), PyObject_CallFunctionObjArgs() and _PyObject_FastCall():
- There are 94 calls with 1 positional argument.
- There are 20 calls with 2 positional arguments.
- There are 11 calls with 3 positional arguments.
- There are 4 calls with 4 or more positional arguments.
Personally, I think that this is sufficient to justify the addition of the extra convenience functions for 1, 2 and maybe even 3 positional arguments.
Every single new function added to the C API imposes a large maintenance cost on the whole Python community. First, CPython will have to maintain it. Then other Python implementations like PyPy will have to implement it and maintain it as well. The current trend is to *reduce* the size of the C API rather than making it larger. http://pythoncapi.readthedocs.io/
*But* performance matters too. *Maybe* we can experiment with _PyObject_CallOneArg(func, arg): check if it's faster or not and try to measure the stack consumption.
I don't think that we should add functions for 2 arguments or more. It's too rare and would cause too much maintenance burden compared to the benefit.
Victor
On 2019-06-19 11:43, Victor Stinner wrote:
The current trend is to *reduce* the size of the C API rather than making it larger. http://pythoncapi.readthedocs.io/
You speak of "the C API" but what do you mean really? I know that you care mostly about the limited API/stable ABI, but I personally care mostly about the default API (when neither Py_LIMITED_API nor Py_BUILD_CORE is defined).
Which API does PyPy care about?
You always say that you don't want C extensions to access implementation details. But in some other thread (I don't recall precisely where) I argued that C extensions access implementation details because they are missing a C API function doing what they need. This is certainly true for Cython. So if you want to fix that, we should rather *increase* our C API to allow clean access to those implementation details.
Jeroen.
On Wed, 19 Jun 2019 at 12:03, Jeroen Demeyer <J.Demeyer@ugent.be> wrote:
On 2019-06-19 11:43, Victor Stinner wrote:
The current trend is to *reduce* the size of the C API rather than making it larger. http://pythoncapi.readthedocs.io/
You speak of "the C API" but what do you mean really? I know that you care mostly about the limited API/stable ABI, but I personally care mostly about the default API (when neither Py_LIMITED_API nor Py_BUILD_CORE is defined).
Which API does PyPy care about?
The API used by C extensions.
Apart from PyQt, I'm not aware of any C extension using the limited API (stable ABI).
You always say that you don't want C extensions to access implementation details. But in some other thread (I don't recall precisely where) I argued that C extensions access implementation details because they are missing a C API function doing what they need. This is certainly true for Cython. So if you want to fix that, we should rather *increase* our C API to allow clean access to those implementation details.
This problem is hard and that's why it's not fixed yet :-) It's a work-in-progress.
My advice would be to require a discussion before adding a new function. Previously, we added functions without thinking about whether they make sense outside CPython or not, simply because there was no clear separation between the "public" and the "private" API.
I introduced a clear separation in Python 3.8: https://docs.python.org/3.8/whatsnew/3.8.html#build-and-c-api-changes
Now we can only add a function to the *internal* C API if it must only be used inside CPython. We can "experiment" with new functions there before exposing them in the public API or the private API.
Today I removed one more function from the C API :-) PyImport_Cleanup().
Another example:
commit 0a28f8d379544eee897979da0ce99f0b449b49dd
Author: Victor Stinner <vstinner@redhat.com>
Date:   Wed Jun 19 02:54:39 2019 +0200

    bpo-36710: Add tstate parameter in import.c (GH-14218)

    * Add 'tstate' parameter to many internal import.c functions.
    * _PyImportZip_Init() now gets 'tstate' parameter rather than
      'interp'.
    * Add 'interp' parameter to _PyState_ClearModules() and rename it
      to _PyInterpreterState_ClearModules().
    * Move private _PyImport_FindBuiltin() to the internal C API; add
      'tstate' parameter to it.
    * Remove private _PyImport_AddModuleObject() from the C API:
      use public PyImport_AddModuleObject() instead.
    * Remove private _PyImport_FindExtensionObjectEx() from the C API:
      use private _PyImport_FindExtensionObject() instead.
_PyImport_AddModuleObject() and _PyImport_FindExtensionObjectEx() were added to the C API because they were used outside import.c, not because it makes sense to use them outside CPython.
Previously, technically, these functions were exported even though their names are prefixed by "_Py". Exposing these functions caused issues inside CPython, because I wanted to modify them to pass a new "tstate" parameter and so change their API. Fortunately, they are private so we are free to break their API ;-)
Victor
Right :-)
On Wed, 19 Jun 2019 at 12:29, Jeroen Demeyer <J.Demeyer@ugent.be> wrote:
On 2019-06-19 12:17, Victor Stinner wrote:
My advice would be to require a discussion before adding a new function.
I hope that you mean just that the PR reviewer should check this, not a more formal PEP-style discussion.
On Wed, Jun 19, 2019 at 2:45 AM Victor Stinner <vstinner@redhat.com> wrote:
On Wed, 19 Jun 2019 at 10:39, Jeroen Demeyer <J.Demeyer@ugent.be> wrote:
I did a quick count of calls being done using PyObject_CallFunction(), PyObject_CallFunctionObjArgs() and _PyObject_FastCall():
- There are 94 calls with 1 positional argument.
- There are 20 calls with 2 positional arguments.
- There are 11 calls with 3 positional arguments.
- There are 4 calls with 4 or more positional arguments.
Personally, I think that this is sufficient to justify the addition of the extra convenience functions for 1, 2 and maybe even 3 positional arguments.
Every single new function added to the C API imposes a large maintenance cost on the whole Python community. First, CPython will have to maintain it. Then other Python implementations like PyPy will have to implement it and maintain it as well. The current trend is to *reduce* the size of the C API rather than making it larger. http://pythoncapi.readthedocs.io/
*But* performance matters too. *Maybe* we can experiment with _PyObject_CallOneArg(func, arg): check if it's faster or not and try to measure the stack consumption.
I don't think that we should add functions for 2 arguments or more. It's too rare and would cause too much maintenance burden compared to the benefit.
I agree with both of Victor's points: let's see whether a one-arg version has a benefit, and the counts for the other possibilities don't seem high enough to me to bother with the analysis.
On 2019-06-19 20:49, Brett Cannon wrote:
I agree with both of Victor's points: let's see whether a one-arg version has a benefit, and the counts for the other possibilities don't seem high enough to me to bother with the analysis.
We now have (uses counted in all .c files):
- _PyObject_CallNoArg and PyObject_CallNoArgs (84 uses)
- _PyObject_CallOneArg (96 uses)
- _PyObject_CallMethodNoArgs (34 uses)
To complement these, I would like to add _PyObject_CallMethodOneArg, which would have 39 uses if I counted correctly.
On 08Jul2019 1216, Jeroen Demeyer wrote:
On 2019-06-19 20:49, Brett Cannon wrote:
I agree with both of Victor's points: let's see whether a one-arg version has a benefit, and the counts for the other possibilities don't seem high enough to me to bother with the analysis.
We now have (uses counted in all .c files):
- _PyObject_CallNoArg and PyObject_CallNoArgs (84 uses)
- _PyObject_CallOneArg (96 uses)
- _PyObject_CallMethodNoArgs (34 uses)
To complement these, I would like to add _PyObject_CallMethodOneArg, which would have 39 uses if I counted correctly.
Did we try it and see if there's an actual performance benefit? (I dislike guessing what compilers are doing here, or even looking at the generated code, as processors can often perform better with code that looks unintuitive to a human reader.)
I would also appreciate if we included more than just gcc as the benchmark for "what compilers do". Adding clang would satisfy me, though of course I'd be happiest if MSVC was also tested.
In general, I prefer to not expand even the internal C API unless there's a significant benefit, and the burden of proof is on those who are proposing the expansion.
Cheers, Steve
On 2019-07-11 07:19, Steve Dower wrote:
Did we try it and see if there's an actual performance benefit?
For _PyObject_CallOneArg(), I benchmarked __missing__ (with GCC) and it improved performance: https://bugs.python.org/issue37483#msg347217
I would also appreciate if we included more than just gcc as the benchmark for "what compilers do". Adding clang would satisfy me, though of course I'd be happiest if MSVC was also tested.
Some days ago I tested C varargs functions in general and Clang was worse than GCC. I haven't tested MSVC.
In general, I prefer to not expand even the internal C API unless there's a significant benefit, and the burden of proof is on those who are proposing the expansion.
What's "significant"? This is really hard to define, something that is significant for one use case may be completely irrelevant for another use case.
On 11Jul2019 1032, Jeroen Demeyer wrote:
On 2019-07-11 07:19, Steve Dower wrote:
I would also appreciate if we included more than just gcc as the benchmark for "what compilers do". Adding clang would satisfy me, though of course I'd be happiest if MSVC was also tested.
Some days ago I tested C varargs functions in general and Clang was worse than GCC. I haven't tested MSVC.
Thanks, good to know.
In general, I prefer to not expand even the internal C API unless there's a significant benefit, and the burden of proof is on those who are proposing the expansion.
What's "significant"? This is really hard to define, something that is significant for one use case may be completely irrelevant for another use case.
Proving that the use case is significant is a very good start (where use case means "real world scenario" rather than "micro-benchmark"). Otherwise, we have to discuss complexity and maintainability versus general performance improvement. But if there's a clear benefit for a scenario that matters to users, that makes any improvement more compelling.
Cheers, Steve
Jeroen Demeyer wrote on 18.06.19 at 15:25:
Victor Stinner recently added the function PyObject_CallNoArgs() for calling an object without any arguments, see https://docs.python.org/3.9/c-api/object.html#c.PyObject_CallNoArgs
The next obvious question is: should we have PyObject_CallOneArg(), PyObject_CallTwoArgs()? Of course, this cannot continue forever, so I suggest adding the 1 and 2 argument variants which would cover most use cases. Cython has something like that and it becomes very natural to use these functions.
The main advantage is that we can implement these variants much more efficiently than the existing PyObject_CallFunction() or PyObject_CallFunctionObjArgs().
For internal use in CPython, we could also add a generic inline function
_PyObject_CallObjectPosArgs(PyObject *func, int nargs, ...)
and special-case its implementation with a switch statement for nargs in (0,1,2). The C-compiler should be more than happy to generate the expected code for us.
Whether that should then become an official C-API function, well, it could, I guess.
Stefan
On 2019-06-23 08:45, Stefan Behnel wrote:
For internal use in CPython, we could also add a generic inline function
_PyObject_CallObjectPosArgs(PyObject *func, int nargs, ...)
and special-case its implementation with a switch statement for nargs in (0,1,2). The C-compiler should be more than happy to generate the expected code for us.
I tested this on simple proof-of-concept code and the generated code is far from optimal. So unfortunately this isn't a good solution.
GCC never inlines a varargs function. But it's still reasonable: it doesn't generate code for va_start() and va_arg() if it doesn't need to. In particular, if all va_arg() calls are in a predictable order, then the function behaves as if the arguments were passed normally and it doesn't generate code for va_start() and va_arg().
Clang is much worse: it doesn't seem to apply any optimizations, it always fully executes va_start().
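For readers who want to reproduce this kind of experiment, the inline varargs idea can be sketched roughly as follows. This is a toy model in plain C, not the proof-of-concept code that was actually tested: `Obj`, `toy_func`, `toy_sum()` and `toy_call_pos_args()` are hypothetical names, and the CPython headers are not used. Each case reads its arguments with va_arg() in a fixed order, which is the pattern described above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT CPython types. */
typedef struct Obj { long value; } Obj;
typedef long (*toy_func)(Obj *const *args, size_t nargs);

/* Example callee: sums the argument values. */
static long toy_sum(Obj *const *args, size_t nargs)
{
    long total = 0;
    for (size_t i = 0; i < nargs; i++) {
        total += args[i]->value;
    }
    return total;
}

/* Hypothetical generic helper with special cases for 0, 1 and 2 args. */
static inline long toy_call_pos_args(toy_func func, int nargs, ...)
{
    Obj *args[2];
    long result;
    va_list va;

    assert(nargs >= 0 && nargs <= 2);  /* this sketch handles 0..2 only */
    va_start(va, nargs);
    switch (nargs) {
    case 0:
        result = func(NULL, 0);
        break;
    case 1:
        args[0] = va_arg(va, Obj *);
        result = func(args, 1);
        break;
    default:
        args[0] = va_arg(va, Obj *);
        args[1] = va_arg(va, Obj *);
        result = func(args, 2);
        break;
    }
    va_end(va);
    return result;
}
```

Compiling a file like this with optimizations enabled and inspecting the assembly from each compiler is one way to check how va_start() and va_arg() are handled; the outcome depends on the compiler and its version.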
Jeroen Demeyer wrote on 26.06.19 at 17:08:
On 2019-06-23 08:45, Stefan Behnel wrote:
For internal use in CPython, we could also add a generic inline function
_PyObject_CallObjectPosArgs(PyObject *func, int nargs, ...)
and special-case its implementation with a switch statement for nargs in (0,1,2). The C-compiler should be more than happy to generate the expected code for us.
I tested this on simple proof-of-concept code and the generated code is far from optimal. So unfortunately this isn't a good solution.
GCC never inlines a varargs function. But it's still reasonable: it doesn't generate code for va_start() and va_arg() if it doesn't need to. In particular, if all va_arg() calls are in a predictable order, then the function behaves as if the arguments were passed normally and it doesn't generate code for va_start() and va_arg().
Clang is much worse: it doesn't seem to apply any optimizations, it always fully executes va_start().
Thanks for trying out the idea, Jeroen.
Stefan
participants (6): Brett Cannon, Hugh Fisher, Jeroen Demeyer, Stefan Behnel, Steve Dower, Victor Stinner