The PyObject_CallFunction() function has a special case when the
format string is "O", to pass exactly one Python object:
* If the argument is a tuple, the tuple is unpacked: it behaves like func(*arg)
* Otherwise, it behaves like func(arg)
This special case is not documented in the C API!
The following C functions have the special case:
* PyObject_CallFunction(), _PyObject_CallFunction_SizeT()
* PyObject_CallMethod(), _PyObject_CallMethod_SizeT()
* _PyObject_CallMethodId(), _PyObject_CallMethodId_SizeT()
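To make the special case concrete, here is a pure-Python sketch of the semantics these functions give to a lone "O" format (the function name is mine, not CPython's):

```python
def call_function_O_semantics(func, arg):
    """Mimic PyObject_CallFunction(func, "O", arg)."""
    if isinstance(arg, tuple):
        return func(*arg)   # a tuple is unpacked: func(*arg)
    return func(arg)        # any other object: func(arg)

def f(*args):
    return args

call_function_O_semantics(f, (1, 2))  # f(1, 2) -> (1, 2)
call_function_O_semantics(f, 5)       # f(5)    -> (5,)
```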
I guess that it's a side effect of the implementation: the code uses
Py_BuildValue() and then checks if the value is a tuple or not.
Py_BuildValue() is a little bit surprising:
* "i" creates an integer object
* "ii" creates a tuple
* "(i)" and "(ii)" create a tuple.
Whether you get a tuple depends on the number of items in the format
string and whether they are parenthesized. It is not obvious when you
have nested tuples like "O(OO)".
Because of the special case, passing a tuple as the only argument
requires writing "((...))" instead of just "(...)".
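These Py_BuildValue() behaviors can be poked at from Python itself through ctypes.pythonapi (assuming a CPython interpreter; this is just a quick way to call the C function, not how real extension code would do it):

```python
import ctypes

Py_BuildValue = ctypes.pythonapi.Py_BuildValue
Py_BuildValue.restype = ctypes.py_object  # convert the returned PyObject*

Py_BuildValue(b"i", 1)          # one item: the integer 1
Py_BuildValue(b"ii", 1, 2)      # two items: the tuple (1, 2)
Py_BuildValue(b"(i)", 1)        # explicit parens: the tuple (1,)
Py_BuildValue(b"((ii))", 1, 2)  # nested parens: ((1, 2),) -- how you
                                # pass a tuple as the only argument
```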
In the past, this special behaviour caused a bug in
generator.send(arg), probably because the author of the C code
implementing generator.send() wasn't aware of the special case. See
I found code using the "O" format in the new _asyncio module, and I'm
quite sure that the unpacking special case is not expected there. So I opened an
In the last few days, I patched the functions of the
PyObject_CallFunction() family to use fast calls internally. I
implemented the special case to keep backward compatibility.
I replaced a lot of code using PyObject_CallFunction() with
PyObject_CallFunctionObjArgs() when the format string was only made of
"O" (PyObject* arguments). I made this change to optimize the code, but
indirectly it also avoids the special case for code which used
exactly the "O" format. See:
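The difference between the two functions can also be observed through ctypes.pythonapi (again assuming a CPython interpreter): PyObject_CallFunctionObjArgs() takes a NULL-terminated list of objects and never unpacks a tuple argument.

```python
import ctypes

capi = ctypes.pythonapi
capi.PyObject_CallFunction.restype = ctypes.py_object
capi.PyObject_CallFunctionObjArgs.restype = ctypes.py_object

def f(*args):
    return args

arg = ctypes.py_object((1, 2))

# "O" special case: the tuple is unpacked, f is called as f(1, 2)
capi.PyObject_CallFunction(ctypes.py_object(f), b"O", arg)
# ObjArgs: the tuple is passed as a single argument, f((1, 2))
# (None is the NULL sentinel terminating the argument list)
capi.PyObject_CallFunctionObjArgs(ctypes.py_object(f), arg, None)
```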
When I made these changes, I found some functions which rely on the special case:
* time_strptime() (change 49a7fdc0d40a)
* unpickle() of _ctypes (change ceb22b8f6d32)
I'm not sure what we are supposed to do here. I don't think that
changing the behaviour of PyObject_CallFunction() to remove the
special case is a good idea. It would be an obvious backward
incompatible change which can break applications.
I guess that the minimum is to document the special case?
On behalf of the Python development community and the Python 3.4 and
Python 3.5 release teams, I'm delighted to announce the availability of
Python 3.4.6 and Python 3.5.3.
Python 3.4 is now in "security fixes only" mode. This is the final
stage of support for Python 3.4. Python 3.4 now only receives security
fixes, not bug fixes, and Python 3.4 releases are source code only--no
more official binary installers will be produced.
Python 3.5 is still in active "bug fix" mode. Python 3.5.3 contains
many incremental improvements over Python 3.5.2.
There were literally no code changes between rc1 and final for either
release. The only change--apart from the necessary updates from "rc1"
to final--was a single copyright notice update for one of the OS X
".plist" property list files in 3.5.3 final.
You can find Python 3.5.3 here:
And you can find Python 3.4.6 here:
It's an old feature of the weakref API that you can define an
arbitrary callback to be invoked when the referenced object dies, and
that when this callback is invoked, it gets handed the weakref wrapper
object -- BUT, only after it's been cleared, so that the callback
can't access the originally referenced object. (I.e., this callback
will never raise: def callback(ref): assert ref() is None.)
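This is easy to demonstrate; a minimal example, relying on CPython's prompt refcount-based collection:

```python
import weakref

class Obj:
    pass

seen = []

def callback(ref):
    # By the time the callback runs, the weakref is already cleared:
    # ref() returns None, never the original object.
    seen.append(ref() is None)

o = Obj()
r = weakref.ref(o, callback)
del o          # on CPython, the refcount drops to zero immediately
print(seen)    # [True]
```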
AFAICT the original motivation for this seems to have been that if the weakref
callback could get at the object, then the weakref callback would
effectively be another finalizer like __del__, and finalizers and
reference cycles don't mix, so weakref callbacks can't be finalizers.
There's a long document from the 2.4 days about all the terrible
things that could happen if arbitrary code like callbacks could get
unfettered access to cyclic isolates at weakref cleanup time.
But that was 2.4. In the mean time, of course, PEP 442 fixed it so
that finalizers and weakrefs mix just fine. In fact, weakref callbacks
are now run *before* __del__ methods, so clearly it's now okay for
arbitrary code to touch the objects during that phase of the GC -- at
least in principle.
So what I'm wondering is, would anything terrible happen if we started
passing still-live weakrefs into weakref callbacks, and then clearing
them afterwards? (i.e. making step 1 of the PEP 442 cleanup order be
"run callbacks and then clear weakrefs", instead of the current "clear
weakrefs and then run callbacks"). I skimmed through the PEP 442
discussion, and AFAICT the rationale for keeping the old weakref
behavior was just that no-one could be bothered to mess with it.
[The motivation for my question is partly curiosity, and partly that
in the discussion about how to handle GC for async objects, it
occurred to me that it might be very nice if arbitrary classes that
needed access to the event loop during cleanup could do something like
    def __init__(self, ...):
        loop = asyncio.get_event_loop()
        ...  # register self.aclose with the loop (API to be designed)

    # automatically called by the loop when I am GC'ed; async equivalent
    # of __del__
    async def aclose(self):
        ...
Right now something *sort* of like this is possible but it requires a
much more cumbersome API, where every class would have to implement
logic to fetch a cleanup callback from the loop, store it, and then
call it from its __del__ method -- like how PEP 525 does it. Delaying
weakref clearing would make this simpler API possible.]
Nathaniel J. Smith -- https://vorpus.org
On 01/14/2017 09:56 PM, Victor Stinner wrote:
> Great job! Thank you and Chi Hsuan Yen!
> Did you get feedback from users? Maybe from the Kivy community?
> On 14 Jan 2017 at 18:31, "Xavier de Gaye" wrote:
> Only a few minor issues are left to be fixed before the support of the
> Android platform may be considered. See the current status at msg285493
> in the Android meta-issue 26865.
Yes, Chi Hsuan Yen is a great contributor to this project.
AFAIK Kivy uses the CrystaX NDK for Python 3 instead of the
Android NDK, mainly because the Android NDK did not support wide
characters until API level 21.
I was excited to see official dtrace support for python 3.6.0 on OS X, but
I have not been able to make it work:
1. I built my own python from sources on OS X 10.9, with the --with-dtrace option.
2. if I launch `python3.6 -q &` and then `sudo dtrace -l -P python$!`, I
get the following output:
  ID PROVIDER    MODULE    FUNCTION                 NAME
2774 python48084 python3.6 _PyEval_EvalFrameDefault function-entry
2775 python48084 python3.6 _PyEval_EvalFrameDefault function-return
2776 python48084 python3.6 collect                  gc-done
2777 python48084 python3.6 collect                  gc-start
2778 python48084 python3.6 _PyEval_EvalFrameDefault line
Which looks similar but not the same as the example given in the doc at
3. When I try to test anything with the given call_stack.d example, I can't
make it work at all:
I am not very familiar with dtrace, so maybe I am missing a step, there is a
documentation bug, or it depends on which OS X version you are using?
tl;dr Python 3.7 is going to be faster without breaking backward
compatibility, say hello to the new "tp_fastcall" slot!
Python 3.6 got a new "FASTCALL" calling convention which avoids
creating a temporary tuple to pass positional arguments and a
temporary dictionary to pass keyword arguments. But callable objects
whose __call__() method is implemented in Python don't benefit from it.
I tried to reuse the tp_call slot with a new flag in tp_flags, but I
had two major blocker issues:
* It deeply breaks the backward compatibility of the C API: calling
tp_call directly (with a tuple/dict) would crash immediately if the
object uses FASTCALL
* Each "tp_call" function would need to be duplicated into a new
"tp_fastcall" flavor, and it wasn't easy to share the function body.
Good news: I found a new design which doesn't have either of these
issues! I chose to add a new tp_fastcall field to PyTypeObject and use
a tiny wrapper calling tp_fastcall for tp_call, to keep backward
compatibility.
The goal is to get optimizations "for free" when calling functions.
The best expected speedup on a microbenchmark is around 1.56x faster
(-36%) when calling an object supporting FASTCALL. Example with
property_descr_get() without its "cached args" hack, result without
fastcall ("py34") compared to fastcall ("fastcall_wrapper"):
Median +- std dev: [py34] 75.0 ns +- 1.7 ns -> [fastcall_wrapper] 48.2
ns +- 1.5 ns: 1.56x faster (-36%)
But please don't expect such a large speedup on macro-benchmarks.
tp_fastcall allows to remove the "cached args" optimization used in
various parts of Python core, old optimizations used in performance
critical code. This hack causes various kinds of complex bugs in
corner cases which can lead to a crash in the worst case.
The patch to support tp_fastcall is tiny, but you should expect a long
list of tiny changes to replace tp_call with tp_fastcall in various places.
Final bonus point: existing code (calling functions) doesn't need to
be modified (nor recompiled) to get the speedup. Even if tp_call is
called directly, fastcall will provide a speedup, but only if the
object is called only with positional arguments.
About the tp_call wrapper: keyword arguments require converting a
Python dictionary to a C array, which might be more expensive. I
didn't try to measure the performance, since this case is very rare:
almost no C code calls functions with keyword arguments, just because
it's much more complex to pass keyword arguments; it requires too much
C code (and it's not simpler with fastcall, sorry).
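To illustrate what such a tp_call wrapper has to do, here is a pure-Python model of the conversion. In the FASTCALL convention, positional arguments live in a flat C array with a count, keyword values are appended after them, and keyword names go in a separate tuple. All names below are mine, not CPython's:

```python
def tp_call_wrapper(fastcall, args, kwargs):
    """Model of a tp_call wrapper: convert the classic (tuple, dict)
    calling convention into the FASTCALL one."""
    stack = list(args)
    if kwargs:
        kwnames = tuple(kwargs)          # keyword names, in order
        stack.extend(kwargs.values())    # keyword values after positionals
    else:
        kwnames = ()
    return fastcall(stack, len(args), kwnames)

# A toy "fastcall" target that rebuilds args/kwargs so we can check
# the round trip:
def toy_fastcall(stack, nargs, kwnames):
    args = tuple(stack[:nargs])
    kwargs = dict(zip(kwnames, stack[nargs:]))
    return args, kwargs

tp_call_wrapper(toy_fastcall, (1, 2), {"x": 3})  # ((1, 2), {'x': 3})
```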
https://bugs.python.org/issue29215 noticed that PEP 7 says "C++-style line
comments" are allowed, but then later says "Never use C++ style // one-line
comments." I'm assuming we are sticking with allowing C++-style comments,
and that the "never" part just needs an addendum saying it only applies to
code prior to Python 3.5, but I wanted to double-check before editing the PEP.
I have a C++ module that I am compiling to use inside of my Python
installation under Mac OS.
If I compile & link it against a Framework-enabled Python installation, it
works fine, but if I compile & link it against a *non*-Framework-enabled
installation that we use for distribution, I simply get a non-inspiring:
Fatal Python error: PyThreadState_Get: no current thread
I am using python-config to get my flags in both cases, but I simply
cannot get it to run (although it compiles fine) on a *non*-Framework-enabled
installation.