On 2018-07-31 09:36, INADA Naoki wrote:
> I want to see PoC of direct C calling.
To be honest, there is no implementation plan for this yet. I know that
several people want this feature, so it makes sense to think about it.
For me personally, the main open problem is how to deal with arguments
which may be passed either as a Python object or as a native C type. For
example, in a function call like f(1,2,3), it may happen that
the first argument is really a Python object (so it should be passed as a
Python int) but that the other two arguments are C integers.
> And I think PoC can be implemented without waiting PEP 580.
For one particular class (say CyFunction), yes. But this feature would
be particularly useful for calling between different kinds of C code,
for example between Numba and CPython built-ins, or between Pythran and Cython.
That is why I think it should be implemented as an extension of PEP 580.
Anyway, this is a different subject that we should not mix in the
discussion about PEP 580 (that is also why I am replying to this
specific point separately).
First of all, I'm sorry that I forgot to change the subject of my mail.
(I thought about reserving one more slot for Cython for
further Cython-to-Cython call optimization, but I rejected
the idea because I'm not sure it would really help Cython.)
On Mon, Jul 30, 2018 at 11:55 PM Jeroen Demeyer <J.Demeyer(a)ugent.be> wrote:
> On 2018-07-30 15:35, INADA Naoki wrote:
> > As repeatedly said, PEP 580 is a very complicated protocol
> > when just implementing a callable object.
> Can you be more concrete about what you find complicated? Maybe I can improve
> the PEP to explain it more. Also, I'm open to suggestions to make it
> less complicated.
From an extension writer's point of view, almost all of PEP 580 is
complicated compared to PEP 576. Remember that they don't need a custom
method/function type. So PEP 576/580 are needed only when implementing a
callable object, like itemgetter or lru_cache in the stdlib.
* We continue to use PyMethodDef and METH_* when writing
tp_methods, so authors would have to learn PyCCallDef and the CCALL_* flags
in addition to PyMethodDef and METH_*.
* In PEP 576, you just put a function pointer in a type slot. On the other
hand, when implementing a callable object with PEP 580, you (1) put a
PyCCallDef somewhere, (2) put a CCallRoot in the instance, and (3) put the
offset of (2) into tp_ccall.
* The difference between cc_parent and cc_self is unclear too.
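To make the indirection concrete, here is a rough Python model of the two registration schemes. The names follow the PEP drafts; PyCCallDef, CCallRoot, and tp_ccall are C-level concepts, so this is only an illustration of the shape of each protocol, not real API:

```python
# PEP 576 model (simplified): the type carries the call function directly.
class Pep576Callable:
    def __init__(self, func):
        self.tp_fastcall = func        # one slot, done

    def __call__(self, *args):
        return self.tp_fastcall(*args)


# PEP 580 model (simplified): an extra layer of indirection.
class PyCCallDef:
    def __init__(self, cc_func, cc_flags=0):
        self.cc_func = cc_func         # the actual C-level function
        self.cc_flags = cc_flags       # CCALL_* calling convention


class Pep580Callable:
    def __init__(self, ccalldef, self_obj):
        # (2) the CCallRoot: a (PyCCallDef, self) pair in the instance;
        # (3) in C, the type would store the offset of this pair (tp_ccall).
        self.cc_ccall = ccalldef
        self.cc_self = self_obj


def ccall(obj, *args):
    # What the interpreter would do: follow the offset, then the def.
    root_def = obj.cc_ccall
    return root_def.cc_func(obj.cc_self, *args)
```

For example, `ccall(Pep580Callable(PyCCallDef(lambda s, x: s + x), 10), 5)` goes through two hops before reaching the function; that extra hop is what the bullet points above object to.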
I think PEP 580 is understandable only for people who have tried to implement
method objects. It's a complete rewrite of PyCFunction and method_descriptor,
but extension authors can write extensions without knowing their implementation.
> > It is optimized for implementing a custom method object, although
> > almost only Cython wants the custom method type.
> For the record, Numba also seems interested in the PEP:
OK, the Numba developers are interested in:
* Supporting FASTCALL for their Dispatcher type: PEP 576 is simpler
for that, as I described above.
* Direct C function calling (skipping the PyObject calling abstraction).
While this is not part of PEP 580, it's a strong motivation for PEP 580.
I want to see a PoC of direct C calling,
and I think a PoC can be implemented without waiting for PEP 580.
* Cython can have a specialization for CyFunction, like it has for CFunction.
(Note that Cython doesn't utilize LOAD_METHOD / CALL_METHOD for
CFunction either, so the lack of LOAD_METHOD / CALL_METHOD support
is not a big problem for now.)
* Cython can implement its own C signature and embed it in CyFunction.
After that, we (including Numba, Cython, and PyPy developers) can discuss
how a portable C signature can be embedded in PyCCallDef.
> > I'm not sure about adding such a complicated protocol almost only for Cython.
> > If CyFunction can be implemented behind PEP 576, it may be better.
> I recall my post
> explaining the main difference between PEP 576 and PEP 580.
I wrote my mail after reading that mail, of course.
But it was unclear without reading the PEP and the implementation carefully.
For example, "hook which part" seemed like meta-discussion to me before
reading your implementation.
I think the only way to understand PEP 580 is to read the implementation
and imagine how Cython and Numba would use it.
> I would like to stress that PEP 580 was designed for maximum
> performance, both today and for future extensions (such as calling with
> native C types).
I don't know what the word *stress* means here. (Sorry, my English is not
good enough for such a hard discussion.)
But I want to see a PoC of the real benefit of PEP 580, as I said above.
> > * PEP 576 and 580 are not strictly mutually exclusive; PEP 576 may be
> > accepted in addition to PEP 580
> I don't think that this is a good idea: you will mostly end up with the
> disadvantages of both approaches.
Hm, my point was to provide an easy and simple way to support FASTCALL
in callable objects like functools.partial or functools.lru_cache.
But that should be discussed after PEP 580.
INADA Naoki <songofacandy(a)gmail.com>
I’m looking at PyOS_CheckStack because this feature might be useful on macOS (and when I created bpo-33955 for this someone ran with it and created a patch).
Does anyone remember why the interpreter raises MemoryError and not RecursionError when PyOS_CheckStack detects that we’re about to run out of stack space?
The reason I’m looking into this is that the default stack size on macOS is fairly small and I’d like to avoid crashing the interpreter when running out of stack space on threads created by the system (this is less of a risk for threads created by Python itself because we can arrange for a large enough stack in that case).
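For threads created by Python itself, enlarging the stack is indeed straightforward; a minimal sketch (the 16 MiB figure is just an illustrative choice):

```python
import threading

def depth(n):
    # A recursive probe: each frame consumes C stack as well as a Python frame.
    return 0 if n == 0 else depth(n - 1) + 1

# Applies only to threads created *after* this call; the value is a hint
# that platforms may round up to their own granularity.
threading.stack_size(16 * 1024 * 1024)

result = []
t = threading.Thread(target=lambda: result.append(depth(500)))
t.start()
t.join()
```

Threads created by the system (e.g. by a macOS framework calling back into Python) never pass through this API, which is why PyOS_CheckStack matters there.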
Currently the C API is not completely covered by tests. Tests for particular
parts of the C API are scattered through different files. There are files
devoted entirely to testing the C API (like test_capi.py and
test_getargs2.py), and there are classes (usually having "CAPI" in the name)
in different files for testing the C API specific to unicode, marshal, etc.
Argument parsing tests are split between two files, test_capi.py and
test_getargs2.py.
I need to add new tests for new features, and I'm going to add new tests
for the existing C API. But first I'm going to reorganize the tests: add a new
directory Lib/test/test_capi, and move all C API tests into it, grouped
by function prefix: test_getargs.py for testing PyArg_*(),
test_unicode.py for testing PyUnicode_*(), etc. Tests that use the
_testcapi module, but don't test a specific C API, will be left in place.
The benefit is that it will be easier to run all C API tests at once,
and only them, and it will be clearer which C API is covered by tests.
The disadvantage is that you will need to run several files to test
marshal, for example.
What are your thoughts?
I'm just easing back into core development work by trying to get a
stable testing environment for Python development on Windows.
One problem is that certain tests use support.TESTFN (a local directory
constructed from the pid) for output files etc. However this can cause
issues on Windows when recreating the folders / files for multiple
tests, especially when running in parallel.
Here's an example on my laptop deliberately running 3 tests with -j0
which I know will generate an error about one time in three:
C:\work-in-progress\cpython>python -mtest -j0 test_urllib2 test_bz2
Running Debug|Win32 interpreter...
Run tests in parallel using 6 child processes
0:00:23 [1/3/1] test_urllib2 failed
test test_urllib2 failed -- Traceback (most recent call last):
File "C:\work-in-progress\cpython\lib\test\test_urllib2.py", line
821, in test_file
f = open(TESTFN, "wb")
PermissionError: [Errno 13] Permission denied: '@test_15564_tmp'
Although these errors are both intermittent and fairly easily spotted,
the effect is that I rarely get a clean test run when I'm applying a patch.
I started to address this years ago but things stalled. I'm happy to
pick this up again and have another go, but I wanted to ask first
whether there was any objection to my converting tests to use the tempfile
functions, which should avoid the problem?
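Concretely, the conversion I have in mind replaces the shared TESTFN name with a per-test unique path, along these lines (a sketch of the pattern, not an actual patch):

```python
import os
import tempfile

# Instead of: f = open(TESTFN, "wb") -- where TESTFN is the same
# '@test_<pid>_tmp' name for every test in a worker -- ask tempfile
# for a path that is unique even across parallel workers.
fd, path = tempfile.mkstemp(suffix=".tmp")
try:
    with os.fdopen(fd, "wb") as f:
        f.write(b"some test data")
    with open(path, "rb") as f:
        data = f.read()
finally:
    os.unlink(path)
```

Because mkstemp creates the file atomically with a fresh name, two workers can never race on the same path the way they can with TESTFN.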
I noticed an inconsistency in the error messages for the number of
arguments to a method call. For Python methods, the "self" argument is
counted. For built-in methods, the "self" argument is *not* counted:
>>> class mylist(list):
...     def append(self, val): super().append(val)
...
>>> f = list().append
>>> f(1, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: append() takes exactly one argument (2 given)
>>> g = mylist().append
>>> g(1, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: append() takes 2 positional arguments but 3 were given
I think it has been argued before that it's a feature that self is
counted. So I consider the error message for list().append a bug. This
is one of the many oddities I noticed while working on improving
Would you agree to change the error message for built-in methods to be
closer to Python methods?
On 2018-07-30 15:35, INADA Naoki wrote:
> As repeatedly said, PEP 580 is a very complicated protocol
> when just implementing a callable object.
Can you be more concrete about what you find complicated? Maybe I can improve
the PEP to explain it more. Also, I'm open to suggestions to make it
less complicated.
> It is optimized for implementing a custom method object, although
> almost only Cython wants the custom method type.
For the record, Numba also seems interested in the PEP:
> I'm not sure about adding such a complicated protocol almost only for Cython.
> If CyFunction can be implemented behind PEP 576, it may be better.
I recall my post
explaining the main difference between PEP 576 and PEP 580.
I would like to stress that PEP 580 was designed for maximum
performance, both today and for future extensions (such as calling with
native C types).
> * PEP 576 and 580 are not strictly mutually exclusive; PEP 576 may be
> accepted in addition to PEP 580
I don't think that this is a good idea: you will mostly end up with the
disadvantages of both approaches.
I have been fuzzing various parts of Python standard library for Python 3.7 with python-afl to find out internal implementation issues that exist in the library. What I have been looking for are mainly following:
* Exceptions that are something other than the documented ones. These usually indicate an internal implementation issue. For example, one would not expect a UnicodeDecodeError from the netrc.netrc() function when the documentation promises netrc.NetrcParseError and there is no way to pass a properly sanitized file object to netrc.netrc().
* Differences between values returned by the C and Python versions of some functions. The quopri module may have these.
* Unexpected performance and memory allocation issues. These can be somewhat controversial to fix, if at all, but at least in some cases, from an end-user perspective, it can be really nasty if for example fractions.Fraction("1.64E6646466664") results in hundreds of megabytes of memory allocated and takes a very long time to compute. I gave up waiting for that function call to finish after 5 minutes.
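That Fraction case is easy to reproduce at a harmless scale: the decimal exponent translates into an exact power of ten, so the size of the resulting integer grows directly with the exponent (this illustrates the effect; it is not the stdlib's actual parsing code):

```python
from fractions import Fraction

# "1.64E6" is parsed exactly: 164 * 10**4 == 1640000. With an exponent
# of 6646466664 the same scaling has to materialize an integer with
# billions of decimal digits, hence the memory blow-up.
f = Fraction("1.64E6")
print(f)  # 1640000
```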
As this is going to result in a decent number of bug reports (currently I have only filed one, although the audio processing area has many more issues to file), I would like to ask your opinion on filing these bug reports. Should I report all issues regarding a specific module in one bug report, or try to split them further into more fine-grained reports that may be related? These different types of errors are especially noticeable in the zipfile module, which exhibits many different exception and behavioral types on invalid data <https://github.com/Barro/python-stdlib-fuzzers/tree/master/zipfile/crashes>. And in the case of the sndhdr module, there are multiple modules with issues (aifc, sunau, wave) that then also show up in sndhdr when they are used. Or are some of you willing to go through the crashes that pop up and help with the report filing?
The code and a more verbose description are available from <https://github.com/Barro/python-stdlib-fuzzers>. It works by default only on some GNU/Linux systems (I use Debian testing), as it relies on /dev/shm/ being available and uses shell scripts as wrappers that rely on various tools that may not be installed on all systems by default.
As a bonus, since this uses coverage-based fuzzing, it also opens up the possibility of automatically creating a regression test suite for each of the fuzzed modules, to ensure that existing functionality (input files under the <fuzz-target>/corpus/ directory) does not suddenly result in additional exceptions, and to make it easier to test potential bug fixes (crash-inducing files under the <fuzz-target>/crashes/ directory).
As a downside, since this uses two quite specific tools (afl, python-afl) that have further dependencies (Cython), I doubt the viability of integrating this type of testing into the normal Python verification process. Unlike the libFuzzer-based fuzzing that is already integrated into Python, this instruments the actual (and only the) Python code, not the actions that the interpreter does in the background. So this should result in better fuzzer coverage for Python code, with the downside that when C functions are called, they are complete black boxes to the fuzzer.
I have mainly run these fuzzer instances for at most several hours per module with 4 instances, and stopped fuzzing modules that showed no issues once no new coverage had been discovered for more than 10 minutes. Also, I have not really created high-quality initial input files, so I wouldn't be surprised if there are more issues lurking around that could be found by throwing more CPU and higher-quality fuzzers at the problem.
As repeatedly said, PEP 580 is a very complicated protocol
when just implementing a callable object.
It is optimized for implementing a custom method object, although
almost only Cython wants the custom method type.
I'm not sure about adding such a complicated protocol almost only for Cython.
If CyFunction can be implemented behind PEP 576, it may be better.
On the other hand, most of the complexity of PEP 580 is not newly added.
Most of it is already in PyCFunction, method_descriptor, and some
calling APIs.
PEP 580 just restructures them completely to be reusable from Cython.
So I agree that PEP 580 is better when thinking from Cython's side.
I'm not sure which way we should go yet. But my current idea is:
* Implement PEP 580 as semi-public APIs only for tools like Cython.
* Other Python implementations may not support it in the foreseeable future,
so such tools should support the legacy implementation too.
* PEP 576 and 580 are not strictly mutually exclusive; PEP 576 may be
accepted in addition to PEP 580, for simpler FASTCALL-able object support,
especially for extension authors who prefer C to Cython (including the stdlib).
* If this happened, PEP 580 could remove one abstraction: tp_ccalloffset would
be the offset of the PyCCallRoot instead of a pointer to it.
Py_TPFLAGS_FUNCTION_DESCRIPTOR would be removed from PEP 576 too.
INADA Naoki <songofacandy(a)gmail.com>
On 2018-07-30 13:11, INADA Naoki wrote:
> Like the previous SageMath bench, this is caused by Cython's
> specialization, __Pyx_PyObject_CallOneArg.
> It specializes calls to PyFunction and PyCFunction, but it isn't
> specialized for calling CyFunction.
Yes, I saw that too. But this is exactly what CPython does (it optimizes
PyFunction and PyCFunction but not CyFunction), so I would still argue
that the benchmark is fair.