Mailman 3 July 2018 - Python-Dev

Re: [Python-Dev] [PEP 576/580] Comparing PEP 576 and 580
by Jeroen Demeyer July 31, 2018

On 2018-07-31 11:12, INADA Naoki wrote:
> For me, this is the most important benefit of PEP 580. I can't split
> it from PEP 580.

I want PEP 580 to stand by itself. And you say that it is already complicated enough, so we should not mix native C calling into it.

PEP 580 is written to allow future extensions like that, but it should be reviewed without those future extensions.

Jeroen.

Re: [Python-Dev] [PEP 576/580] Comparing PEP 576 and 580
by Jeroen Demeyer July 31, 2018

On 2018-07-31 09:36, INADA Naoki wrote:
> I want to see PoC of direct C calling.

To be honest, there is no implementation plan for this yet. I know that several people want this feature, so it makes sense to think about it.

For me personally, the main open problem is how to deal with arguments which may be passed either as a Python object or as a native C type. For example, when doing a function call like f(1,2,3), it may happen that the first argument is really a Python object (so it should be passed as a Python int) but that the other two arguments are C integers.

> And I think PoC can be implemented without waiting PEP 580.

For one particular class (say CyFunction), yes. But this feature would be particularly useful for calling between different kinds of C code, for example between Numba and CPython built-ins, or between Pythran and Cython, ... That is why I think it should be implemented as an extension of PEP 580.

Anyway, this is a different subject that we should not mix into the discussion about PEP 580 (that is also why I am replying to this specific point separately).

Jeroen.

Re: [Python-Dev] [PEP 576/580] Comparing PEP 576 and 580
by INADA Naoki July 31, 2018

First of all, I'm sorry that I forgot to change my mail title. (I thought about reserving one more slot for Cython for further Cython-to-Cython call optimization, but I rejected my idea because I'm not sure it really helps Cython.)

On Mon, Jul 30, 2018 at 11:55 PM Jeroen Demeyer <J.Demeyer(a)ugent.be> wrote:
>
> On 2018-07-30 15:35, INADA Naoki wrote:
> > As repeatedly said, PEP 580 is very complicated protocol
> > when just implementing callable object.
>
> Can you be more concrete what you find complicated? Maybe I can improve
> the PEP to explain it more. Also, I'm open to suggestions to make it
> less complicated.

When thinking from an extension writer's point of view, almost all of PEP 580 is complicated compared to PEP 576. Remember they don't need a custom method/function type. So PEP 576/580 are needed only when implementing a callable object, like itemgetter or lru_cache in the stdlib.

* We continue to use PyMethodDef and METH_* when writing tp_methods. They should learn PyCCallDef and CCALL_* flags in addition to PyMethodDef and METH_*.
* In PEP 576, just put a function pointer in the type slot. On the other hand, when implementing a callable object with PEP 580: (1) put PyCCallDef somewhere, (2) put CCallRoot in the instance, (3) put the offset of (2) in tp_ccall.
* The difference between cc_parent and cc_self is unclear too.

I think PEP 580 is understandable only for people who have tried to implement method objects. It's a complete rewrite of PyCFunction and method_descriptor. But an extension author can write an extension without knowing the implementation of them.

> > It is optimized for implementing custom method object, although
> > almost only Cython want the custom method type.
>
> For the record, Numba also seems interested in the PEP:
> https://groups.google.com/a/continuum.io/d/msg/numba-users/2G6k2R92MIM/P-cF…
>

OK, the Numba developer is interested in:

* Supporting FASTCALL for the Dispatcher type: PEP 576 is simpler for it, as I described above.
* Direct C function calling (skip the PyObject calling abstraction). While it's not part of PEP 580, it's a strong motivation for PEP 580.

I want to see PoC of direct C calling. And I think PoC can be implemented without waiting PEP 580.

* Cython can have a specialization for CyFunction, like it has for CFunction. (Note that Cython doesn't utilize LOAD_METHOD / CALL_METHOD for CFunction either. So lacking support for LOAD_METHOD / CALL_METHOD is not a big problem for now.)
* Cython can implement its own C signature and embed it in CyFunction.

After that, we (including Numba, Cython, and PyPy developers) can discuss how a portable C signature can be embedded in PyCCallDef.

> > I'm not sure adding such complicated protocol almost only for Cython.
> > If CyFunction can be implemented behind PEP 576, it may be better.
>
> I recall my post
> https://mail.python.org/pipermail/python-dev/2018-July/154238.html
> explaining the main difference between PEP 576 and PEP 580.

I wrote my mail after reading that mail, of course. But it was unclear without reading the PEP and the implementation carefully. For example, "hook which part" seemed like meta-discussion to me before reading your implementation. I think the only way to understand PEP 580 is reading the implementation and imagining how Cython and Numba would use it.

> I would like to stress that PEP 580 was designed for maximum
> performance, both today and for future extensions (such as calling with
> native C types).
>

I don't know what the word *stress* means here. (Sorry, my English is not good enough for such a hard discussion.) But I want to see a PoC of the real benefit of PEP 580, as I said above.

> > * PEP 576 and 580 are not strictly mutually exclusive; PEP 576 may be
> > accepted in addition to PEP 580
>
> I don't think that this is a good idea: you will mostly end up with the
> disadvantages of both approaches.
>

Hm, my point was providing an easy and simple way to support FASTCALL in callable objects like functools.partial or functools.lru_cache. But it should be discussed after PEP 580.

--
INADA Naoki <songofacandy(a)gmail.com>
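For readers unfamiliar with the "callable object" case in this message: the stdlib examples it names (itemgetter, lru_cache, functools.partial) are instances of their own types rather than plain function objects. A small sketch that inspects them from Python; the helper names `add` and `describe` are only illustrative, and the printed type names depend on the interpreter in use:

```python
import functools
import operator
import types


def add(a, b):
    """Illustrative helper, not part of the discussion."""
    return a + b


# Callable objects mentioned in the thread.
examples = {
    "functools.partial": functools.partial(add, 1),
    "functools.lru_cache": functools.lru_cache(maxsize=None)(add),
    "operator.itemgetter": operator.itemgetter(0),
}

FUNCTION_TYPES = (types.FunctionType, types.BuiltinFunctionType)


def describe(obj):
    """Return (type name, whether obj is a plain function object)."""
    return type(obj).__name__, isinstance(obj, FUNCTION_TYPES)


for name, obj in examples.items():
    print(name, *describe(obj))
```

All three can be called like functions, yet none of them has to be a function object, which is the situation both PEPs try to serve.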

USE_STACKCHECK and running out of stack
by Ronald Oussoren July 31, 2018

Hi,

I’m looking at PyOS_CheckStack because this feature might be useful on macOS (and when I created bpo-33955 for this someone ran with it and created a patch).

Does anyone remember why the interpreter raises MemoryError and not RecursionError when PyOS_CheckStack detects that we’re about to run out of stack space?

The reason I’m looking into this is that the default stack size on macOS is fairly small and I’d like to avoid crashing the interpreter when running out of stack space on threads created by the system (this is less of a risk on threads created by Python itself because we can arrange for a large enough stack in that case).

Ronald
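The parenthetical about threads created by Python itself can be illustrated from the Python side. A minimal sketch, assuming a platform where threading.stack_size() accepts the requested value; the 16 MiB figure and the helper names are arbitrary choices for illustration, and this does not exercise PyOS_CheckStack itself:

```python
import threading


def recursion_depth(n=0):
    """Recurse until RecursionError and return the depth reached."""
    try:
        return recursion_depth(n + 1)
    except RecursionError:
        return n


def run_with_stack(size_bytes, func):
    """Run func() in a new thread created with the given stack size."""
    results = []
    previous = threading.stack_size(size_bytes)  # returns the old setting
    try:
        worker = threading.Thread(target=lambda: results.append(func()))
        worker.start()
        worker.join()
    finally:
        threading.stack_size(previous)  # restore for threads created later
    return results[0]


if __name__ == "__main__":
    print(run_with_stack(16 * 1024 * 1024, recursion_depth))
```

Threads created by the operating system or by other libraries never pass through this setting, which is the case the message is concerned about.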

Testing C API
by Serhiy Storchaka July 30, 2018

Currently the C API is not completely covered by tests. Tests for particular parts of the C API are scattered across different files. There are files dedicated entirely to testing the C API (like test_capi.py and test_getargs2.py), and there are classes (usually with "CAPI" in the name) in other files that test the C API specific to unicode or marshal. Argument parsing tests are split between two files, test_capi.py and test_getargs2.py.

I need to add new tests for new features, and I'm going to add new tests for the existing C API. But first I'm going to reorganize the tests: add a new directory, Lib/test/test_capi, and move all C API tests into it, grouped by function prefix: test_getargs.py for testing PyArg_*(), test_unicode.py for testing PyUnicode_*(), etc. Tests that use the _testcapi module but don't test a specific C API will be left in place.

The benefit is that it will be easier to run all C API tests at once, and only them, and it will be clearer which parts of the C API are covered by tests. The disadvantage is that you will need to run several files to test marshal, for example.

What are your thoughts?

Tests failing on Windows with TESTFN
by Tim Golden July 30, 2018

I'm just easing back into core development work by trying to get a stable testing environment for Python development on Windows. One problem is that certain tests use support.TESTFN (a local directory constructed from the pid) for output files etc. However this can cause issues on Windows when recreating the folders / files for multiple tests, especially when running in parallel.

Here's an example on my laptop deliberately running 3 tests with -j0 which I know will generate an error about one time in three:

    C:\work-in-progress\cpython>python -mtest -j0 test_urllib2 test_bz2 test_importlib
    Running Debug|Win32 interpreter...
    Run tests in parallel using 6 child processes
    0:00:23 [1/3/1] test_urllib2 failed
    test test_urllib2 failed -- Traceback (most recent call last):
      File "C:\work-in-progress\cpython\lib\test\test_urllib2.py", line 821, in test_file
        f = open(TESTFN, "wb")
    PermissionError: [Errno 13] Permission denied: '@test_15564_tmp'

Although these errors are both intermittent and fairly easily spotted, the effect is that I rarely get a clean test run when I'm applying a patch.

I started to address this years ago but things stalled. I'm happy to pick this up again and have another go, but I wanted to ask first whether there was any objection to my converting tests to using tempfile functions which should avoid the problem?

TJG
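A minimal sketch of what converting a test from a pid-based name to tempfile might look like. The test class and file name are hypothetical and not taken from the CPython test suite; the point is only that each test gets its own freshly created directory instead of reusing one shared name:

```python
import os
import tempfile
import unittest


class FileWriteTest(unittest.TestCase):
    """Hypothetical test that writes to a per-test temporary directory."""

    def setUp(self):
        # A new, uniquely named directory for every test method.
        self._tmpdir = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmpdir.cleanup)
        self.path = os.path.join(self._tmpdir.name, "data.bin")

    def test_file(self):
        with open(self.path, "wb") as f:
            f.write(b"payload")
        with open(self.path, "rb") as f:
            self.assertEqual(f.read(), b"payload")
```

Because the directory name is generated per test, two tests running in parallel never try to recreate the same path, whatever their process ids are.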

Error message for wrong number of arguments
by Jeroen Demeyer July 30, 2018

Hello,

I noticed an inconsistency in the error messages for the number of arguments to a method call. For Python methods, the "self" argument is counted. For built-in methods, the "self" argument is *not* counted:

>>> class mylist(list):
...     def append(self, val): super().append(val)

>>> f = list().append
>>> f(1,2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: append() takes exactly one argument (2 given)

>>> g = mylist().append
>>> g(1,2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: append() takes 2 positional arguments but 3 were given

I think it has been argued before that it's a feature that self is counted. So I consider the error message for list().append a bug. This is one of the many oddities I noticed while working on improving built-in functions.

Would you agree to change the error message for built-in methods to be closer to Python methods?

Jeroen.
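The comparison can also be run as a script rather than typed at the prompt. A small sketch; the helper name call_error is made up for illustration, and the exact wording of the messages may differ between Python versions, so no output is shown:

```python
class mylist(list):
    def append(self, val):
        super().append(val)


def call_error(func, *args):
    """Call func(*args) and return the TypeError message, or None."""
    try:
        func(*args)
    except TypeError as exc:
        return str(exc)
    return None


# Same two calls as in the message: one built-in method, one Python method.
builtin_msg = call_error(list().append, 1, 2)
python_msg = call_error(mylist().append, 1, 2)

print("built-in:", builtin_msg)
print("Python:  ", python_msg)
```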

Re: [Python-Dev] [PEP 576/580] Reserve one type slot for Cython
by Jeroen Demeyer July 30, 2018

On 2018-07-30 15:35, INADA Naoki wrote:
> As repeatedly said, PEP 580 is very complicated protocol
> when just implementing callable object.

Can you be more concrete what you find complicated? Maybe I can improve the PEP to explain it more. Also, I'm open to suggestions to make it less complicated.

> It is optimized for implementing custom method object, although
> almost only Cython want the custom method type.

For the record, Numba also seems interested in the PEP:
https://groups.google.com/a/continuum.io/d/msg/numba-users/2G6k2R92MIM/P-cF…

> I'm not sure adding such complicated protocol almost only for Cython.
> If CyFunction can be implemented behind PEP 576, it may be better.

I recall my post
https://mail.python.org/pipermail/python-dev/2018-July/154238.html
explaining the main difference between PEP 576 and PEP 580.

I would like to stress that PEP 580 was designed for maximum performance, both today and for future extensions (such as calling with native C types).

> * PEP 576 and 580 are not strictly mutually exclusive; PEP 576 may be
> accepted in addition to PEP 580

I don't think that this is a good idea: you will mostly end up with the disadvantages of both approaches.

Jeroen.

Fuzzing the Python standard library
by Jussi Judin July 30, 2018

Hi,

I have been fuzzing[1] various parts of the Python standard library for Python 3.7 with python-afl[2] to find internal implementation issues that exist in the library. What I have been looking for is mainly the following:

* Exceptions other than the documented ones. These usually indicate an internal implementation issue. For example, one would not expect a UnicodeDecodeError from the netrc.netrc() function when the documentation[3] promises netrc.NetrcParseError and there is no way to pass a properly sanitized file object to netrc.netrc().
* Differences between values returned by the C and Python versions of some functions. The quopri module may have these.
* Unexpected performance and memory allocation issues. These can be somewhat controversial to fix, if at all, but at least in some cases, from an end-user perspective, it can be really nasty if, for example, fractions.Fraction("1.64E6646466664") results in hundreds of megabytes of memory allocated and takes very long to calculate. I gave up waiting for that function call to finish after 5 minutes.

As this is going to result in a decent number of bug reports (currently I have only filed one[4], although that audio processing area has many more issues to file), I would like to ask your opinion on filing these bug reports. Should I report all issues regarding some specific module in one bug report, or try to further split them into more fine-grained reports that may be related? These different types of errors are specifically noticeable in the zipfile module, which includes a lot of different exception and behavioral types on invalid data <https://github.com/Barro/python-stdlib-fuzzers/tree/master/zipfile/crashes>. And in the case of the sndhdr module, there are multiple modules with issues (aifc, sunau, wave) that then also show up in sndhdr when they are used. Or are some of you willing to go through the crashes that pop up and help with the report filing?

The code and a more verbose description for this are available from <https://github.com/Barro/python-stdlib-fuzzers>. It works by default on some GNU/Linux systems only (I use Debian testing), as it relies on /dev/shm/ being available and uses shell scripts as wrappers that rely on various tools that may not be installed on all systems by default.

As a bonus, as this uses coverage based fuzzing, it also opens up the possibility of automatically creating a regression test suite for each of the fuzzed modules, to ensure that the existing functionality (input files under the <fuzz-target>/corpus/ directory) does not suddenly result in additional exceptions and that it is easier to test potential bug fixes (crash inducing files under the <fuzz-target>/crashes/ directory).

As a downside, this uses two quite specific tools (afl, python-afl) that have further dependencies (Cython) inside them, so I doubt the viability of integrating this type of testing as part of the normal Python verification process. Unlike the libFuzzer based fuzzing that is already integrated in Python[5], this instruments the actual (and only the) Python code and not the actions that the interpreter does in the background. So this should result in better fuzzer coverage for the Python code that is used, with the downside that when C functions are called, they are complete black boxes to the fuzzer.

I have mainly run these fuzzer instances for at most several hours per module with 4 instances, and stopped running no-issue modules once no new coverage had been discovered for more than 10 minutes. Also, I have not really created high quality initial input files, so I wouldn't be surprised if there are more issues lurking around that could be found by throwing more CPU and higher quality fuzzers at the problem.

[1]: https://en.wikipedia.org/wiki/Fuzzing
[2]: https://github.com/jwilk/python-afl
[3]: https://docs.python.org/3/library/netrc.html
[4]: https://bugs.python.org/issue34088
[5]: https://github.com/python/cpython/tree/3.7/Modules/_xxtestfuzz

--
Jussi Judin
https://jjudin.iki.fi/
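The first kind of finding described in this message (an exception other than the documented one) can be checked with a small harness. A sketch using only the standard library; the function name check_netrc is hypothetical and this is not the harness from the python-stdlib-fuzzers repository. A fuzzer such as python-afl would supply the input bytes; here a few samples are passed in directly, and the result for the last one may vary by Python version and locale:

```python
import netrc
import os
import tempfile


def check_netrc(data):
    """Parse data as a netrc file.

    Returns None if parsing succeeds or raises the documented
    NetrcParseError; otherwise returns the unexpected exception.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "netrc")
        with open(path, "wb") as f:
            f.write(data)
        try:
            netrc.netrc(path)
        except netrc.NetrcParseError:
            return None  # documented failure mode
        except Exception as exc:
            return exc  # candidate finding for a bug report
    return None


samples = (
    b"machine example.com login alice password secret\n",  # valid entry
    b"bogus\n",                                             # bad toplevel token
    b"\xff\xfe machine\n",                                  # not valid UTF-8
)
for sample in samples:
    print(sample, "->", repr(check_netrc(sample)))
```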

[PEP 576/580] Reserve one type slot for Cython
by INADA Naoki July 30, 2018

As repeatedly said, PEP 580 is very complicated protocol when just implementing callable object.

It is optimized for implementing custom method object, although almost only Cython want the custom method type.

I'm not sure adding such complicated protocol almost only for Cython. If CyFunction can be implemented behind PEP 576, it may be better.

On the other hand, most complexity of PEP 580 is not newly added. Most of them are in PyCFunction, method_descriptor, and some calling APIs already. PEP 580 just restructure them completely to be reusable from Cython.

So I agree that PEP 580 is better when thinking from Cython's side.

---

I'm not sure which way we should go yet. But my current idea is:

* Implement PEP 580 as semi-public APIs only for tools like Cython.
* Other Python implementation may not support it in foreseeable future. So such tools should support legacy implementation too.
* PEP 576 and 580 are not strictly mutually exclusive; PEP 576 may be accepted in addition to PEP 580, for simpler FASTCALL-able object support. Especially for extension author prefer C to Cython (including stdlib).
* If this happened, PEP 580 can remove one abstraction; tp_ccalloffset is offset of PyCCallRoot instead of pointer to it. Py_TPFLAGS_FUNCTION_DESCRIPTOR will be removed from PEP 576 too.

Regards,

--
INADA Naoki <songofacandy(a)gmail.com>
