From michael.klemm at intel.com Sun Sep 2 07:41:14 2018
From: michael.klemm at intel.com (Klemm, Michael)
Date: Sun, 2 Sep 2018 11:41:14 +0000
Subject: [Cython] OpenMP 4.5 array reductions
In-Reply-To:
References:
Message-ID: <0DAB4B4FC42EAA41802458ADA9C2F8247E080375@IRSMSX104.ger.corp.intel.com>

Hi,

You could solve this by specializing the code according to the version of the OpenMP specification supported:

#if _OPENMP >= 201511
// OpenMP 4.5 and newer
#else
// features up to OpenMP 3.1
#endif

It's more work in the compiler, but the auto-generated code does not have to be pretty.

Kind regards,
-michael

From: cython-devel [mailto:cython-devel-bounces+michael.klemm=intel.com at python.org] On Behalf Of Nathan Goldbaum
Sent: Friday, August 31, 2018 4:51 PM
To: Core developer mailing list of the Cython compiler
Subject: [Cython] OpenMP 4.5 array reductions

Hi all,

I'm curious if there would be any interest in adding support for OpenMP 4.5 array reduction in the cython compiler or alternatively detecting these cases and raising a cython compiler error. Currently cython is generating code that will compile but might lead to race conditions.

See:

https://github.com/cython/cython/issues/2316
https://github.com/cython/cython/issues/1504

The trouble with fixing this in the cython compiler is that adding the appropriate OpenMP pragmas might generate code that will no longer compile on compilers that don't support OpenMP 4.5. However perhaps that's a better alternative than the status quo, which is generating code that might produce random results.

I'd very much appreciate any feedback or advice here as this is currently blocking our ability to easily add OpenMP to our cython code in places where we'd like threads to do parallel reductions on large arrays. I would also not be surprised if there is code in the wild that is racy and silently producing incorrect results.
-Nathan

Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Christian Lamprechter
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

From J.Demeyer at UGent.be Thu Sep 6 16:54:52 2018
From: J.Demeyer at UGent.be (Jeroen Demeyer)
Date: Thu, 6 Sep 2018 22:54:52 +0200
Subject: [Cython] Hooking tp_clear()
Message-ID: <5B91941C.1080504@UGent.be>

Hello,

Cython's __dealloc__ special method is meant to deal with cleaning up instances of cdef classes. However, this hooks tp_dealloc() and does not have meaningful access to Python attributes, since those might have been cleared by tp_clear().

I have a concrete use case where I want something like __dealloc__ but *before* Python attributes are cleared. So this really belongs in tp_clear().

Using a PyObject* attribute in the cdef class with manual reference counting is not a solution since this attribute could genuinely occur in a reference cycle.

So I would suggest to support a __clear__ special method, which would then be called both by tp_clear() and tp_dealloc(). It's important to note that this should be idempotent: it will be called at least once before Python attributes are cleared but it may also be called later.

PS: I never really understood the technical difference between tp_clear() and tp_dealloc(). It seems to me that these serve a very similar purpose: why can't the garbage collector just call tp_dealloc()?

Jeroen.
From stefan_ml at behnel.de Fri Sep 7 00:35:07 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 7 Sep 2018 06:35:07 +0200 Subject: [Cython] Hooking tp_clear() In-Reply-To: <5B91941C.1080504@UGent.be> References: <5B91941C.1080504@UGent.be> Message-ID: <7c079bb1-d5c0-271e-7a26-8b18ecb9bcd9@behnel.de> Jeroen Demeyer schrieb am 06.09.2018 um 22:54: > Cython's __dealloc__ special method is meant to deal with cleaning up > instances of cdef classes. However, this hooks tp_dealloc() and does not > have meaningful access to Python attributes, since those might have been > cleared by tp_clear(). > > I have a concrete use case where I want something like __dealloc__ but > *before* Python attributes are cleared. So this really belongs in tp_clear(). > > Using a PyObject* attribute in the cdef class with manual reference > counting is not a solution since this attribute could genuinely occur in a > reference cycle. > > So I would suggest to support a __clear__ special method, which would then > be called both by tp_clear() and tp_dealloc(). It's important to note that > this should be idempotent: it will be called at least once before Python > attributes are cleared but it may also be called later. Maybe you actually want "tp_finalize"? https://www.python.org/dev/peps/pep-0442/ Cython moves "__del__" methods there in Py3.4+. > PS: I never really understood the technical difference between tp_clear() > and tp_dealloc(). It seems to me that these serve a very similar purpose: > why can't the garbage collector just call tp_dealloc()? The problem are reference cycles, in which there definitely is a life reference to the object *somewhere* else. Thus, the GC cannot simply deallocate the object, it must try to delete the references instead. This is what "tp_clear" is used for, it clears all references that an object inside of a reference cycle has towards other objects (or at least those that can participate in that cycle). 
This will (hopefully) trigger a cascade of deallocations along the cycle. If that isn't enough, and there is still a cycle, then the clearing needs to be repeated until all references to the last object in the cycle are cleared. AFAIR, tp_clear() is *only* called by the cyclic garbage collector and not during normal refcounting deallocation. The GC process is: tp_visit() to detect cycles, tp_clear() to break them. tp_dealloc() is then only called indirectly by the normal refcounting cleanup, not directly by the GC. Stefan From greg.ewing at canterbury.ac.nz Fri Sep 7 01:59:40 2018 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Sep 2018 17:59:40 +1200 Subject: [Cython] Hooking tp_clear() In-Reply-To: <5B91941C.1080504@UGent.be> References: <5B91941C.1080504@UGent.be> Message-ID: <5B9213CC.5080301@canterbury.ac.nz> Jeroen Demeyer wrote: > I have a concrete use case where I want something like __dealloc__ but > *before* Python attributes are cleared. So this really belongs in tp_clear(). Are you sure you can't do it in __del__? From what I gather, the presence of __del__ no longer prevents cyclic garbage collection. > I never really understood the technical difference between > tp_clear() and tp_dealloc(). It seems to me that these serve a very > similar purpose: why can't the garbage collector just call tp_dealloc()? tp_dealloc is the inverse of tp_alloc -- its purpose is to free the memory occupied by the object. This must not be done until there are no more references to the object. tp_clear is used to break reference cycles. After calling it, there may still be references to the object from other objects in the cycle, so tp_dealloc can't be done at that point. 
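
The distinction can be observed from plain Python on CPython 3.4 or later, where PEP 442 applies. What follows is only a minimal sketch and not Cython code: the Node class and its log argument are made up for illustration. It builds a two-object reference cycle, shows that dropping the last outside references does not free anything by itself, and then lets the cyclic collector run the finalizers while the attributes are still set.

```python
import gc


class Node:
    """Illustrative plain Python class with a finalizer."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.other = None

    def __del__(self):
        # Record whether the attribute is still intact when the finalizer runs.
        self.log.append((self.name, self.other is not None))


gc.disable()            # keep automatic collection from interfering
finalized = []

a = Node("a", finalized)
b = Node("b", finalized)
a.other = b             # a -> b
b.other = a             # b -> a: a reference cycle
del a, b                # refcounts stay above zero, so nothing is freed yet
before = list(finalized)

gc.collect()            # the cycle is detected, finalized, then cleared
after = sorted(finalized)
gc.enable()

print(before)
print(after)
```

On CPython 3.4+ the first list should be empty and the second should contain both nodes with the flag set to True, i.e. each __del__ ran before the references inside the cycle were cleared.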
--
Greg

From J.Demeyer at UGent.be Fri Sep 7 04:14:32 2018
From: J.Demeyer at UGent.be (Jeroen Demeyer)
Date: Fri, 7 Sep 2018 10:14:32 +0200
Subject: [Cython] Hooking tp_clear()
In-Reply-To: <3bd0af8922924b0aa442aef92f886644@xmail103.UGent.be>
References: <5B91941C.1080504@UGent.be> <3bd0af8922924b0aa442aef92f886644@xmail103.UGent.be>
Message-ID: <5B923368.8020605@UGent.be>

On 2018-09-07 06:35, Stefan Behnel wrote:
> Maybe you actually want "tp_finalize"?
>
> https://www.python.org/dev/peps/pep-0442/
>
> Cython moves "__del__" methods there in Py3.4+.

First of all, are you sure? I tried to cythonize

cdef class X:
    def __del__(self):
        pass

but the generated C code has

static PyTypeObject __pyx_type_6cytest_X = {
  /* ... */
  #if PY_VERSION_HEX >= 0x030400a1
  0, /*tp_finalize*/
  #endif
};

In any case, I want something which works with Python 2.7 too. Could we somehow call __del__ from tp_clear() in Python 2.7?

Jeroen.

From stefan_ml at behnel.de Fri Sep 7 12:31:05 2018
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 7 Sep 2018 18:31:05 +0200
Subject: [Cython] Hooking tp_clear()
In-Reply-To: <5B923368.8020605@UGent.be>
References: <5B91941C.1080504@UGent.be> <3bd0af8922924b0aa442aef92f886644@xmail103.UGent.be> <5B923368.8020605@UGent.be>
Message-ID: <3ba71c27-21ed-2b31-3fd2-52ca51a6e591@behnel.de>

Jeroen Demeyer schrieb am 07.09.2018 um 10:14:
> On 2018-09-07 06:35, Stefan Behnel wrote:
>> Maybe you actually want "tp_finalize"?
>>
>> https://www.python.org/dev/peps/pep-0442/
>>
>> Cython moves "__del__" methods there in Py3.4+.
>
> First of all, are you sure? I tried to cythonize
>
> cdef class X:
>     def __del__(self):
>         pass
>
> but the generated C code has
>
> static PyTypeObject __pyx_type_6cytest_X = {
>   /* ... */
>   #if PY_VERSION_HEX >= 0x030400a1
>   0, /*tp_finalize*/
>   #endif
> };

Interesting. I thought it did, but apparently, it doesn't. "__del__" doesn't currently have any special meaning for extension types.
Then let's change that and give it the same meaning that PEP-442 gives it for Python classes. Want to write a PR?

> In any case, I want something which works with Python 2.7 too. Could we
> somehow call __del__ from tp_clear() in Python 2.7?

That would be problematic, because tp_clear() is not guaranteed to be called. Quite the contrary, it's only called if the object is so unlucky to participate in a reference cycle that gets cleaned up. In that case, we would have to call it twice (in tp_clear() and tp_dealloc()) in order to make sure that it always gets called. In Py3.4+, it always gets called separately before tp_clear() or tp_dealloc().

Can't you just wait a year or two? ;)

Or, since it's a new feature, we could just document that it gets called twice in Py2 and that people should take care of writing their code accordingly.

OTOH, the way PEP 442 guarantees that it only gets called once is by setting a flag in the GC header. We could see if we can steal an unused flag there to do the same thing in Py2.

Stefan

From stefan_ml at behnel.de Sun Sep 16 11:21:08 2018
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 16 Sep 2018 17:21:08 +0200
Subject: [Cython] Cython 3.0 and "unicode_literals"
In-Reply-To: <5B7D44A3.6040404@UGent.be>
References: <9f1ab4d53359429596ca9eb7c28e5f08@xmail103.UGent.be> <5B748FB8.4020009@UGent.be> <0bf385ab9c2242ecad5ab2af6c9f778b@xmail103.UGent.be> <5B77D7CD.6000008@UGent.be> <9d39d9ade7c24c008d1a370a01a669ac@xmail103.UGent.be> <5B77EF97.5070109@UGent.be> <65306f30381646b680ff75b862d08345@xmail103.UGent.be> <5B786F52.4020207@UGent.be> <5B7D44A3.6040404@UGent.be>
Message-ID: <7f4db7b6-734b-8de0-8e2e-ae716bd5b94e@behnel.de>

Jeroen Demeyer schrieb am 22.08.2018 um 13:10:
> On 2018-08-19 08:26, Stefan Behnel wrote:
>> Should we make that a new directive rather than a language level? Like
>> "py2_str=str"? That would allow its use together with language_level=3
>> already in the next release.
> > With a new new directive, you also run into compatibility problems. What > should the default be? py2_str=str or py2_str=unicode? The former breaks > code assuming that it's unicode and the latter doesn't really solve > anything: stuff will still break when language_level=3 becomes the default. > > My proposal is a new setting language_level=3str (meaning: everything that > language_level=3 does, except unicode_literals) and make that the default. > That way, you keep full compatibility with code already setting the > language_level. You also have reasonably good chances that code that > currently uses the implicit language_level=2 will continue to work with > language_level=3str. I thought about this some more and I like the idea. There's still the true-division issue, but strings are certainly the biggest blocker in migrations. As long as people want to support Python 2.x, "3str" is a way to help them with it, and for Py3-only users, it won't make a difference. Stefan From stefan_ml at behnel.de Sun Sep 16 11:48:22 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 16 Sep 2018 17:48:22 +0200 Subject: [Cython] Cython 0.29 beta 1 released Message-ID: <14d2b9ca-2233-89b9-8c55-5a67355c65bc@behnel.de> Hi all, after half a year of development, many community pull requests and a lot of feedback and good ideas in online discussions and at conferences, I'm proud to release the first beta of Cython 0.29. This is a major feature release that comes with many great improvements and several important bug fixes. See the long list of changes below. Please give it some testing to help us quickly advance to the final release. 
Download: https://github.com/cython/cython/releases/tag/0.29b1 Changelog: https://github.com/cython/cython/blob/0dcb5d1930e573caa8494fe838c4c2cd4e2041f2/CHANGES.rst Foresight: Given that Cython has been in critical production use all over the world for several years, but never found the perfect time for a 1.0 version bump, we designate this to become the last 0.x release of the project and decided to skip the 1.0 release which the 0.x series has long represented anyway. Planning has already started [1] for the next major release, titled "3.0". It will finally switch the default Cython language level from Py2 to Py3, to match what users expect from a Python compiler these days without additional options or configuration. Cython 2 code will continue to be supported as before with the directive "language_level=2", although there are ideas how to help with the modernisation. We're happy to hear your feedback. [1] https://github.com/cython/cython/milestone/58 Have fun, Stefan 0.29 beta 1 (2018-09-16) ======================== Features added -------------- * PEP-489 multi-phase module initialisation has been enabled again. Module reloads raise an exception to prevent corruption of the static module state. * A set of ``mypy`` compatible PEP-484 declarations were added for Cython's C data types to integrate with static analysers in typed Python code. They are available in the ``Cython/Shadow.pyi`` module and describe the types in the special ``cython`` module that can be used for typing in Python code. Original patch by Julian Gethmann. (Github issue #1965) * Memoryviews are supported in PEP-489 style type declarations. (Github issue #2529) * Raising exceptions from nogil code will automatically acquire the GIL, instead of requiring an explicit ``with gil`` block. * ``@cython.nogil`` is supported as a C-function decorator in Python code. 
(Github issue #2557) * ``cython.inline()`` supports a direct ``language_level`` keyword argument that was previously only available via a directive. * In CPython 3.6 and later, looking up globals in the module dict is almost as fast as looking up C globals. (Github issue #2313) * For a Python subclass of an extension type, repeated method calls to non-overridden cpdef methods can avoid the attribute lookup in Py3.6+, which makes them 4x faster. (Github issue #2313) * (In-)equality comparisons of objects to integer literals are faster. (Github issue #2188) * Some internal and 1-argument method calls are faster. * Modules that cimport many external extension types from other Cython modules execute less import requests during module initialisation. * The coverage plugin considers more C file extensions such as ``.cc`` and ``.cxx``. (Github issue #2266) * The ``cythonize`` command accepts compile time variable values (as set by ``DEF``) through the new ``-E`` option. Patch by Jerome Kieffer. (Github issue #2315) * ``pyximport`` can import from namespace packages. Patch by Prakhar Goel. (Github issue #2294) * Some missing numpy and CPython C-API declarations were added. Patch by John Kirkham. (Github issues #2523, #2520, #2537) * Declarations for the ``pylifecycle`` C-API functions were added in a new .pxd file ``cpython.pylifecycle``. * The Pythran support was updated to work with the latest Pythran 0.8.7. Original patch by Adrien Guinet. (Github issue #2600) * ``%a`` is included in the string formatting types that are optimised into f-strings. In this case, it is also automatically mapped to ``%r`` in Python 2.x. * New C macro ``CYTHON_HEX_VERSION`` to access Cython's version in the same style as ``PY_HEX_VERSION``. Bugs fixed ---------- * The exception handling in generators and coroutines under CPython 3.7 was adapted to the newly introduced exception stack. 
Users of Cython 0.28 who want to support Python 3.7 are encouraged to upgrade to 0.29 to avoid potentially incorrect error reporting and tracebacks. * Crash when importing a module under Stackless Python that was built for CPython. Patch by Anselm Kruis. (Github issue #2534) * 2-value slicing of typed sequences failed if the start or stop index was None. Patch by Christian Gibson. (Github issue #2508) * Multiplied string literals lost their factor when they are part of another constant expression (e.g. 'x' * 10 + 'y' => 'xy'). * String formatting with the '%' operator didn't call the special ``__rmod__()`` method if the right side is a string subclass that implements it. (Python issue 28598) * The directive ``language_level=3`` did not apply to the first token in the source file. (Github issue #2230) * Overriding cpdef methods did not work in Python subclasses with slots. Note that this can have a performance impact on calls from Cython code. (Github issue #1771) * Fix declarations of builtin or C types using strings in pure python mode. (Github issue #2046) * Several internal function signatures were fixed that lead to warnings in gcc-8. (Github issue #2363) * The numpy helper functions ``set_array_base()`` and ``get_array_base()`` were adapted to the current numpy C-API recommendations. Patch by Matti Picus. (Github issue #2528) * Some NumPy related code was updated to avoid deprecated API usage. Original patch by jbrockmendel. (Github issue #2559) * Several C++ STL declarations were extended and corrected. Patch by Valentin Valls. (Github issue #2207) * C lines of the module init function were unconditionally not reported in exception stack traces. Patch by Jeroen Demeyer. (Github issue #2492) * When PEP-489 support is enabled, reloading the module overwrote any static module state. It now raises an exception instead, given that reloading is not actually supported. 
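
Two of the fixes listed above restore plain Python semantics, which can be checked in CPython itself. The following is a small sketch under stated assumptions: the Tag class is invented for illustration, and the __rmod__ behaviour assumes a CPython version that includes the fix for Python issue 28598.

```python
# Constant folding must keep the repetition factor of a multiplied literal.
folded = 'x' * 10 + 'y'


class Tag(str):
    """Hypothetical str subclass that takes over '%' formatting."""

    def __rmod__(self, other):
        return "Tag.__rmod__ called"


# The right operand is an instance of a subclass of the left operand's type
# and overrides the reflected method, so __rmod__ is tried before str.__mod__.
formatted = "%s" % Tag("value")

print(folded)
print(formatted)
```

Compiled code is expected to agree with the interpreter on both values.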
Other changes ------------- * Cython now emits a warning when no ``language_level`` (2 or 3) is set explicitly, neither as a ``cythonize()`` option nor as a compiler directive. This is meant to prepare the transition of the default language level from currently Py2 to Py3, since that is what most new users will expect these days. The next major release is intended to make that change, so that it will parse all code that does not request a specific language level as Python 3 code. The language level 2 will continue to be supported for an indefinite time. * The documentation was restructured, cleaned up and examples are now tested. The NumPy tutorial was also rewritten to simplify the running example. Contributed by Gabriel de Marmiesse. (Github issue #2245) * Cython compiles less of its own modules at build time to reduce the installed package size to about half of its previous size. This makes the compiler slightly slower, by about 5-7%. From robertwb at gmail.com Mon Sep 17 03:09:06 2018 From: robertwb at gmail.com (Robert Bradshaw) Date: Mon, 17 Sep 2018 09:09:06 +0200 Subject: [Cython] Cython 3.0 and "unicode_literals" In-Reply-To: <7f4db7b6-734b-8de0-8e2e-ae716bd5b94e@behnel.de> References: <9f1ab4d53359429596ca9eb7c28e5f08@xmail103.UGent.be> <5B748FB8.4020009@UGent.be> <0bf385ab9c2242ecad5ab2af6c9f778b@xmail103.UGent.be> <5B77D7CD.6000008@UGent.be> <9d39d9ade7c24c008d1a370a01a669ac@xmail103.UGent.be> <5B77EF97.5070109@UGent.be> <65306f30381646b680ff75b862d08345@xmail103.UGent.be> <5B786F52.4020207@UGent.be> <5B7D44A3.6040404@UGent.be> <7f4db7b6-734b-8de0-8e2e-ae716bd5b94e@behnel.de> Message-ID: On Sun, Sep 16, 2018 at 5:24 PM Stefan Behnel wrote: > Jeroen Demeyer schrieb am 22.08.2018 um 13:10: > > On 2018-08-19 08:26, Stefan Behnel wrote: > >> Should we make that a new directive rather than a language level? Like > >> "py2_str=str"? That would allow its use together with language_level=3 > >> already in the next release. 
> > > > With a new new directive, you also run into compatibility problems. What > > should the default be? py2_str=str or py2_str=unicode? The former breaks > > code assuming that it's unicode and the latter doesn't really solve > > anything: stuff will still break when language_level=3 becomes the > default. > > > > My proposal is a new setting language_level=3str (meaning: everything > that > > language_level=3 does, except unicode_literals) and make that the > default. > > That way, you keep full compatibility with code already setting the > > language_level. You also have reasonably good chances that code that > > currently uses the implicit language_level=2 will continue to work with > > language_level=3str. > > I thought about this some more and I like the idea. There's still the > true-division issue, but strings are certainly the biggest blocker in > migrations. As long as people want to support Python 2.x, "3str" is a way > to help them with it, and for Py3-only users, it won't make a difference. > Yes, this does seem worth handling differently. I've also been wondering if the c_string_type directive would work for this. (Unfortunately, we let c_string_type=str mean c_string_type=bytes, where a more natural interpretation for c_string_type=str would be "bytes in py2, unicode in py3." Would it be good enough to disallow str for a release before going to 3.0? (Or would a warning be good enough, given that Cython 3.0 is backwards incompatible?) -------------- next part -------------- An HTML attachment was scrubbed... URL: From robertwb at gmail.com Mon Sep 17 09:44:42 2018 From: robertwb at gmail.com (Robert Bradshaw) Date: Mon, 17 Sep 2018 15:44:42 +0200 Subject: [Cython] Safer exception handling Message-ID: One of the pain points in Cython is that one must explicitly annotate non-object returning functions with except clauses. Would it be worth trying to change the default here, making exception-suppressing opt-in rather than opt-out? 
There are a couple of open questions, e.g. * What would the syntax be? * What about extern functions? * What would the performance impact be? Could it be mitigated? -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Mon Sep 17 16:07:23 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 17 Sep 2018 22:07:23 +0200 Subject: [Cython] Safer exception handling In-Reply-To: References: Message-ID: <28a35f07-4442-f0fa-e53e-a70cc79594f0@behnel.de> Robert Bradshaw schrieb am 17.09.2018 um 15:44: > One of the pain points in Cython is that one must explicitly annotate > non-object returning functions with except clauses. Would it be worth > trying to change the default here, making exception-suppressing opt-in > rather than opt-out? > > There are a couple of open questions, e.g. > * What would the syntax be? > * What about extern functions? > * What would the performance impact be? Could it be mitigated? Probably a good idea, and worth a 3.0 ticket. Note that the default for return type annotations is to automatically propagate exceptions, in order to keep the drop from Python semantics gentle. @cython.cfunc def func() -> cython.int: ... is read as "except? -1", unless declared otherwise. Stefan From J.Demeyer at UGent.be Tue Sep 18 03:43:30 2018 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Tue, 18 Sep 2018 09:43:30 +0200 Subject: [Cython] Safer exception handling In-Reply-To: <3387185457674aa3b83b4135e4a655a1@xmail103.UGent.be> References: <3387185457674aa3b83b4135e4a655a1@xmail103.UGent.be> Message-ID: <5BA0ACA2.1010607@UGent.be> On 2018-09-17 15:44, Robert Bradshaw wrote: > One of the pain points in Cython is that one must explicitly annotate > non-object returning functions with except clauses. Would it be worth > trying to change the default here, making exception-suppressing opt-in > rather than opt-out? > > There are a couple of open questions, e.g. > * What would the syntax be? 
> * What about extern functions? > * What would the performance impact be? Could it be mitigated? Fourth pain point: function pointers. Typically, those are used for pure C code where you don't want exception checking. And since this is opened for discussion, allow me to repeat a related request that I made some years ago but which was rejected: have a way to say "this function does not return anything, but it technically returns an int just for exception handling. 0 is returned on success and -1 on error". There are quite a few Python/C API functions like that and I also commonly write such functions myself. Of course you can do cdef int foo(...) except -1 but this is awkward since the "int" is misleading. It is also wrong for cpdef functions where the function should return None but it returns 0 instead. The syntax I suggested at the time was something like cdef nint foo(...) where nint is like int except that it converts to/from None, just like bint converts to/from bool. Jeroen. From J.Demeyer at UGent.be Tue Sep 18 04:12:28 2018 From: J.Demeyer at UGent.be (Jeroen Demeyer) Date: Tue, 18 Sep 2018 10:12:28 +0200 Subject: [Cython] Safer exception handling In-Reply-To: <3387185457674aa3b83b4135e4a655a1@xmail103.UGent.be> References: <3387185457674aa3b83b4135e4a655a1@xmail103.UGent.be> Message-ID: <5BA0B36C.3030200@UGent.be> On 2018-09-17 15:44, Robert Bradshaw wrote: > One of the pain points in Cython is that one must explicitly annotate > non-object returning functions with except clauses. Would it be worth > trying to change the default here, making exception-suppressing opt-in > rather than opt-out? Just to clarify things: are you proposing that the default would be "except *"? An alternative would be to give *warnings* for functions where exceptions could occur but could not be propagated. 
For example, this function is totally fine: cdef int foo(int x): return x but this function would give a warning: cdef int foo(x): return x # implicit conversion Python -> int and this should probably be a compile-time error: cdef int foo(x): raise NotImplementedError From robertwb at gmail.com Wed Sep 19 01:58:27 2018 From: robertwb at gmail.com (Robert Bradshaw) Date: Wed, 19 Sep 2018 07:58:27 +0200 Subject: [Cython] Safer exception handling In-Reply-To: <5BA0B36C.3030200@UGent.be> References: <3387185457674aa3b83b4135e4a655a1@xmail103.UGent.be> <5BA0B36C.3030200@UGent.be> Message-ID: On Tue, Sep 18, 2018, 10:12 AM Jeroen Demeyer wrote: > On 2018-09-17 15:44, Robert Bradshaw wrote: > > One of the pain points in Cython is that one must explicitly annotate > > non-object returning functions with except clauses. Would it be worth > > trying to change the default here, making exception-suppressing opt-in > > rather than opt-out? > > Just to clarify things: are you proposing that the default would be > "except *"? > We may choose an implicit default exception value. An alternative would be to give *warnings* for functions where > exceptions could occur but could not be propagated. For example, this > function is totally fine: > > cdef int foo(int x): > return x > > but this function would give a warning: > > cdef int foo(x): > return x # implicit conversion Python -> int > Given that essentially every Python operation can raise exceptions, I don't know how useful this warning would be. Unless we expect everyone to change their code. and this should probably be a compile-time error: > > cdef int foo(x): > raise NotImplementedError > > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > https://mail.python.org/mailman/listinfo/cython-devel > -------------- next part -------------- An HTML attachment was scrubbed... 
From stefan_ml at behnel.de Wed Sep 19 11:00:33 2018
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 19 Sep 2018 17:00:33 +0200
Subject: [Cython] Safer exception handling
In-Reply-To:
References:
Message-ID: <15f036ae-09d2-2dd8-bee9-05fed3ccb605@behnel.de>

Robert Bradshaw schrieb am 17.09.2018 um 15:44:
> One of the pain points in Cython is that one must explicitly annotate
> non-object returning functions with except clauses. Would it be worth
> trying to change the default here, making exception-suppressing opt-in
> rather than opt-out?
>
> There are a couple of open questions, e.g.
> * What would the syntax be?
> * What about extern functions?
> * What would the performance impact be? Could it be mitigated?

One more point: what about nogil functions? They can, in theory, raise exceptions, but they almost never do in practice.

Given that the default for return type annotations is exception propagation (i.e. Python semantics) now, mixing that into a nogil function means that the caller would almost always check for exceptions needlessly. And we do not currently have a way to say "no exception return value" to get rid of that check. That's something we'd need, especially if we change the default.

Stefan

From stefan_ml at behnel.de Thu Sep 20 01:55:57 2018
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 20 Sep 2018 07:55:57 +0200
Subject: [Cython] Preparing the language level change - Re: [cython-users] Cython 0.29 beta 1 released
In-Reply-To: <5BA0B603.5010501@UGent.be>
References: <5BA0B603.5010501@UGent.be>
Message-ID:

Jeroen Demeyer schrieb am 18.09.2018 um 10:23:
> On 2018-09-16 17:48, Stefan Behnel wrote:
>> * Cython now emits a warning when no ``language_level`` (2 or 3) is set
>>   explicitly
>
> Currently, language_level=3 breaks a lot of code due to unicode_literals.
> As I mentioned a few times, it would be good to have a way to specify
> "language_level=3 except for unicode_literals".
> Ideally, this option should exist before 0.29 is released.

I can see that this would be helpful. It's unfortunate, though, that this would introduce a temporary option that complicates the current integer "language_level". Meaning, we'd introduce something now that we then can't get rid of forever.

And it seems to me like a separate directive wouldn't solve this either because it's not really orthogonal to the language level.

Any idea?

Stefan

From robertwb at gmail.com Thu Sep 20 02:17:13 2018
From: robertwb at gmail.com (Robert Bradshaw)
Date: Thu, 20 Sep 2018 08:17:13 +0200
Subject: [Cython] Safer exception handling
In-Reply-To: <15f036ae-09d2-2dd8-bee9-05fed3ccb605@behnel.de>
References: <15f036ae-09d2-2dd8-bee9-05fed3ccb605@behnel.de>
Message-ID:

On Wed, Sep 19, 2018, 5:04 PM Stefan Behnel wrote:

> Robert Bradshaw schrieb am 17.09.2018 um 15:44:
> > One of the pain points in Cython is that one must explicitly annotate
> > non-object returning functions with except clauses. Would it be worth
> > trying to change the default here, making exception-suppressing opt-in
> > rather than opt-out?
> >
> > There are a couple of open questions, e.g.
> > * What would the syntax be?
> > * What about extern functions?
> > * What would the performance impact be? Could it be mitigated?
>
> One more point: what about nogil functions? They can, in theory, raise
> exceptions, but they almost never do in practice.
>

I think this feels like it should be orthogonal. I wonder what the overhead of this check really is in practice...

> Given that the default for return type annotations is exception propagation
> (i.e. Python semantics) now, mixing that into a nogil function means that
> the caller would almost always check for exceptions needlessly. And we do
> not currently have a way to say "no exception return value" to get rid of
> that check. That's something we'd need, especially if we change the
> default.
>

Yeah, that's what I meant about the syntax question.
> Stefan From stefan_ml at behnel.de Fri Sep 21 03:38:32 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 21 Sep 2018 09:38:32 +0200 Subject: [Cython] Preparing the language level change - Re: [cython-users] Cython 0.29 beta 1 released In-Reply-To: <5BA36167.4050008@UGent.be> References: <5BA0B603.5010501@UGent.be> <5BA36167.4050008@UGent.be> Message-ID: Jeroen Demeyer schrieb am 20.09.2018 um 10:59: > On 2018-09-20 10:15, Jeroen Demeyer wrote: >> On 2018-09-20 07:55, Stefan Behnel wrote: >>> I can see that this would be helpful. It's unfortunate, though, that this >>> would introduce a temporary option that complicates the current integer >>> "language_level". >> >> Does it really have to be an integer? We can make it a string >> internally, but allow it to be set as integer too. That's just a matter >> of putting "language_level = str(language_level)" in the appropriate places. That's pretty much what I meant with "complicates". :) It's very easy to compare "language_level < 3" now. With strings, not so much. In fact, maybe that's not even the way to go. There are two parts of information here, so maybe we should actually split them internally (in "Main.Context.set_language_level() ?) and keep the language_level = 3 but just avoid the "unicode_literals" part. >> Once it's a string, we can add an additional language level "3str". I'm >> willing to make a PR if you agree with this strategy. I think a PR that takes care of splitting the two parts would be worth looking at. Then, make sure we do the right checks for either of them in the right places. The language level isn't always the right indication for specific behaviour. 
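To make the idea of splitting the two pieces of information concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not Cython's actual implementation: the function name `split_language_level` and the returned tuple are hypothetical, and "3str" is simply the level name suggested in the quoted text above.

```python
def split_language_level(level):
    """Split a user-supplied language level into (numeric_level, unicode_literals).

    Hypothetical helper for illustration only; not part of Cython's API.
    """
    text = str(level)  # accept the integers 2/3 as well as "2", "3", "3str"
    if text == "3str":
        # Py3 semantics, but unprefixed string literals stay 'str'.
        return 3, False
    numeric = int(text)  # raises ValueError for anything non-numeric
    if numeric not in (2, 3):
        raise ValueError("unsupported language level: %r" % (level,))
    # Only the plain level 3 implies unicode literals.
    return numeric, numeric == 3


if __name__ == "__main__":
    for value in (2, 3, "3", "3str"):
        print(value, "->", split_language_level(value))
```

With the two parts kept apart like this, numeric checks such as "numeric_level < 3" stay as simple as they are with the integer setting.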
> Further brainstorming: you could also re-purpose language_level as a set of > flags describing various features, like the __future__ flags. I faintly remember proposals for a "from __past__ import ...". One more use case right here. But the problem is that there are really different things to support. One is setting the behaviour globally, e.g. via setup.py. The other is setting it on a file-by-file basis. I expect projects to settle on one way to write code, so a global setting is definitely required, and that's the language level. But then, there are probably cases where this is worth overriding for a single file or two. And the language level doesn't seem right for that. Explicitly opting out of "unicode_literals" would be cleaner. So, maybe just allow "cython: unicode_literals=False" ? Stefan From stefan_ml at behnel.de Mon Sep 24 16:05:37 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 24 Sep 2018 22:05:37 +0200 Subject: [Cython] Preparing the language level change - Re: [cython-users] Cython 0.29 beta 1 released In-Reply-To: <1165e9dc-f654-4c8d-0804-9a4ccf78aa07@behnel.de> References: <5BA0B603.5010501@UGent.be> <5BA36167.4050008@UGent.be> <1165e9dc-f654-4c8d-0804-9a4ccf78aa07@behnel.de> Message-ID: <652360e2-a5df-0aaa-547c-f61c19bf61f1@behnel.de> Stefan Behnel schrieb am 21.09.2018 um 17:46: > Robert Bradshaw schrieb am 21.09.2018 um 17:30: >> I agree that this doesn't really feel like a language-level thing. >> >> There seem to be three desired behaviors here: >> >> language_level=2 where currently "abc" is always a bytes object >> language_level=3 where currently "abc" is always a unicode object >> >> and a third option, where "abc" is a str object (depending on the >> runtime). We should support all three of these modes. > > Correction: with "language_level=2", unprefixed string literals are "str", > i.e. "bytes" in Py2 and "unicode" in Py3. "language_level=3" always makes > them unicode strings, and that's what Jeroen was referring to. 
> > It's really difficult to guess what the most common use case is here. The > only reason why the type changing unprefixed "str" literals are relevant is > that Py2 cannot handle Unicode in some cases (and Unicode strings are > memory hogs in Py2), but Py3 requires Unicode in most cases (and handles it > efficiently). So, the problem here is really Py2, and the problem will go > away as soon as we dump support for it. Until then, however, we're trying > to find a solution to make the language level switch bearable and easy the > transition. I added a new directive "str_is_str=True" which can be combined with "language_level=3" to get the desired behaviour. It keeps the 'str' builtin type as it is (it would otherwise become 'unicode' with level 3) and keeps unprefixed string literals as type 'str' in Py2 and Py3. Everything else should depend solely on the language_level switch. https://github.com/cython/cython/commit/cea42915c5e9ea1da9187aa3c55f3f16d04ba1e3 I think we're now set for the release. I'll prepare a candidate. Stefan From stefan_ml at behnel.de Mon Sep 24 16:28:59 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 24 Sep 2018 22:28:59 +0200 Subject: [Cython] Cython 0.29 rc 1 - Re: Cython 0.29 beta 1 released In-Reply-To: <14d2b9ca-2233-89b9-8c55-5a67355c65bc@behnel.de> References: <14d2b9ca-2233-89b9-8c55-5a67355c65bc@behnel.de> Message-ID: Hi again! Here's a release candidate. Please give it some final testing. https://github.com/cython/cython/tree/0.29rc1 Download: https://github.com/cython/cython/archive/cf7b60ff6814ce8faf7fab2990c423452faa2f0a.zip Updated changelog: https://github.com/cython/cython/blob/0.29rc1/CHANGES.rst Stefan Stefan Behnel schrieb am 16.09.2018 um 17:48: > Hi all, > > after half a year of development, many community pull requests and a lot of > feedback and good ideas in online discussions and at conferences, I'm proud > to release the first beta of Cython 0.29. 
This is a major feature release > that comes with many great improvements and several important bug fixes. > See the long list of changes below. > > Please give it some testing to help us quickly advance to the final release. > > Download: > https://github.com/cython/cython/releases/tag/0.29b1 > > Changelog: > https://github.com/cython/cython/blob/0dcb5d1930e573caa8494fe838c4c2cd4e2041f2/CHANGES.rst > > > Foresight: > Given that Cython has been in critical production use all over the world > for several years, but never found the perfect time for a 1.0 version bump, > we designate this to become the last 0.x release of the project and decided > to skip the 1.0 release which the 0.x series has long represented anyway. > > Planning has already started [1] for the next major release, titled "3.0". > It will finally switch the default Cython language level from Py2 to Py3, > to match what users expect from a Python compiler these days without > additional options or configuration. Cython 2 code will continue to be > supported as before with the directive "language_level=2", although there > are ideas how to help with the modernisation. We're happy to hear your > feedback. > > [1] https://github.com/cython/cython/milestone/58 > > > Have fun, > > Stefan From daniele at grinta.net Mon Sep 24 18:28:45 2018 From: daniele at grinta.net (Daniele Nicolodi) Date: Mon, 24 Sep 2018 16:28:45 -0600 Subject: [Cython] Preparing the language level change - Re: [cython-users] Cython 0.29 beta 1 released In-Reply-To: <652360e2-a5df-0aaa-547c-f61c19bf61f1@behnel.de> References: <5BA0B603.5010501@UGent.be> <5BA36167.4050008@UGent.be> <1165e9dc-f654-4c8d-0804-9a4ccf78aa07@behnel.de> <652360e2-a5df-0aaa-547c-f61c19bf61f1@behnel.de> Message-ID: <3e4d98ed-8f3c-079f-37e6-5e150c405cce@grinta.net> On 24-09-2018 14:05, Stefan Behnel wrote: > I added a new directive "str_is_str=True" which can be combined with > "language_level=3" to get the desired behaviour. 
It keeps the 'str' builtin > type as it is (it would otherwise become 'unicode' with level 3) and keeps > unprefixed string literals as type 'str' in Py2 and Py3. Everything else > should depend solely on the language_level switch. For consistency with the CPython from __future__ import unicode_literals wouldn't it be better to call this directive "str_literals"? I realize there isn't 100% overlap in the functionality of the two, but I find the "str_is_str" name not very descriptive. Thanks. Cheers, Dan From stefan_ml at behnel.de Tue Sep 25 01:24:48 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 25 Sep 2018 07:24:48 +0200 Subject: [Cython] Preparing the language level change In-Reply-To: <3e4d98ed-8f3c-079f-37e6-5e150c405cce@grinta.net> References: <5BA0B603.5010501@UGent.be> <5BA36167.4050008@UGent.be> <1165e9dc-f654-4c8d-0804-9a4ccf78aa07@behnel.de> <652360e2-a5df-0aaa-547c-f61c19bf61f1@behnel.de> <3e4d98ed-8f3c-079f-37e6-5e150c405cce@grinta.net> Message-ID: Daniele Nicolodi schrieb am 25.09.2018 um 00:28: > On 24-09-2018 14:05, Stefan Behnel wrote: >> I added a new directive "str_is_str=True" which can be combined with >> "language_level=3" to get the desired behaviour. It keeps the 'str' builtin >> type as it is (it would otherwise become 'unicode' with level 3) and keeps >> unprefixed string literals as type 'str' in Py2 and Py3. Everything else >> should depend solely on the language_level switch. > > For consistency with the CPython > > from __future__ import unicode_literals > > wouldn't it be better to call this directive "str_literals"? > > I realize there isn't 100% overlap in the functionality of the two, but > I find the "str_is_str" name not very descriptive. I started off with "unicode_literals=False", and then renamed it because this name didn't cover the change of "str" to "unicode" (i.e. renaming the usages of the builtin type internally, so that "str(x)" actually calls "unicode(x)"). 
Looks like this bikeshed needs painting, so let's have a quick(!) discussion or vote. - should this feature touch the builtin type at all? - more opinions on the name? Stefan From jpe at wingware.com Tue Sep 25 10:27:01 2018 From: jpe at wingware.com (John Ehresman) Date: Tue, 25 Sep 2018 10:27:01 -0400 Subject: [Cython] [cython-users] Re: Preparing the language level change In-Reply-To: References: <5BA0B603.5010501@UGent.be> <5BA36167.4050008@UGent.be> <1165e9dc-f654-4c8d-0804-9a4ccf78aa07@behnel.de> <652360e2-a5df-0aaa-547c-f61c19bf61f1@behnel.de> <3e4d98ed-8f3c-079f-37e6-5e150c405cce@grinta.net> Message-ID: <12b66f24-fdd8-4d41-938d-f735f739c49b@wingware.com> On 9/25/18 1:24 AM, Stefan Behnel wrote: > I started off with "unicode_literals=False", and then renamed it because > this name didn't cover the change of "str" to "unicode" (i.e. renaming the > usages of the builtin type internally, so that "str(x)" actually calls > "unicode(x)"). > > Looks like this bikeshed needs painting, so let's have a quick(!) > discussion or vote. > > - should this feature touch the builtin type at all? > > - more opinions on the name? I think the name str_is_str will be confusing to developers who begin with Python 3 because they'll probably assume str is unicode. The name str_is_bytes might be better or maybe unprefixed-string-literals-are-bytes. A bit verbose, but this should be something used with old code that's used with Python 3; newly written code should use b'' literals. I do wonder how usable this will be in practice because passing a bytes instance to something that expects a unicode instance may lead to problems. 
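The kind of problem meant here can be sketched in a few lines of plain Python 3. The function `greet` is a made-up example of code that expects text; the sketch only illustrates mixing bytes and unicode under Py3 and makes no claim about what any particular Cython directive produces.

```python
def greet(name):
    # Written with unicode text in mind: concatenates with a str literal.
    return "Hello, " + name


def try_greet(name):
    """Return the greeting, or the exception type name if the call fails."""
    try:
        return greet(name)
    except TypeError as exc:
        return type(exc).__name__


if __name__ == "__main__":
    print(try_greet("world"))   # text argument: works
    print(try_greet(b"world"))  # bytes argument: rejected under Python 3
```

Under Python 3 the failure is at least loud; the sketch assumes Py3 semantics throughout.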
John From stefan_ml at behnel.de Tue Sep 25 12:33:46 2018 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 25 Sep 2018 18:33:46 +0200 Subject: [Cython] Preparing the language level change In-Reply-To: <12b66f24-fdd8-4d41-938d-f735f739c49b@wingware.com> References: <5BA0B603.5010501@UGent.be> <5BA36167.4050008@UGent.be> <1165e9dc-f654-4c8d-0804-9a4ccf78aa07@behnel.de> <652360e2-a5df-0aaa-547c-f61c19bf61f1@behnel.de> <3e4d98ed-8f3c-079f-37e6-5e150c405cce@grinta.net> <12b66f24-fdd8-4d41-938d-f735f739c49b@wingware.com> Message-ID: John Ehresman schrieb am 25.09.2018 um 16:27: > On 9/25/18 1:24 AM, Stefan Behnel wrote: >> I started off with "unicode_literals=False", and then renamed it because >> this name didn't cover the change of "str" to "unicode" (i.e. renaming the >> usages of the builtin type internally, so that "str(x)" actually calls >> "unicode(x)"). >> >> Looks like this bikeshed needs painting, so let's have a quick(!) >> discussion or vote. >> >> - should this feature touch the builtin type at all? >> >> - more opinions on the name? > > I think the name str_is_str will be confusing to developers who begin with > Python 3 because they'll probably assume str is unicode. It is unicode for them, at least in Py3. In Py2, it's str. Thus the name "str_is_str", it's "str" in both versions. Stefan From matti.picus at gmail.com Thu Sep 27 04:38:30 2018 From: matti.picus at gmail.com (Matti Picus) Date: Thu, 27 Sep 2018 11:38:30 +0300 Subject: [Cython] Enhancing "ctyepdef class numpy.ndarray" with getter properties Message-ID: To solve issue #2498, I did some experiments https://github.com/cython/cython/issues/2498#issuecomment-414543549 with hiding direct field access in an external extension type (documented here https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types). 
The idea is to write `a.ndims` in cython (in plain python code), and in C magically get the attribute lookup converted into a `PyArray_NDIMS(a)` getter, which could be a macro or a c-function. The experiments proved fruitful, and garnered some positive feedback so I am pushing forward. I would like to get some feedback on syntax before I progress too far. Should the syntax be extended to support |ctypedef class numpy.ndarray [object PyArrayObject]: cdef: # Convert python __getattr__ access to c functions. int ndims PyArray_NDIMS | or perhaps a decorator, like Python |ctypedef class numpy.ndarray [object PyArrayObject]: cdef: # Convert python __getattr__ access to c functions. @property ?cdef int ndims(self): return PyArray_NDIMS(self) or something else? The second seems more wordy but more explicit. I don't know which would be easier to implement or require more effort to test and maintain. Matti | From robertwb at gmail.com Thu Sep 27 15:50:19 2018 From: robertwb at gmail.com (Robert Bradshaw) Date: Thu, 27 Sep 2018 21:50:19 +0200 Subject: [Cython] Enhancing "ctyepdef class numpy.ndarray" with getter properties In-Reply-To: References: Message-ID: Thanks for looking into this! My preference would be to use the @property syntax, as this will be immediately understandable to any Cython user and could contain arbitrary code, rather than just a macro call. There are, however, a couple of downsides. The first is that it may not be clear when accessing an attribute that a full function call may be invoked. (Arguably this is the same issue one has with Python, but there attribute access is already expensive. The function could be inline as well if desired.) The second is that this means that this attribute is no longer an lvalue. The last is that it's a bit special to be defining methods on an extern class. Maybe it would have to be inline if it's in the pxd? 
If we're going to be defining a special syntax, I might prefer something like cdef extern class ...: int ndims "PyArray_NDIMS(*)" which more resembles int ndims "nd" Open to bikeshedding on what the "self" placeholder should be. As before, should the ndims lose its lvalue status in this case, or not (in case the accessor is really a macro intended to be used like this)? On Thu, Sep 27, 2018 at 10:38 AM Matti Picus wrote: > To solve issue #2498, I did some experiments > https://github.com/cython/cython/issues/2498#issuecomment-414543549 with > hiding direct field access in an external extension type (documented > here > > https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types). > > The idea is to write `a.ndims` in cython (in plain python code), and in > C magically get the attribute lookup converted into a `PyArray_NDIMS(a)` > getter, which could be a macro or a c-function. > > The experiments proved fruitful, and garnered some positive feedback so > I am pushing forward. > > I would like to get some feedback on syntax before I progress too far. > Should the syntax be extended to support > > |ctypedef class numpy.ndarray [object PyArrayObject]: cdef: # Convert > python __getattr__ access to c functions. int ndims PyArray_NDIMS | > > > or perhaps a decorator, like Python > > |ctypedef class numpy.ndarray [object PyArrayObject]: cdef: # Convert > python __getattr__ access to c functions. @property cdef int > ndims(self): return PyArray_NDIMS(self) or something else? The second > seems more wordy but more explicit. I don't know which would be easier > to implement or require more effort to test and maintain. Matti | > > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > https://mail.python.org/mailman/listinfo/cython-devel > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matti.picus at gmail.com Thu Sep 27 17:35:35 2018 From: matti.picus at gmail.com (Matti Picus) Date: Fri, 28 Sep 2018 00:35:35 +0300 Subject: [Cython] Enhancing "ctyepdef class numpy.ndarray" with getter properties In-Reply-To: References: Message-ID: <74924a4e-d13a-1e7d-6860-8a7d0ee91b7b@gmail.com> On 27/09/18 22:50, Robert Bradshaw wrote: > > On Thu, Sep 27, 2018 at 10:38 AM Matti Picus > > wrote: > To solve issue #2498, I did some experiments > https://github.com/cython/cython/issues/2498#issuecomment-414543549 > with > hiding direct field access in an external extension type (documented > here > https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types). > > The idea is to write `a.ndims` in cython (in plain python code), > and in > C magically get the attribute lookup converted into a > `PyArray_NDIMS(a)` > getter, which could be a macro or a c-function. > > The experiments proved fruitful, and garnered some positive > feedback so > I am pushing forward. > > I would like to get some feedback on syntax before I progress too > far. > Should the syntax be extended to support > > ctypedef class numpy.ndarray [object PyArrayObject]:cdef: # Convert > python __getattr__ access to c functions. int ndims PyArray_NDIMS | > > > or perhaps a decorator, like Python > > |ctypedef class numpy.ndarray [object PyArrayObject]: cdef: # Convert > python __getattr__ access to c functions. @property ?cdef int > ndims(self): return PyArray_NDIMS(self) or something else? The second > seems more wordy but more explicit. I don't know which would be > easier > to implement or require more effort to test and maintain. > > Matti | > > Thanks for looking into this! > > My preference would be to use the?@property syntax, as this will be > immediately understandable to any Cython user and could contain > arbitrary code, rather than just a macro call. > > There are, however, a couple of downsides. 
The first is that it may > not be clear when accessing an attribute that a full function call may > be invoked. (Arguably this is the same issue one has with Python, but > there attribute access is already expensive. The function could be > inline as well if desired.) The second is that this means that this > attribute is no longer an lvalue. The last is that it's a bit special > to be defining methods on an extern class. Maybe it would have to be > inline if it's in the pxd? > > If we're going to be defining a special syntax, I might prefer > something like > > cdef extern class ...: >     int ndims "PyArray_NDIMS(*)" > > which more resembles > >     int ndims "nd" > > Open to bikeshedding on what the "self" placeholder should be. As > before, should the ndims lose its lvalue status in this case, or not > (in case the accessor is really a macro intended to be used like this)? > >

Sorry about the formatting mess-up; the original proposal was supposed to be (this time using double spacing to make sure it works):

-----------------------------------------------------------------------------

cdef extern class ...:

    @property

    cdef int ndims(self):

        return PyArray_NDIMS(self)

----------------------------------------------------------

vs

--------------------------------------------------------

cdef extern class ...:

    cdef int ndims PyArray_NDIMS

--------------------------------------------------------

The proposal is for a getter via a C function or a macro. NumPy's current public API uses a mix. Currently I am interested in getters that would not allow lvalue at all. Maybe in the future we will have fast rvalue setter functions in NumPy, but the current API does not support them. It remains to be seen how much slowdown we see in real-life benchmarks when calling a small C function from a different shared object to access attributes rather than directly accessing them via struct fields.
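As a rough plain-Python analogue of a getter that does not allow lvalue use, a property without a setter behaves as sketched here. This is not Cython syntax and not the proposed implementation; `FakeArray` and `can_assign_ndim` are hypothetical names used only for illustration and have nothing to do with NumPy's real ndarray.

```python
class FakeArray(object):
    """Hypothetical stand-in for an array type; not NumPy's ndarray."""

    def __init__(self, ndim):
        self._nd = ndim  # plays the role of the private struct field

    @property
    def ndim(self):
        # Getter only: reading works, assignment is rejected.
        return self._nd


def can_assign_ndim(arr, value):
    """Report whether `arr.ndim = value` succeeds."""
    try:
        arr.ndim = value
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    a = FakeArray(2)
    print(a.ndim)
    print(can_assign_ndim(a, 3))
```

Code that needs to write to the attribute would have to go through some other, explicit path, which is the lvalue question raised in this thread.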
As I point out in the "experiment" comment referenced above, pandas has code that needs lvalue access to ndarray data, so they would be stuck with the old API, which is deprecated but still works for now. SciPy has no such code and could move forward to the newer API. As far as bikeshedding the "self" parameter, I would propose doing without, and indeed I successfully hacked Cython to use the second proposal with no self argument and no quotations. Matti From robertwb at gmail.com Thu Sep 27 18:20:51 2018 From: robertwb at gmail.com (Robert Bradshaw) Date: Fri, 28 Sep 2018 00:20:51 +0200 Subject: [Cython] Enhancing "ctyepdef class numpy.ndarray" with getter properties In-Reply-To: <74924a4e-d13a-1e7d-6860-8a7d0ee91b7b@gmail.com> References: <74924a4e-d13a-1e7d-6860-8a7d0ee91b7b@gmail.com> Message-ID: On Thu, Sep 27, 2018 at 11:36 PM Matti Picus wrote: > On 27/09/18 22:50, Robert Bradshaw wrote: > > > > On Thu, Sep 27, 2018 at 10:38 AM Matti Picus > > > wrote: > > To solve issue #2498, I did some experiments > > https://github.com/cython/cython/issues/2498#issuecomment-414543549 > > with > > hiding direct field access in an external extension type (documented > > here > > > https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types > ). > > > > The idea is to write `a.ndims` in cython (in plain python code), > > and in > > C magically get the attribute lookup converted into a > > `PyArray_NDIMS(a)` > > getter, which could be a macro or a c-function. > > > > The experiments proved fruitful, and garnered some positive > > feedback so > > I am pushing forward. > > > > I would like to get some feedback on syntax before I progress too > > far. > > Should the syntax be extended to support > > > > ctypedef class numpy.ndarray [object PyArrayObject]:cdef: # Convert > > python __getattr__ access to c functions. 
int ndims PyArray_NDIMS | > > > > > > or perhaps a decorator, like Python > > > > |ctypedef class numpy.ndarray [object PyArrayObject]: cdef: # Convert > > python __getattr__ access to c functions. @property cdef int > > ndims(self): return PyArray_NDIMS(self) or something else? The second > > seems more wordy but more explicit. I don't know which would be > > easier > > to implement or require more effort to test and maintain. > > > > > Matti | > > > > Thanks for looking into this! > > > > My preference would be to use the @property syntax, as this will be > > immediately understandable to any Cython user and could contain > > arbitrary code, rather than just a macro call. > > > > There are, however, a couple of downsides. The first is that it may > > not be clear when accessing an attribute that a full function call may > > be invoked. (Arguably this is the same issue one has with Python, but > > there attribute access is already expensive. The function could be > > inline as well if desired.) The second is that this means that this > > attribute is no longer an lvalue. The last is that it's a bit special > > to be defining methods on an extern class. Maybe it would have to be > > inline if it's in the pxd? > > > > If we're going to be defining a special syntax, I might prefer > > something like > > > > cdef extern class ...: > > int ndims "PyArray_NDIMS(*)" > > > > which more resembles > > > > int ndims "nd" > > > > Open to bikeshedding on what the "self" placeholder should be. As > > before, should the ndims lose its lvalue status in this case, or not > > (in case the accessor is really a macro intended to be used like this)? 
> > > > > Sorry about the formatting messup, the original proposal was supposed to > be (this time using double spacing to make sure it works): > > > ----------------------------------------------------------------------------- > > cdef extern class ...: > > @property > > cdef int ndims(self): > > return PyArray_NDIMS(self) > > ---------------------------------------------------------- > > vs > > -------------------------------------------------------- > > cdef extern class ...: > > cdef int ndims PyArray_NDIMS > > -------------------------------------------------------- > > The proposal is for a getter via a C function or a macro. NumPy's > current public API uses a mix. Currently I am interested in getters that > would not allow lvalue at all. Maybe in the future we will have fast > rvalue setter functions in NumPy, but the current API does not support > them. It remains to be seen how much slowdown we see in real-life > benchmarks when calling a small C function from a different shared > object to access attributes rather than directly accessing them via > struct fields. > Hmm...so in this case upgrading Cython would cause an unconditional switch from direct access to a function call without any code change (or choice) for users of numpy.pxd. I am curious what kind of a slowdown this would represent (though I would assume this kind of analysis was done by the NumPy folks when choosing macro vs. function for the public API). > As I point out in the "experiment" comment referenced above, pandas has > code that needs lvalue access to ndarray data, so they would be stuck > with the old API which is deprecated but still works for now. Scipy has > no such code and could move forward to the newer API. > But if we upgraded Cython, how would they access the old API? I suppose they could create a setter macro of their own to use in the (presumably few) cases where they needed an lvalue. 
> As far as bikeshedding the "self" parameter, I would propose doing > without, and indeed I successfully hacked Cython to use the second > proposal with no self argument and no quotations. > The problem is that when one reads cdef int aaa bbbb there's no indication as to the meaning of this. We also want to be sure to disallow this syntax everywhere but this one context. On the other hand the quotation syntax cdef int aaa "bbb" already has (widespread) meaning of establishing a C alias of the name in question which is essentially what we're trying to do here. I'm still, however, leaning towards the @property syntax (which we could allow for non-extern cdef classes as well). - Robert From matti.picus at gmail.com Fri Sep 28 03:25:39 2018 From: matti.picus at gmail.com (Matti Picus) Date: Fri, 28 Sep 2018 10:25:39 +0300 Subject: [Cython] Mitigating perfomance impact of NumPy API change In-Reply-To: References: <74924a4e-d13a-1e7d-6860-8a7d0ee91b7b@gmail.com> Message-ID: Breaking this into a number of sub-discussions, since we seem to be branching. The original topic was Re: [Cython] Enhancing "ctyepdef class numpy.ndarray" with getter properties On 28/09/18 01:20, Robert Bradshaw wrote: > > Hmm...so in this case upgrading Cython would cause an unconditional > switch from direct access to a function call without any code change > (or choice) for users of numpy.pxd. I am curious what kind of a > slowdown this would represent (though would assume this kind of > analysis was done by the NumPy folks when choosing macro vs. function > for the public API). > > As I point out in the "experiment" comment referenced above, > pandas has > code that needs lvalue access to ndarray data, so they would be stuck > with the old API which is deprecated but still works for now. > Scipy has > no such code and could move forward to the newer API. 
> > > But if we upgraded Cython, how would they access the old API? I > suppose they could create a setter macro of their own to use in the > (presumably few) cases where they needed an lvalue. > > - Robert > > NumPy changed its recommended API to an opaque one via inline getter functions in 2011, in this PR https://github.com/numpy/numpy/pull/116. I could not find a discussion on performance impact, perhaps since the functions are in the header files and marked inline. Hopefully the compilers will properly deal with making them fast. However, it is true that when people update to a new version of a library things change. In this case, there are backward-compatibility macros that revert the post-1.7 functions into pre-1.7 macros with the same name. Thus for the experiment I used a new numpy.pxd, defined the pre-1.7 api in the pandas build (experimental changeset https://github.com/mattip/pandas/commit/9113bf7e55e1eddece3544c1ad3ef2a761b5210a), and was still able to access ndarray.data as a lvalue. Matti From matti.picus at gmail.com Fri Sep 28 04:11:10 2018 From: matti.picus at gmail.com (Matti Picus) Date: Fri, 28 Sep 2018 11:11:10 +0300 Subject: [Cython] Enhancing "ctyepdef class numpy.ndarray" with getter properties In-Reply-To: References: <74924a4e-d13a-1e7d-6860-8a7d0ee91b7b@gmail.com> Message-ID: <11405ff9-4286-35f8-a888-b8278a93dc68@gmail.com> On 28/09/18 01:20, Robert Bradshaw wrote: > On Thu, Sep 27, 2018 at 11:36 PM Matti Picus > wrote: > > The problem is that when one reads > > ? ? cdef int aaa bbbb > > there's no indication as to the meaning of this. We also want to be > sure to disallow this syntax everywhere but this one context. On the > other hand the quotation syntax > > ? ? cdef int aaa "bbb" > > already has (widespread) meaning of establishing a C alias of the name > in question which is essentially what we're trying to do here. 
> > I'm still, however, leaning towards the @property syntax (which we > could allow for non-extern cdef classes as well). > > - Robert > > Using "PyArray_DIMS" with quotes but without parentheses would indeed be confusing to users and difficult to implement, so "PyArray_DIMS(*)" where the * is TBD seems nicer. It sounds like the jury is still out. In order to compare the solutions, I will move forward with the @property decorator syntax, but to keep it simple I will start small: only getters and specifically for CFuncDefNodes. Then if you still want to look at the other option I will turn my "experiment" into a PR. Matti From matti.picus at gmail.com Fri Sep 28 04:25:50 2018 From: matti.picus at gmail.com (Matti Picus) Date: Fri, 28 Sep 2018 11:25:50 +0300 Subject: [Cython] Mitigating perfomance impact of NumPy API change In-Reply-To: References: <74924a4e-d13a-1e7d-6860-8a7d0ee91b7b@gmail.com> Message-ID: <085b51e2-6cec-1ca8-65e4-d362d7889d7b@gmail.com> On 28/09/18 10:25, Matti Picus wrote: > Breaking this into a number of sub-discussions, since we seem to be > branching. The original topic was > > Re: [Cython] Enhancing "ctyepdef class numpy.ndarray" with getter > properties > > On 28/09/18 01:20, Robert Bradshaw wrote: >> >> Hmm...so in this case upgrading Cython would cause an >> unconditional switch from direct access to a function call without >> any code change (or choice) for users of numpy.pxd. I am curious what >> kind of a slowdown this would represent (though would assume this >> kind of analysis was done by the NumPy folks when choosing macro vs. >> function for the public API). >> >> As I point out in the "experiment" comment referenced above, >> pandas has >> code that needs lvalue access to ndarray data, so they would be >> stuck >> with the old API which is deprecated but still works for now. >> Scipy has >> no such code and could move forward to the newer API. 
>>
>>
>> But if we upgraded Cython, how would they access the old API? I
>> suppose they could create a setter macro of their own to use in the
>> (presumably few) cases where they needed an lvalue.
>>
>> - Robert
>>
>>
>
> NumPy changed its recommended API to an opaque one via inline getter
> functions in 2011, in this PR https://github.com/numpy/numpy/pull/116.
> I could not find a discussion on performance impact, perhaps since the
> functions are in the header files and marked inline. Hopefully the
> compilers will properly deal with making them fast. However, it is
> true that when people update to a new version of a library things
> change. In this case, there are backward-compatibility macros that
> revert the post-1.7 functions into pre-1.7 macros with the same name.
>
> Thus for the experiment I used a new numpy.pxd, defined the pre-1.7
> api in the pandas build (experimental changeset
> https://github.com/mattip/pandas/commit/9113bf7e55e1eddece3544c1ad3ef2a761b5210a),
> and was still able to access ndarray.data as a lvalue.
>
> Matti

This means cython/numpy could provide an integration path based on numpy
starting to ship its own numpy.pxd:

- Cython would define the macro (if not already defined) to use the
  pre-1.7 NumPy API in the numpy.pxd it ships. This would still work
  (lvalues would be allowed) after direct access is replaced with the
  getter properties, since they are macros.
- NumPy would define the macro to use the post-1.7 API (if not already
  defined) in the numpy.pxd it ships, which as I understand would take
  precedence over Cython's.

Then projects like pandas could freely upgrade Cython without changing
their codebase, but would encounter errors when updating NumPy.
Matti

From stefan_ml at behnel.de  Sat Sep 29 02:55:19 2018
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 29 Sep 2018 08:55:19 +0200
Subject: [Cython] Preparing the language level change
In-Reply-To:
References: <5BA0B603.5010501@UGent.be> <5BA36167.4050008@UGent.be>
Message-ID: <7ec7822c-e828-6829-9603-c47f3f8bcc6d@behnel.de>

Stefan Behnel wrote on 21.09.2018 at 09:38:
> There are two parts of
> information here, so maybe we should actually split them internally (in
> "Main.Context.set_language_level()"?) and keep the language_level = 3 but
> just avoid the "unicode_literals" part.

I thought about this some more. The real question is: what should the
world be like once Cython 3.0 is out?

What we want to do for Cy3 is to change the default behaviour, which
mostly impacts Py2/3 ported code. That's why we now warn about a missing
"language_level" switch. When people respond to that and set the language
level explicitly, that makes them opt out of the default change. Perfect
so far.

Do we then still want to have a separate option floating around that says
"but I want str"? I don't think so, because that will be the default in
Cy3 anyway.

Thus, what I think we want is that people either specify the language
level explicitly, or get the new default. And to get the new default
*now*, in a future proof way, I think the best option is to set an
explicit language level, not a separate directive.

Thus, I now agree with Jeroen's early intuition that a new language level
switch is the right interface. I'll change the implementation to do what
I wrote in the quoted paragraph above and push a new RC.

Stefan

From stefan_ml at behnel.de  Sat Sep 29 09:09:33 2018
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 29 Sep 2018 15:09:33 +0200
Subject: [Cython] Cython 0.29 final release candidate
In-Reply-To: <14d2b9ca-2233-89b9-8c55-5a67355c65bc@behnel.de>
References: <14d2b9ca-2233-89b9-8c55-5a67355c65bc@behnel.de>
Message-ID:

Hi again!

Here's a second and final release candidate.

https://github.com/cython/cython/releases/tag/0.29rc2

We extended the "language_level" directive with a new "3str" option and
removed the previously added "str_is_str" directive, which had the same
intention but led mostly to confusion in the last RC.

This RC also contains a fix for another NumPy related warning about
mismatching object sizes.

I'll collect more feedback for the next two weeks, and then push the
final release.

Download:
https://github.com/cython/cython/archive/0.29rc2.zip
https://github.com/cython/cython/archive/0.29rc2.tar.gz

Updated changelog:
https://github.com/cython/cython/blob/0.29rc2/CHANGES.rst

Stefan


Stefan Behnel wrote on 16.09.2018 at 17:48:
> Hi all,
>
> after half a year of development, many community pull requests and a lot of
> feedback and good ideas in online discussions and at conferences, I'm proud
> to release the first beta of Cython 0.29. This is a major feature release
> that comes with many great improvements and several important bug fixes.
> See the long list of changes below.
>
> Please give it some testing to help us quickly advance to the final release.
>
> Download:
> https://github.com/cython/cython/releases/tag/0.29b1
>
> Changelog:
> https://github.com/cython/cython/blob/0dcb5d1930e573caa8494fe838c4c2cd4e2041f2/CHANGES.rst
>
>
> Foresight:
> Given that Cython has been in critical production use all over the world
> for several years, but never found the perfect time for a 1.0 version bump,
> we designate this to become the last 0.x release of the project and decided
> to skip the 1.0 release which the 0.x series has long represented anyway.
>
> Planning has already started [1] for the next major release, titled "3.0".
> It will finally switch the default Cython language level from Py2 to Py3,
> to match what users expect from a Python compiler these days without
> additional options or configuration.
> Cython 2 code will continue to be
> supported as before with the directive "language_level=2", although there
> are ideas how to help with the modernisation. We're happy to hear your
> feedback.
>
> [1] https://github.com/cython/cython/milestone/58
>
>
> Have fun,
>
> Stefan
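
For readers who want to try the "3str" language level from the release
candidate announced above, here is a minimal sketch of requesting it from
a build script. It assumes Cython 0.29 or later is installed. The helper
names language_level_directives and build_extensions are made up for
illustration and are not part of Cython; cythonize() and its
compiler_directives argument are Cython's own.

```python
def language_level_directives(level="3str"):
    """Return compiler directives that pin the Cython language level.

    "3str" selects Python 3 semantics while keeping unprefixed string
    literals as the native `str` type; "2" keeps the old behaviour.
    """
    if level not in ("2", "3", "3str"):
        raise ValueError("unsupported language level: %r" % (level,))
    return {"language_level": level}


def build_extensions(sources, level="3str"):
    """Compile .pyx sources with an explicit language level (needs Cython)."""
    # Imported lazily so this sketch can be imported without Cython present.
    from Cython.Build import cythonize

    return cythonize(sources, compiler_directives=language_level_directives(level))
```

In a setup.py this could be used as
setup(ext_modules=build_extensions(["mymodule.pyx"])), where mymodule.pyx
is a placeholder file name. The same level can also be set per file with a
"# cython: language_level=3str" comment at the top of the .pyx source.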