Re: [Python-Dev] Need discussion for a PR about memory and objects

[ munch background ]
"Handle" has been used since the 1980s among Macintosh and Win32 programmers as "unique identifier of some object but isn't the memory address". The usage within those APIs seems to match what's being proposed for the new Python C API, in that programmers used functions to ask "what type are you?" "what value do you have?" but couldn't, or at least shouldn't, rely on actual memory layout. I suggest that for the language reference, use the license plate or registration analogy to introduce "handle" and after that use handle throughout. It's short, distinctive, and either will match up with what the programmer already knows or won't clash if or when they encounter handles elsewhere. -- cheers, Hugh Fisher

Hi Hugo, hi all, On Sun, 18 Nov 2018 at 22:53, Hugh Fisher <hugo.fisher@gmail.com> wrote:
FWIW, a "handle" is typically something that users of an API store and pass around, and which can be used to do all operations on some object. It is whatever a specific implementation needs to describe references to an object. In the CPython C API, this is ``PyObject*``. I think that using "handle" for something more abstract is just going to create confusion. Also FWIW, my own 2 cents on the topic of changing the C API: let's entirely drop ``PyObject *`` and instead use more opaque handles---like a ``PyHandle`` that is defined as a pointer-sized C type but is not actually directly a pointer. The main difference this would make is that the user of the API cannot dereference anything from the opaque handle, nor directly compare handles with each other to learn about object identity. They would work exactly like Windows handles or POSIX file descriptors. These handles would be returned by C API calls, and would need to be closed when no longer used. Several different handles may refer to the same object, which stays alive for at least as long as there are open handles to it. Doing it this way would untangle the notion of objects from their actual implementation. In CPython objects would internally use reference counting, a handle is really just a PyObject pointer in disguise, and closing a handle decreases the reference counter. In PyPy we'd have a global table of "open objects", and a handle would be an index in that table; closing a handle means writing NULL into that table entry. No emulated reference counting needed: we simply use the existing GC to keep alive objects that are referenced from one or more table entries. The cost is limited to a single indirection. The C API would change a lot, so it's not reasonable to do that in the CPython repo. But it could be a third-party project, attempting to define an API like this and implement it well on top of both CPython and PyPy. IMHO this might be a better idea than just changing the API of functions defined long ago to make them more regular (e.g. stop returning borrowed references); by now this would mostly mean creating more work for the PyPy team to track and adapt to the changes, with no real benefits. A bientôt, Armin.

On Fri, 23 Nov 2018 at 23:24, Armin Rigo <armin.rigo@gmail.com> wrote (regarding opaque "handles" in the C API):
And the nice thing about doing it as a shim is that it can be applied to *existing* versions of CPython, rather than having to wait for new ones. Node.js started switching over to doing things this way last year, and it's a good way to go about it: https://medium.com/the-node-js-collection/n-api-next-generation-node-js-apis... While this would still be a difficult project to pursue, and would suffer from many of the same barriers to adoption as CPython's native stable ABI, it does offer a concrete benefit to 3rd party module authors: being able to create single wheel files that can be shared across multiple Python versions, rather than needing to be built separately for each one. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Hi Stefan, On Sat, 24 Nov 2018 at 22:17, Stefan Behnel <stefan_ml@behnel.de> wrote:
Couldn't this also be achieved via reference counting? Count only in C space, and delete the "open object" when the refcount goes to 0?
The point is to remove the need to return the same handle to C code if the object is the same one. This saves one of the largest costs of the C API emulation, which is looking up the object in a big dictionary to know if there is already a ``PyObject *`` that corresponds to it or not---for *all* objects that go from Python to C. Once we do that, then there is no need for a refcount any more. Yes, you could add your custom refcount code in C, but in practice it is rarely done. For example, with POSIX file descriptors, when you would need to "incref" a file descriptor, you instead use dup(). This gives you a different file descriptor which can be closed independently of the original one, but they both refer to the same file. A bientôt, Armin.
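
As a usage sketch of that dup() analogy (reusing the hypothetical PyHandle names from the sketch above):

    /* Hypothetical: two independent handles to the same object, as with dup() on a file descriptor. */
    static void example(PyHandle seq)
    {
        PyHandle h1 = PyHandle_GetItem(seq, 0);
        PyHandle h2 = PyHandle_Dup(h1);   /* the "incref" equivalent: a second, independent handle */
        PyHandle_Close(h1);               /* closing one handle does not invalidate the other */
        /* h2 is still usable here, and the underlying object is still alive */
        PyHandle_Close(h2);
    }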

Hi Armin, Armin Rigo schrieb am 25.11.18 um 06:15:
Ok, got it. And since the handle is a simple integer, there's also no additional cost for memory allocation on the way out.
Ok, then an INCREF() would be replaced by such a dup() call that creates and returns a new handle. In CPython, it would just INCREF and return the PyObject*, which is as fast as the current Py_INCREF(). For PyPy, however, that means that increfs become more costly.

One of the outcomes of a recent experiment with tagged pointers for integers was that they make increfs and decrefs more expensive, and (IIUC) that reduced the overall performance quite visibly. In the tagged-pointer case, it is literally just a tiny added condition that makes things that much slower. In the case of handles, it would add a lookup and a reference copy in the handles array, which is already much more costly than that simple condition. Now, it's unclear whether this performance degradation is specific to CPython (where PyObject* is native), or whether it would also apply to PyPy. But I guess the only way to find out is to try it.

IIUC, the only thing that is needed is to replace

    Py_INCREF(obj);

with

    obj = Py_NEWREF(obj);

which CPython would implement as

    #define Py_NEWREF(obj) (Py_INCREF(obj), obj)

Py_DECREF() would then just invalidate and clean up the handle under the hood. There are probably some places in user code where this would end up leaking a reference by accident because of unclean reference handling (it could overwrite the old handle in the case of a temporary INCREF/DECREF cycle), but it might still be enough for trying it out. We could definitely switch to this pattern in Cython (in fact, we already use such a NEWREF macro in a couple of places, since it's a common pattern).

Overall, this seems like something that PyPy could try out as an experiment, by just taking a simple extension module and replacing all increfs with newref assignments. And obviously implementing the whole thing for the C-API, but IIUC, you might be able to tweak that into your cpyext wrapping layer somehow, without manually rewriting all C-API functions? Stefan
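
For concreteness, here is what that mechanical rewrite might look like at a typical call site (a sketch of the proposed convention only; ``Py_NEWREF`` is not an existing CPython macro, and ``self->cached`` / ``item`` stand in for whatever the extension is doing):

    /* Today: keep a borrowed reference alive by bumping its refcount. */
    Py_INCREF(item);
    self->cached = item;

    /* Proposed: the new reference is a (possibly different) handle that must be stored. */
    self->cached = Py_NEWREF(item);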

Hi, On Sun, 25 Nov 2018 at 10:15, Stefan Behnel <stefan_ml@behnel.de> wrote:
Just to be clear, I suggested making a new API, not just tweaking Py_INCREF() and hoping that all the rest works as it is. I'm skeptical about that. To start with, a ``Py_NEWREF()`` like you describe *will* lead people just renaming all ``Py_INCREF()`` to ``Py_NEWREF()`` ignoring the return value, because that's the easiest change and it would work fine on CPython. A bientôt, Armin.

Armin Rigo schrieb am 26.11.18 um 06:37:
Oh, I'm not skeptical at all. I'm actually sure that it's not that easy. I would guess that such an automatic transformation should work in something like 70% of the cases. Another 25% should be trivial to fix manually, and the remaining 5% … well. They can probably still be changed with some thinking and refactoring. That also involves cases where pointer equality is used to detect object identity. Having a macro for that might be a good idea. Overall, relatively easy. And therefore not unlikely to happen. The lower the bar, the more likely we will see adoption. Also note that explicit Py_INCREF() calls are actually not that common. I just checked and found only 465 calls in 124K lines of Cython generated C code for Cython itself, and 725 calls in 348K C lines of lxml. Not exactly a snap, but definitely not huge. All other objects originate from the C-API in one way or another, which you control.
First of all, as long as Py_INCREF() is not going away, they probably won't change anything. Therefore, before we discuss how laziness will hinder adoption, I would rather like to see an actual motivation for them to do it. And since this change seems to have zero advantages in CPython, but adds a tiny bit of complexity, I think it's now up to PyPy to show that this added complexity has an advantage that is large enough to motivate it. If you could come up with a prototype that demonstrates the advantage (or at least uncovers the problems we'd face), we could actually discuss real solutions rather than uncertain ideas. Stefan

On Fri, Nov 23, 2018 at 2:22 PM Armin Rigo <armin.rigo@gmail.com> wrote:
+1 As another point of reference, if you're interested, I've been working lately on the special-purpose computer algebra system GAP. It also uses an approach like this: objects are referenced throughout via an opaque "Obj" type (which is really just a typedef of "Bag", the internal storage reference handle of its "GASMAN" garbage collector [1]). A nice benefit of this, along with the others discussed above, is that it has been relatively easy to replace the garbage collector in GAP--there are options for it to use Boehm-GC, as well as Julia's GC. GAP has its own problems, but it's relatively simple and has been inspiring to look at; I was coincidentally wondering just recently if there's anything Python could take from it (conversely, I'm trying to bring some things I've learned from Python to improve GAP...). [1] https://github.com/gap-system/gap/blob/master/src/gasman.c

On 11/23/18 5:15 AM, Armin Rigo wrote:
Why would this be better than simply returning the pointer? Sure, it prevents ever dereferencing the pointer and messing with the object, it is true. So naughty people would be prevented from messing with the object directly instead of using the API as they should. But my understanding is that the implementation would be slightly slower--there'd be all that looking up objects based on handles, and managing the handle namespace too. I'm not convinced the nice-to-have of "you can't dereference the pointer anymore" is worth this runtime overhead. Or maybe you have something pretty cheap in mind, e.g. "handle = pointer ^ 49"? Or even "handle = pointer ^ (random odd number picked at startup)" to punish the extra-naughty? //arry/
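
For what it's worth, the "pretty cheap" scheme Larry alludes to could look something like this (purely illustrative; no such scheme exists in CPython, and all names here are made up):

    #include <Python.h>
    #include <stdint.h>

    /* Illustrative only: obfuscate the pointer with a random odd cookie chosen at
     * startup, so converting between object and handle is a single XOR, yet the
     * handle is never a valid aligned pointer that naughty code could dereference. */
    static uintptr_t handle_cookie;   /* set once at interpreter startup, forced odd */

    #define OBJECT_TO_HANDLE(op)  ((uintptr_t)(void *)(op) ^ handle_cookie)
    #define HANDLE_TO_OBJECT(h)   ((PyObject *)(void *)((uintptr_t)(h) ^ handle_cookie))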

On 11/26/2018 7:08 PM, Larry Hastings wrote:
I thought the important part of the proposal was to have multiple PyHandles that point to the same PyObject (you couldn't "directly compare handles with each other to learn about object identity"). But I'll admit I'm not sure why this would be a win. Then of course they couldn't be regular pointers. Eric

On Mon, Nov 26, 2018 at 6:12 PM Eric V. Smith <eric@trueblade.com> wrote:
Whenever PyPy passes an object from PyPy -> C, then it has to invent a "PyObject*" to represent the PyPy object. 0.1% of the time, the C code will use C pointer comparison to implement an "is" check on this PyObject*. But PyPy doesn't know which 0.1% of the time this will happen, so 100% of the time an object goes from PyPy -> C, PyPy has to check and update some global intern table to figure out whether this particular object has ever made the transition before and use the same PyObject*. 99.9% of the time, this is pure overhead, and it slows down one of *the* most common operations C extension code does. If C extensions checked object identity using some explicit operation like PyObject_Is() instead of comparing pointers, then PyPy could defer the expensive stuff until someone actually called PyObject_Is(). Note: numbers are made up and I have no idea how much overhead this actually adds. But I'm pretty sure this is the basic idea that Armin's talking about. -n -- Nathaniel J. Smith -- https://vorpus.org
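
A sketch of what that explicit identity operation might look like (``PyObject_Is`` / ``PyHandle_Is`` are hypothetical names, reusing the opaque-handle sketch from earlier in the thread):

    /* With raw PyObject* pointers, extensions write `result == Py_None`.
     * With opaque handles, identity goes through an explicit call, so PyPy can
     * defer the "find or invent the canonical PyObject*" work until someone
     * actually asks about identity. */
    static int returns_none(PyHandle result, PyHandle none_handle)
    {
        return PyHandle_Is(result, none_handle);
    }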

Le mar. 27 nov. 2018 à 01:13, Larry Hastings <larry@hastings.org> a écrit :
(...) I'm not convinced the nice-to-have of "you can't dereference the pointer anymore" is worth this runtime overhead.
About the general idea of a new C API: if you only look at CPython in release mode, there is no benefit. But you should consider the overall picture:

* ability to distribute a single binary for CPython in release mode, CPython in debug mode, PyPy, and maybe some new more funky Python runtimes
* better performance on PyPy

The question is whether we can implement new optimizations in CPython (like tagged pointers) which would move the overall performance impact to at least "not significant" (not slower, not faster), or maybe even to "faster". Note: again, in my plan, the new C API would be an opt-in API. The old C API would remain unchanged and fully supported. So there is no performance impact if you continue to use the old C API. Victor

On 27Nov2018 0609, Victor Stinner wrote:
This is one of the things that makes me think your plan is not feasible. I *hope* that remaining on the old C API eventually has a performance impact, since the whole point is to enable new optimizations that currently require tricky emulation to remain compatible with the old API. If we never have to add any emulation for the old API, we haven't added anything useful for the new one. Over time, the old C API's performance (not functionality) should degrade as the new C API's performance increases. If the increase isn't significantly better than the degradation, the whole project can be declared a failure, as we would have been better off leaving the API alone and not changing anything. But this is great discussion. Looking forward to seeing some of it turn into reality :) Cheers, Steve

Hi Steve, On Tue, 27 Nov 2018 at 19:14, Steve Dower <steve.dower@python.org> wrote:
I can easily imagine the new API having two different implementations even for CPython:

A) you can use the generic implementation, which produces a cross-python-compatible .so. All function calls go through the API at runtime. The same .so works on any version of CPython or PyPy.

B) you can use a different set of headers or a #define or something, and you get a higher-performance version of your unmodified code---with the caveat that the .so only runs on the exact version of CPython. This is done by defining some of the functions as macros. I would expect this version to be of similar speed to the current C API in most cases.

This might give a way forward: people would initially port their extensions hoping to use option B; once that is done, they can easily measure---not guess---the extra performance cost of option A, and decide based on actual data whether the difference is really worth the additional trouble of distributing many versions. Even if it is, they can distribute an A version for PyPy and for unsupported CPython versions, and add a few B versions on top of that.

...Also, although I'm discussing it here, I think the whole approach would be better done as a third-party extension for now, without requiring changes to CPython---just use the existing C API to implement the CPython version. The B option discussed above can even be mostly *just* a set of macros, with a bit of runtime that we might as well include in the produced .so in order to make it a standalone, regular CPython C extension module. A bientôt, Armin.

PS: on CPython we could use ``typedef struct { PyObject *_obj; } PyHandle;``. This works like a pointer, but you can't use ``==`` to compare them.
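
To make Armin's PS concrete (a sketch only, with hypothetical ``PyHandle_*`` names; nothing like this exists in CPython today):

    #include <Python.h>

    /* Option B on CPython: the handle wraps a real PyObject*.  It is still word-sized
     * and cheap to pass around, but `h1 == h2` and `h->ob_type` no longer compile,
     * so extension code cannot silently depend on the pointer representation. */
    typedef struct { PyObject *_obj; } PyHandle;

    /* Most option-B calls can be thin inline wrappers (or macros) over the existing
     * C API, which is why the performance should stay close to today's. */
    static inline PyHandle PyHandle_GetAttr(PyHandle o, PyHandle name)
    {
        PyHandle r = { PyObject_GetAttr(o._obj, name._obj) };
        return r;
    }

    static inline void PyHandle_Close(PyHandle h)
    {
        Py_DECREF(h._obj);
    }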

On Thu, Nov 29, 2018 at 5:10 PM Armin Rigo <armin.rigo@gmail.com> wrote:
And then you could have a macro or inline function to compare them, simply by looking at that private member, and it should compile down to the exact same machine code as comparing the original pointers directly. It'd be a not-unreasonable migration path, should you want to work that way - zero run-time cost. ChrisA
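
Continuing the same hypothetical struct sketch, that comparison helper could be as simple as:

    /* Explicit identity check; on CPython this compiles down to a plain pointer
     * comparison, so the migration cost at run time is zero. */
    static inline int PyHandle_Is(PyHandle a, PyHandle b)
    {
        return a._obj == b._obj;
    }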

On 28Nov2018 2208, Armin Rigo wrote:
This makes sense, but unless it results in PyPy drastically gaining popularity as a production runtime, it basically leaves us in the status quo. We continue to not be able to change CPython internals at all, since that will break people using option B. Though potentially if we picked an official option for A, we could deprecate the stability of option B (over a few releases) and require people using it to thoroughly test, update and #ifdef their code for each version. That would allow us to make changes to the runtime while preserving option A as the reliable version. You might want to have a look at https://github.com/Microsoft/xlang/ which is not yet ready for showtime (in particular, there's no "make it look Pythonic" support yet), but is going to extend our existing cross-language ABI to Python (alongside C++/.NET/JS) and non-Windows platforms. It's been in use for years in Windows and has been just fine. (Sample generated output at https://github.com/devhawk/pywinrt-output/tree/master/generated/pyrt/src but the design docs at the first link are probably most interesting.) Cheers, Steve

Hi, On Thu, 29 Nov 2018 at 18:19, Steve Dower <steve.dower@python.org> wrote:
...status quo. We continue to not be able to change CPython internals at all, since that will break people using option B.
No? That will only break users if they only have an option-B ``foo.cpython-318m-x86_64-linux-gnu.so``, no option-A .so and no source code, and want to run it elsewhere than CPython 3.18. That's the same as today. If you want option-B .so for N versions of CPython, recompile the source code N times. Just to be clear, if done correctly there should be no need for #ifdefs in the source code of the extension module. A bientôt, Armin.

On 29Nov2018 2206, Armin Rigo wrote:
The problem is that if option B remains as compatible as it is today, we can't make option A fast enough to be attractive. The marketing pitch for this looks like: "rewrite all your existing code to be slower but work with PyPy, or don't rewrite your existing code and it'll be fastest with CPython and won't break in the future". This is the status quo (where option A today is something like CFFI or Cython), and we can already see how many people have made the switch (FWIW, I totally prefer Cython over pure C for my own projects :) ). My proposed marketing pitch is: "rewrite your existing code to be forward-compatible today and faster in the future without more work, or be prepared to rewrite/update your source code for each CPython release to remain compatible with the low level API". The promise of "faster in the future" needs to be justified (and I think there's plenty of precedent in PyPy, Larry's Gilectomy and the various JavaScript VMs to assume that we can do it). We've already done enough investigation to know that making the runtime faster requires changing the low level APIs, and otherwise we're stuck in a local optimum. Offering a stable, loosely coupled option A and then *planning* to break the low level APIs each version in the name of performance is the only realistic way to change what we're currently doing. Cheers, Steve

On Fri, 30 Nov 2018 09:22:30 -0800 Steve Dower <steve.dower@python.org> wrote:
I think that should be qualified. Technically it's certainly possible to have a faster CPython with different internals. Socially and organisationally I'm not sure we're equipped to achieve it. Regards Antoine.

Hi Steve, On 30/11/2018, Steve Dower <steve.dower@python.org> wrote:
Discussing marketing pitches on python-dev is not one of my favorite pastimes, so I'll excuse myself from this conversation. Instead, I might try to implement the basics, check out the performance on CPython and on PyPy, and seek out interest---I'm thinking about Cython, for example, which might relatively easily be adapted to generate that kind of code. This might be a solution for the poor performance of Cython on PyPy... If everything works out, maybe I'll come back here at some point with the argument "the CPython C API is blocking CPython from evolving more and more; here's one possible path forward." A bientôt, Armin.

On 2018-11-29, Armin Rigo wrote:
Hello Armin, Thank you for providing your input on this subject. I too like the idea of an API "shim layer" as a separate project. What do you think of writing the shim layer in C++? I'm not a C++ programmer, but my understanding is that modern C++ compilers are much better than they were years ago. Using C++ would allow us to provide a higher level API with smaller runtime costs. However, it would require that any project using the shim layer be compiled with a C++ compiler (CPython and PyPy could still expose a C compatible API). Perhaps it is a bad idea. If someone does create such a shim layer, it will already be challenging to convince extension authors to move to it. If it requires them to switch to a C++ compiler rather than a C compiler, maybe that's too much effort. OTOH, with C++ I think you could do things like use smart pointers to automatically handle refcounts on the handles. Or maybe we should just skip C++ and implement the layer in Rust. Then the Rust borrow checker can handle the refcounts. ;-) Regards, Neil

On Fri, 30 Nov 2018 13:06:11 -0600 Neil Schemenauer <nas-python@arctrix.com> wrote:
The main problem with exposing a C++ *API* is that all people implementing that API suddenly must understand and implement the C++ *ABI* (which itself varies from platform to platform :-)). That's trivially easy if your implementation is itself written in C++, but not if it's written in something else such as RPython, Java, Rust, etc. C is really the lingua franca when exposing an interface that can be understood, implemented and/or interfaced with from many different languages. So I'd turn the proposal on its head: you can implement the internals of your interpreter or object layer in C++ (and indeed I think it would be crazy to start writing a new Python VM in raw C), but you should still expose a C-compatible API for third-party providers and consumers. Regards Antoine.

On 30Nov2018 1133, Antoine Pitrou wrote:
I totally agree with Antoine here. C++ is great for internals, but not the public interfaces. The one additional point I'd add is that there are other ABIs that C++ can use (such as xlang, Corba and COM), which can provide stability in ways the plain-old C++ ABI does not. So we wouldn't necessarily have to design a new C-based ABI for this, we could adopt an existing one that is already proven and already has supporting tools. Cheers, Steve

On Mon, Nov 26, 2018 at 4:10 PM Larry Hastings <larry@hastings.org> wrote:
Heck, it'd be fine if someone's implementation (such as a simple shim for CPython's existing API) wants to internally keep a PyObject structure and have PyHandle's implementation just be a typecast from PyObject* to PyHandle. The real point is that a handle is opaque and cannot be depended on by any API _user_ as being a pointer. What it means behind the scenes of a given VM is left entirely up to the VM. When an API returns a handle, that is an implicit internal INCREF if a VM is reference counting. When code calls an API that consumes a handle by taking ownership of it for itself (Py_DECREF could be considered one of these if you have a Py_DECREF-equivalent API), that means "I can no longer use this handle". Comparisons get documented as being invalid, pointing to the API to call for an identity check, but it is up to each implementation to decide if it wants to force the handles to be unique. Anyone depending on that behavior is being bad and should not be supported. -gps PS ... use C++ and you could actually make handle identity comparisons do the right thing...
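
A usage sketch of the ownership convention Gregory describes (all of these ``PyHandle_*`` names are hypothetical):

    /* Hypothetical calling convention: returned handles are owned by the caller;
     * "consuming" APIs take that ownership away. */
    static void example(PyHandle list, PyHandle other, PyHandle target, PyHandle attr_name)
    {
        PyHandle item = PyHandle_GetItem(list, 0); /* returning a handle is an implicit incref
                                                      if the VM happens to refcount */

        PyHandle_SetAttr(target, attr_name, item); /* assume this call does NOT take ownership,
                                                      so we still own `item` afterwards... */

        PyHandle_StealingAppend(other, item);      /* ...until a consuming API takes it: after this
                                                      line, `item` must not be used or closed again */
    }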

I just want to say that I'm very happy to see the discussions about the C API finally happening on python-dev :-) The discussion is very interesting!
C is really the lingua franca
Sorry, but why not write the API directly in French? Victor

participants (15): Antoine Pitrou, Armin Rigo, Chris Angelico, E. Madison Bray, Eric V. Smith, Gregory P. Smith, Hugh Fisher, Larry Hastings, MRAB, Nathaniel Smith, Neil Schemenauer, Nick Coghlan, Stefan Behnel, Steve Dower, Victor Stinner