new PEP: implementation independent native code invocation and data exchange ABI standard (not sure if accurate)
Dear All,

Python is a "glue" language; its dynamic nature is an advantage for programming and a disadvantage for performance. The best way to use Python is to write Python code for the high-level parts and use a native programming language such as C, Rust, Zig, or V for the low-level parts. Currently there are libraries like PyO3 for this. However, those libraries have many limitations:

- The "FFI" library is implementation-specific. PyO3, for example, depends heavily on a specific version of CPython. If you use a different version of CPython, there is some work for you to do. If you don't use CPython at all but another Python implementation, the framework may not work.
- The project structure is rigid. When using something like PyO3, you must lay out your project following a certain pattern and finally produce a complete wheel package. The majority of Python programmers do not write a whole Python project as an installable package, pack it up, and then install it. They edit Python source files iteratively and run them locally.
- It is really ridiculous: when you want to stick an item on the wall, you have to totally redesign the item and manufacture a new one to fit the glue you are going to use. As a glue language, Python should be designed to glue other native programming languages as a feature of the Python language itself, not through the tricks of a particular implementation.

It would be nice to add a feature to the Python standard (no matter what implementation is used) that provides the following capabilities:

- The interface is universal across all variants and versions of Python implementations. (There might be protocol version updates that are not backward-compatible, but they are not bound to a Python implementation.) The overall effect is to some extent like JSON, except that it is not a structured string: it is a live data structure with an in-memory representation, unified no matter which Python variant and which low-level native language is used.
- The mechanism is transparent to users, and there are modules in the standard library to support it. If users want, they can design a toy native programming language, write a compiler for it in Python, then write a module in the custom language, compile it, and import it. This mechanism gives users maximum flexibility.
- Almost zero-cost abstraction. Even though it does not depend on CPython tricks, the central idea of this mechanism is still the dynamic linking feature provided by the operating system. The detailed format will be slightly different. It doesn't introduce anything fundamentally new. It doesn't spawn a new process or launch a VM, and no I/O operations are involved. It just does some basic data representation conversion and invokes the function in the dynamic library.

This is just a raw idea. If it is valuable, it can be discussed and taken further.

Thanks
At the Language Summit a few days ago we discussed this problem. I wasn't involved in this discussion, but it's a hard one to solve.

You may be interested in HPy (https://hpyproject.org/), which aims to provide what you are looking for.

On Sun, 23 Apr 2023 at 6:18, Evan Greenup via Python-ideas (<python-ideas@python.org>) wrote:
[...]

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/R2NBDZ...
Code of Conduct: http://python.org/psf/codeofconduct/
This is a very hard problem, but people are trying.

https://www.pypy.org/posts/2018/09/inside-cpyext-why-emulating-cpython-c-808...

-CHB

On Sun, Apr 23, 2023 at 5:39 AM Jelle Zijlstra <jelle.zijlstra@gmail.com> wrote:
[...]
--
Christopher Barker, PhD (Chris)

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
This isn't a PEP yet: it's a set of requirements. A PEP eventually needs to say how to implement the requirements, and even at this "proto-PEP" stage, it needs to be plausible that it's implementable. It's on you to explain how your very ambitious requirements can be satisfied in Python. Nobody's going to ask you for an implementation, but references to related tech like HPy and ctypes, combined with discussion of how they do or don't meet your requirements, would be helpful.

On 2023-04-23 21:13, Evan Greenup via Python-ideas wrote:
> However, those libraries [like PyO3] have many limitations.

For good reason. Python does not share data structures with those other languages. CPython's native data structures are a subset of those implemented in C -- obviously, because CPython is implemented in C. But other languages will implement a "mutable extensible memory-safe sequence" (i.e., Python's list) in different ways (and famously C doesn't provide one!). On the other hand, data structures in other languages may have no built-in equivalent in Python. The low-level ctypes stdlib module provides the flexibility you want, but it's implementation-dependent on the foreign side, and must be.
> * The project structure is rigid.

This is a complaint about specific third-party libraries that provide high-level wrapping of a fundamentally low-level facility. I would *expect* the project structure to be rigid.

I think you might find it easier to present the proposal convincingly if you "build up" from ctypes, instead of "building down" from PyO3.
> * It is really ridiculous: when you want to stick an item on the wall, you have to totally redesign the item and manufacture a new one to fit the glue you are going to use.
This is not true of ctypes, which is designed as the thinnest possible wrapper around other languages (specifically C, but to the extent that most C implementations provide facilities for calling FORTRAN and other such languages, it should be possible to extend ctypes to those languages in that way).
> As a glue language, Python should be designed to glue other native programming languages as a feature of the Python language itself
As far as I can see this is not feasible, and vastly overemphasizes Python's role as a glue language. Python is a programming language first, and the business of the Python programming language is to be Python. Interfacing to other languages is going to be more or less painful depending on how closely the internals are related, and in general it will be hardware-dependent.
> * The interface is universal across all variants and versions of Python implementations

It took Microsoft 15 versions and a couple of decades to manage this with just its own runtime library. Remember, not only do Python internals change from version to version, but so do those of other languages. C++ is infamous for incompatibility, in fact. It's hard to imagine that the Python side of the interface can be completely independent of the other language, when the whole point is that the other language has specific features that *Python does not*.

As Jelle mentions, a standardized ABI for Python is in progress, the current iteration being the HPy project. However, AIUI the goal is a consistent ABI across Python versions, not making construction of FFIs easier. All it should do, I believe, is remove ABI compatibility across Python versions from the set of problems an FFI needs to deal with. That's useful, but doesn't remove any of the complexity caused by different representations in the target language.
> It is a live data structure with an in-memory representation, unified no matter which Python variant and which low-level native language is used.

So you're suggesting an intermediate data representation, likely requiring two translations (Python to intermediate and intermediate to target language) each time data is to be transferred from Python to a target language and back. If HPy succeeds then that ABI can be frozen as both the Python ABI and the intermediate representation, of course.

However, consider the C++ standard template library. In Python, everything is an object, with a consistent handle. Lists and tuples are uniform sequences of handles, dicts are uniform handle-to-handle mappings. That's not so in the C++ template library. The whole point is to provide individual routines optimized to each variant of a data structure based on the template's type variables. It's one-many, not one-one, from the point of view of your proposed FFI ABI.

It seems to me that the target ABI is more variable than Python's internal representation, which changes fairly slowly and is pretty well-documented, and if the target is as low-level as C there is not going to be one target ABI, because equivalents of Python structures will be defined per-project rather than for all C libraries.
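To make the translation cost concrete, here is a minimal sketch using only the stdlib ctypes module. It assumes a hypothetical foreign function that wants a contiguous `double *` (or `float *`) buffer; no such function is actually called, and the variable names are illustrative only. It shows that a Python list of handles has to be copied element by element into whichever layout the target expects, and copied back again afterwards.

```python
import ctypes

# A Python list is a sequence of object handles; a C function expecting
# `double *` needs a contiguous block of raw 8-byte floats instead.
values = [1.5, 2.0, 3.25]

# Translation 1 (Python -> target): copy each element into a C array.
DoubleArray = ctypes.c_double * len(values)
c_doubles = DoubleArray(*values)

# The same list needs a different translation for a different target type;
# nothing in the list itself says which layout the foreign side wants.
FloatArray = ctypes.c_float * len(values)
c_floats = FloatArray(*values)

# Translation 2 (target -> Python): build fresh Python objects again.
round_trip = list(c_doubles)

print(ctypes.sizeof(c_doubles), ctypes.sizeof(c_floats), round_trip)
```

Both conversions are copies, so the cost grows with the size of the data, and the choice between `c_double` and `c_float` is made per call site rather than once for the whole language.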
> * The mechanism is transparent to users, and there are modules in the standard library to support it. [...] This mechanism gives users maximum flexibility.
That sounds like ctypes to me.
> * Almost zero-cost abstraction.
What does that mean?
> It just does some basic data representation conversion and invokes the function in the dynamic library.

Still sounds like ctypes to me.

So I come back to the theme: what do you want that ctypes doesn't provide? https://docs.python.org/3/library/ctypes.html
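For readers who have not used it, a minimal sketch of what "basic data representation conversion plus a call into a dynamic library" looks like with ctypes today. The helper name `load_c_runtime` is made up for this example, and the Windows branch assumes the classic `msvcrt` runtime is present; on POSIX systems `CDLL(None)` exposes the symbols already loaded into the running process, which include the C library.

```python
import ctypes
import sys


def load_c_runtime():
    """Return a handle to a C runtime available to this process."""
    if sys.platform == "win32":
        # Assumption: the classic MSVC runtime DLL is available.
        return ctypes.CDLL("msvcrt")
    # POSIX: dlopen(NULL) gives access to symbols of the running process.
    return ctypes.CDLL(None)


libc = load_c_runtime()

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# ctypes converts the bytes object to a char pointer, makes the call,
# and converts the size_t result back to a Python int.
length = libc.strlen(b"glue")
print(length)  # 4
```

No wheel, build step, or project layout is involved here; the part ctypes cannot supply is knowledge of the foreign signature, which the caller has to declare by hand.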
On Sun, Apr 23, 2023, 3:43 PM turnbull <turnbull@sk.tsukuba.ac.jp> wrote:
[...]
>> It is a lively data structure with in-memory representation, they are unified no matter what Python variant is used and what low level native language is used.

> So you're suggesting an intermediate data representation, likely requiring two translations (Python to intermediate and intermediate to target language) each time data is to be transferred from Python to a target language and back. If HPy succeeds then that ABI can be frozen as both the Python ABI and the intermediate representation, of course.
Arrow does zero-copy with nested ~Structs + Schema. Have you already considered Apache Arrow?

https://arrow.apache.org/
https://github.com/apache/arrow

https://arrow.apache.org/ :

"""
## What is Arrow?

### Format

Apache Arrow defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs. The Arrow memory format also supports zero-copy reads for lightning-fast data access without serialization overhead.

Learn more about the design or read the specification.

### Libraries

Arrow's libraries implement the format and provide building blocks for a range of use cases, including high performance analytics. Many popular projects use Arrow to ship columnar data efficiently or as the basis for analytic engines.

Libraries are available for C, C++, C#, Go, Java, JavaScript, Julia, MATLAB, Python, R, Ruby, and Rust. See how to install and
"""
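The zero-copy idea raised in the Arrow reply can be illustrated without Arrow itself. The sketch below does not use Arrow's API (that would need the third-party pyarrow package); it swaps in the stdlib buffer protocol as a stand-in, on the assumption that the data is already a flat, typed, contiguous block. It shows two differently typed views sharing one piece of memory with no conversion step.

```python
import array
import ctypes

# A typed, contiguous block of three C doubles owned by Python.
data = array.array("d", [1.0, 2.0, 3.0])

# memoryview exposes the same buffer without copying it.
view = memoryview(data)

# ctypes can wrap that memory as a C double[3]; again, no copy is made.
c_view = (ctypes.c_double * len(data)).from_buffer(data)

# Writes through either view are visible through the others,
# because all three names refer to the same bytes.
c_view[0] = 10.0
view[1] = 20.0

print(data.tolist(), list(c_view))
```

This only works because `array.array("d", ...)` already has a fixed element type and layout, which is the same restriction a columnar format imposes; a general Python list of arbitrary objects still needs the element-by-element translation discussed earlier in the thread.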
participants (5)
- Christopher Barker
- Evan Greenup
- Jelle Zijlstra
- turnbull
- Wes Turner