Automatic translation of Python to assembly language
Hello to the list,
I have an idea for Python that is non-traditional in that it doesn’t extend or modify existing Python language structure.
The idea uses Python to translate Python, entirely under program control, directly to optimized assembly language .dll or .so files, called “extensions.”
Extensions are called from Python using Python’s ctypes interface. The ctypes wrapper for each extension is created automatically.
The goal of this idea is for Python to perform as fast or faster than C or C++, without leaving Python.
Details:
1. Large project – now 23,556 lines of Python.
2. Evolved from a project that automatically translated APL to assembly language dlls -- more than 30,000 hours of development in APL and assembly language.
3. Solves the tremendous problem of coding assembly by hand.
4. Point-and-click interface.
5. Ahead-of-time compilation.
6. Python translated directly to assembly language – no third-party compiler (GCC, LLVM, Clang, etc.) or intermediate representation.
7. Advanced assembly language optimizations: registers, SIMD, multicore, loop fusion, loop unrolling, etc., custom-fitted to Python.
8. No Global Interpreter Lock issues – ctypes releases the GIL. Extensions have full use of all threads and cores.
9. NumPy and SciPy functions, as well as Python built-in functions and built-in library functions, translated directly to optimized assembly language to avoid expensive Python callbacks.
10. Memory safe:
a. controls buffer access and frees every memory pointer when the extension returns from assembly to Python
b. handles bounds checking on variables and arrays passed into the extension by ctypes
c. extensions will not encounter errors such as buffer overflows, buffer over-reads, or memory race conditions
d. handles recursive programs with its own stack, thus avoiding stack exhaustion for recursive programs
More details at https://PysoniQ.com:
A video demonstration
Try out the point-and-click interface at the “Try PysoniQ” link
A detailed Project Overview
Technical FAQs
Blog and speed metric links for deeper analysis of the technologies
Downloadable PDFs – see the Resources link
Any comments from the Python community on this project would be most appreciated!
Thank you.
Mark mark@pysoniq.com
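For context on the calling mechanism the post describes, here is a minimal sketch of how a function in a shared library is typically wrapped with ctypes. It is not PysoniQ's generated wrapper, which is not shown anywhere in this thread; the C math library's `sqrt` stands in for an "extension," and the helper name `load_c_sqrt` is made up for illustration.

```python
import ctypes
import ctypes.util
import sys


def load_c_sqrt():
    """Return the C library's sqrt, wrapped so Python can call it."""
    if sys.platform == "win32":
        lib = ctypes.CDLL("msvcrt")  # Microsoft C runtime
    else:
        # find_library may return None; on POSIX, CDLL(None) then exposes
        # the symbols already loaded into the running interpreter.
        lib = ctypes.CDLL(ctypes.util.find_library("m"))
    c_sqrt = lib.sqrt
    # Declaring the signature lets ctypes convert arguments and results.
    c_sqrt.argtypes = [ctypes.c_double]
    c_sqrt.restype = ctypes.c_double
    return c_sqrt


c_sqrt = load_c_sqrt()
print(c_sqrt(2.0))
```

Without the `argtypes`/`restype` declarations, ctypes would treat the return value as a C `int`, so any automatically generated wrapper would need to emit something equivalent.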
This isn't the place for ads for commercial products.
On 7 Sep 2019, at 22:19, Mark @pysoniq <mark@pysoniq.com> wrote:
Hello to the list,
I have an idea for Python that is non-traditional in that it doesn’t extend or modify existing Python language structure.
The idea uses Python to translate Python, entirely under program control, directly to optimized assembly language .dll or .so files, called “extensions.”
Extensions are called from Python using Python’s ctypes interface. The ctypes wrapper for each extension is created automatically.
The goal of this idea is for Python to perform as fast or faster than C or C++, without leaving Python.
Details:
1. Large project – now 23,556 lines of Python.
2. Evolved from a project that automatically translated APL to assembly language dlls -- more than 30,000 hours of development in APL and assembly language.
3. Solves the tremendous problem of coding assembly by hand.
4. Point-and-click interface.
5. Ahead-of-time compilation.
6. Python translated directly to assembly language – no third-party compiler (GCC, LLVM, Clang, etc.) or intermediate representation.
7. Advanced assembly language optimizations: registers, SIMD, multicore, loop fusion, loop unrolling, etc., custom-fitted to Python.
8. No Global Interpreter Lock issues – ctypes releases the GIL. Extensions have full use of all threads and cores.
9. NumPy and SciPy functions, as well as Python built-in functions and built-in library functions, translated directly to optimized assembly language to avoid expensive Python callbacks.
10. Memory safe:
a. controls buffer access and frees every memory pointer when the extension returns from assembly to Python
b. handles bounds checking on variables and arrays passed into the extension by ctypes
c. extensions will not encounter errors such as buffer overflows, buffer over-reads, or memory race conditions
d. handles recursive programs with its own stack, thus avoiding stack exhaustion for recursive programs
More details at https://PysoniQ.com:
A video demonstration
Try out the point-and-click interface at the “Try PysoniQ” link
A detailed Project Overview
Technical FAQs
Blog and speed metric links for deeper analysis of the technologies
Downloadable PDFs – see the Resources link
Any comments from the Python community on this project would be most appreciated!
Thank you.
Mark mark@pysoniq.com _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/H43LXL... Code of Conduct: http://python.org/psf/codeofconduct/
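On point 8 of the quoted post: functions loaded through `ctypes.CDLL` do run with the GIL released for the duration of the foreign call (`ctypes.PyDLL` keeps it held). The sketch below illustrates this under the assumption of a POSIX system, using the C library's `usleep` as a stand-in for native work; the helper name `run_in_threads` is hypothetical.

```python
import ctypes
import threading
import time

# POSIX only: CDLL(None) gives access to the C library symbols already
# loaded into the interpreter process.
libc = ctypes.CDLL(None)
libc.usleep.argtypes = [ctypes.c_uint]
libc.usleep.restype = ctypes.c_int


def run_in_threads(n_threads, microseconds):
    """Call usleep from several threads at once; return wall-clock seconds."""
    threads = [
        threading.Thread(target=libc.usleep, args=(microseconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_in_threads(4, 200_000)
print(f"4 threads x 0.2 s of native sleep took {elapsed:.2f} s")
```

If the GIL were held during each call, the four calls would serialize and take roughly 0.8 s; with it released they overlap. This shows only that the calls can run concurrently, not how fast any particular extension is.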
Hi, Anders,
The availability of a free extension every 30 days is a big benefit to the Python community that may not be immediately obvious. That’s not your standard freemium, as it has all the “features” of the paid product -- full registers, multicore, SIMD and other optimizations – so when we say it’s $600 per year of advanced software, that’s true. Our view is that the free extension every 30 days can make a huge difference to a developer with funding limitations (like us). If that free extension makes their project much more successful, then the entire Python community benefits.
We considered this as an open source project, but we haven’t done that for two reasons:
1. We have looked for and not found a large enough community of volunteers who have the skills to translate Python directly to assembly language without intermediate representation, and optimize the instructions, to make it open source.
2. Open source projects are often very underfunded and don’t have enough volunteers even from a larger pool of possible people. For example, at PyCon 2019 Victor Stinner eloquently discussed the funding problems at python.org – a shrinking volunteer base and growing issues list.
If I am wrong and there is a large enough group with the requisite skills, then of course we’re very open to the idea of open source, but the technologies used are very leading edge. And again, if you view it with nuance, the $600 a year (12 extensions) could make a huge difference to an under-funded project, of which there are many!
Mark
I forgot to mention that I received private feedback that was useful with respect to the technical discussion: The core technologies used are discussed in detail in the first five blog entries -- that is the heart of the project. Mark
How is your approach different from, say, Cython, Nuitka or Pythran? Regards Antoine. On Sun, 08 Sep 2019 15:56:04 -0000 "Mark @pysoniq" <mark@pysoniq.com> wrote:
Hi, Anders,
The availability of a free extension every 30 days is a big benefit to the Python community that may not be immediately obvious. That’s not your standard freemium, as it has all the “features” of the paid product -- full registers, multicore, SIMD and other optimizations – so when we say it’s $600 per year of advanced software, that’s true. Our view is that the free extension every 30 days can make a huge difference to a developer with funding limitations (like us). If that free extension makes their project much more successful, then the entire Python community benefits.
We considered this as an open source project, but we haven’t done that for two reasons:
We have looked for and not found a large enough community of volunteers who have the skills to translate Python directly to assembly language without intermediate representation, and optimize the instructions, to make it open source.
Open source projects are often very underfunded and don’t have enough volunteers even from a larger pool of possible people. For example, at PyCon 2019 Victor Stinner eloquently discussed the funding problems at python.org – a shrinking volunteer base and growing issues list.
If I am wrong and there is a large enough group with the requisite skills, then of course we’re very open to the idea of open source, but the technologies used are very leading edge. And again, if you view it with nuance, the $600 a year (12 extensions) could make a huge difference to an under-funded project, of which there are many!
Mark
Also PyPy and Numba.
Cython actually seems a bit different. Without annotations in a superset language, Cython programs mostly just use the same CPython runtime libraries. However, with a few type annotations sprinkled in (but not actual Python syntax), it can get big speedups.
PyPy actually tried to do direct-to-machine-code for a while. But my understanding is that they decided, as did Numba, that building on top of the work of LLVM was more effective (and more cross-architecture).
Obviously, the ad for a commercial product leaves a bad taste in my mouth. But it's also not like there aren't already 5 or more open source projects that do a similar thing better already.
On Sun, Sep 8, 2019, 12:03 PM Antoine Pitrou <solipsis@pitrou.net> wrote:
How is your approach different from, say, Cython, Nuitka or Pythran?
Regards
Antoine.
For example mypyc does this. On Sun, Sep 8, 2019 at 17:29 David Mertz <mertz@gnosis.cx> wrote:
Also PyPy and Numba.
Cython actually seems a bit different. Without annotations in a superset language, Cython programs mostly just use the same CPython runtime libraries. However, with a few type annotations sprinkled in (but not actual Python syntax), it can get big speedups.
PyPy actually tried to do direct-to-machine-code for a while. But my understanding is that they decided, as did Numba, that building on top of the work of LLVM was more effective (and more cross-architecture).
Obviously, the ad for a commercial product leaves a bad taste in my mouth. But it's also not like there aren't already 5 or more open source projects that do a similar thing better already.
-- --Guido (mobile)
Hi, Guido,
I will distinguish PysoniQ from mypyc, based on what's available from mypyc at this time (at GitHub).
Mypyc "compiles mypy-annotated, statically typed Python modules into CPython C extensions" whereas PysoniQ does not require any static typing.
"Mypyc compiles what is essentially a Python language variant using 'strict' semantics" whereas PysoniQ is not a language variant, and has no special semantics.
"Mypyc supports a subset of Python," whereas PysoniQ is not a subset and uses only standard CPython code, without alteration.
Mypyc compiles to C and uses a C compiler, whereas PysoniQ translates directly to assembly language and requires no third-party compiler.
Translating a single language directly to assembly gives the best optimizations because most of the commonly used compilers, like GCC, LLVM and Clang, use an intermediate language intended for many languages, and compile to a number of target architectures. There can be a performance price for this, for which direct translation is the best solution.
I don't see any metrics on the mypy page at GitHub, so I can't speak about any difference in speed.
While PysoniQ only targets Intel/AMD x86-64 because it's by far the most dominant platform now, we plan to support ARMv8 as soon as possible (assuming it takes off). That means we need instructions for each architecture, but we want the best performance possible so we're prepared to do that when competing architectures like ARMv8 become viable.
Regards,
Mark
Mark @pysoniq wrote:
Translating a single language directly to assembly gives the best optimizations because most of the commonly used compilers, like GCC, LLVM and Clang, use an intermediate language intended for many languages, and compile to a number of target architectures.
To take advantage of that, you need to find some optimisations that are made possible by the fact that you're compiling Python in particular to x86 in particular. That's something else that would be interesting to hear about. -- Greg
It very much sounds like marketing hype to repeat this "direct to assembly" thing so much. Essentially it's claiming they are better at writing optimizers than the more numerous authors of GCC, LLVM, etc. That's not inconceivable, but it's a bold claim requiring strong evidence.
Thanks, Antoine, for pointing me in the right direction about PyPy. I knew they experimented with LLVM, but thought that avenue was more of a success. Indeed PyPy directly generates its machine code, so maybe that approach is a good one. But PyPy also had a decade or more of effort behind it to get as good as it is (and it's open source).
On Mon, Sep 9, 2019, 4:08 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Mark @pysoniq wrote:
Translating a single language directly to assembly gives the best optimizations because most of the commonly used compilers, like GCC, LLVM and Clang, use an intermediate language intended for many languages, and compile to a number of target architectures.
To take advantage of that, you need to find some optimisations that are made possible by the fact that you're compiling Python in particular to x86 in particular. That's something else that would be interesting to hear about.
-- Greg
I'm surprised no one has mentioned Psyco yet -- probably because it evolved into PyPy -- but IIRC, Psyco was pretty much the same as what the OP is talking about -- direct Python to machine code, and easy on the fly or ahead of time compilation.
If I recall, the "magic" for dealing with a dynamic language was that it was a "specializing" compiler -- if you call a function with, e.g. two integers as input, it could compile a special version that only worked with two integers, and thus could be native fast. (Armin even argued that it *could* be faster than C :-) ).
Not sure if this tool is doing anything like that, but it seems there are some pretty big limits as to what can be done without either a JIT (like PyPy), or type annotations (like Cython).
-CHB
On Mon, Sep 9, 2019 at 5:37 AM David Mertz <mertz@gnosis.cx> wrote:
It very much sounds like marketing hype to repeat this "direct to assembly" thing so much. Essentially it's claiming they are better at writing optimizers than the more numerous authors of GCC, LLVM, etc. That's not inconceivable, but it's a bold claim requiring strong evidence.
Thanks, Antoine, for pointing me in the right direction about PyPy. I knew they experimented with LLVM, but thought that avenue was more of a success. Indeed PyPy directly generates its machine code, so maybe that approach is a good one. But PyPy also had a decade or more of effort behind it to get as good as it is (and it's open source).
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Mon, 9 Sep 2019 11:04:22 -0700 Christopher Barker <pythonchb@gmail.com> wrote:
I'm surprised no one has mentioned Psyco yet -- probably because it evolved into PyPy -- but IIRC, Psyco was pretty much the same as what the OP is talking about -- direct Python to machine code, and easy on the fly or ahead of time compilation.
That's a good point.
If I recall, the "magic" for dealing with a dynamic language was that it was a "specializing" compiler -- if you call a function with, e.g. two integers as input, it could compile a special version that only worked with two integers, and thus could be native fast. (Armin even argued that it *could* be faster than C :-) ).
Not sure if this tool is doing anything like that, but it seems there are some pretty big limits as to what can be done without either a JIT (like PyPy), or type annotations (like Cython).
Psyco *was* a JIT, just a rather primitive one compared to other efforts such as PyPy. Also it was CPython-based, which led to various difficulties. Regards Antoine.
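The "specializing" idea Christopher describes can be sketched in pure Python: keep one compiled version per combination of argument types and dispatch on the types seen at call time. This is only an illustration of the dispatch-and-cache structure, not Psyco's implementation; `compile_for` is a hypothetical stand-in that returns the original function where a real specializer would emit type-specific machine code.

```python
from typing import Callable, Dict, List, Tuple

TypeKey = Tuple[type, ...]

# Records every (function name, argument types) pair that gets "compiled".
compile_log: List[Tuple[str, TypeKey]] = []


def compile_for(func: Callable, arg_types: TypeKey) -> Callable:
    """Hypothetical code generator: a real one would emit native code
    valid only for these argument types. Here it just logs the request."""
    compile_log.append((func.__name__, arg_types))
    return func


def specializing(func: Callable) -> Callable:
    """Dispatch each call to a version specialized for its argument types."""
    cache: Dict[TypeKey, Callable] = {}

    def dispatcher(*args):
        key = tuple(type(a) for a in args)
        specialized = cache.get(key)
        if specialized is None:
            # First call with this type combination: compile and remember it.
            specialized = cache[key] = compile_for(func, key)
        return specialized(*args)

    return dispatcher


@specializing
def add(a, b):
    return a + b


add(1, 2)
add(3, 4)      # reuses the (int, int) version
add(1.5, 2.5)  # triggers a second specialization for (float, float)
print(compile_log)
```

The types are only known when the function is actually called, which is why this technique is naturally a JIT one, as Antoine notes.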
I have only read the posts on this thread, but the description sounded more like an AOT compiler (like Cython, Pythran, Nuitka) than a JIT compiler (like PyPy or Numba).
Regards
Antoine.
PS: PyPy has its own codegen AFAIK, it doesn't use LLVM.
On Sun, 8 Sep 2019 12:28:45 -0400 David Mertz <mertz@gnosis.cx> wrote:
Also PyPy and Numba.
Cython actually seems a bit different. Without annotations in a superset language, Cython programs mostly just use the same CPython runtime libraries. However, with a few type annotations sprinkled in (but not actual Python syntax), it can get big speedups.
PyPy actually tried to do direct-to-machine-code for a while. But my understanding is that they decided, as did Numba, that building on top of the work of LLVM was more effective (and more cross-architecture).
Obviously, the ad for a commercial product leaves a bad taste in my mouth. But it's also not like there aren't already 5 or more open source projects that do a similar thing better already.
Hi, David,
In several other posts here, I have distinguished PysoniQ from the open source projects mentioned. It has much better ease of use and faster published metrics than the other projects mentioned. The faster metrics should not be surprising when technologies like LLVM are cut out of the loop and you translate Python directly to assembly. The translate "directly" is critical for performance.
When you say, "But it's also not like there aren't already 5 or more open source projects that do a similar thing better already" can you be specific and provide examples of how they are better?
Thanks
Mark
On 2019-09-08 11:17, Mark @pysoniq wrote:
Hi, David,
In several other posts here, I have distinguished PysoniQ from the open source projects mentioned. It has much better ease of use and faster published metrics than the other projects mentioned. The faster metrics should not be surprising when technologies like LLVM are cut out of the loop and you translate Python directly to assembly. The translate "directly" is critical for performance.
When you say, "But it's also not like there aren't already 5 or more open source projects that do a similar thing better already" can you be specific and provide examples of how they are better?
It's great that you are responding to all these nitpicks about what PysoniQ does and doesn't do and how it is or isn't better than other things. But I still haven't seen your response to Chris Angelico's question, which I'll reiterate here: what is your "idea"? You just keep posting about the features of PysoniQ without explaining why you've even brought it up on this list. What action are you intending anyone here to take? Are you planning for PysoniQ to be incorporated into core Python?
-- Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
Hi, Brendan,
In another post I said we considered making this open source, but our approach is new and unique and we don't think there is a rich volunteer community with the skills to translate Python directly to assembly language. I also talked about the unfortunate funding problems for open source projects -- including Python.org -- referring to Victor's PyCon presentation in 2019. It scared us about relying on open source funding to keep the project going without any real volunteer community.
We plan to donate 5-10% of revenues to Python.org because we are very concerned about Python.org's funding problems, but we haven't said that on our site because it could sound presumptuous, if not arrogant, for a project just going into beta. The revenue sharing is similar to what we saw with JetBrains, where they recently shared 5% of revenue with Python.org for a 30-day period on sales of PyCharm.
You asked why we posted this here for a product that is not open source. We wanted developer feedback, and so far we have gotten very useful feedback. We also wanted to see how it would be received by advanced Python developers. Your suggestion of incorporating it into CPython is very intriguing. We are open to all suggestions.
Most of the comments asked me to distinguish PysoniQ from the several available open source projects. I hope I have succeeded in answering those questions. There is a general implication that these are extravagant claims, but this project grew out of a previous project with APL translated directly to assembly, that involved more than 30,000 hours of development and testing. We never released it publicly because APL has a very small base.
Regards,
Mark
On Sun, 08 Sep 2019 18:17:17 -0000 "Mark @pysoniq" <mark@pysoniq.com> wrote:
Hi, David,
In several other posts here, I have distinguished PysoniQ from the open source projects mentioned. It has much better ease of use and faster published metrics than the other projects mentioned. The faster metrics should not be surprising when technologies like LLVM are cut out of the loop and you translate Python directly to assembly. The translate "directly" is critical for performance.
This kind of statement raises eyebrows. Nowadays compiler backends like LLVM are able to perform extremely potent optimizations (probably much more potent than any niche tool like pysoniq or Cython will ever manage on their own). Stating that the way you achieve good performance is by avoiding LLVM (or another high-level compiler backend such as gcc) and going straight to assembly is extremely peculiar, to say the least.
Of course, you may have achieved a major breakthrough on your own and leapfrogged proven compiler technology. But it's reasonable to remain a bit skeptical at this point.
Regards
Antoine.
Hi, Antoine,
Cython requires the end user to rewrite the module to be compiled in a pseudo-C language. With PysoniQ, there is no need to rewrite Python source code.
Nuitka compiles to a C program, and there is more workflow intrusion. With Nuitka, you will need to install a C compiler and go through a manual compilation step that is not necessary with PysoniQ.
According to Nuitka's website, Nuitka is "somewhat faster than CPython" and its current performance metric is 312% faster -- about 3 times faster. Our slowest metric is more than twice that, and our average unoptimized metric is more than 20 times faster than CPython. The average optimized metric is over 40 times faster.
Nuitka also mentions dependency issues with Windows. PysoniQ has no dependencies.
The metrics PysoniQ has achieved are largely due to Python-specific direct translation to assembly language without intermediate representation. Our goal was to give Python the fastest possible metrics with only minimal work for the end user. See the metrics at https://pysoniq.com/text_2.htm
Finally, with both of these, you need to do more than just point and click.
I will respond to Pythran in a moment.
Mark
With all due respect, your description sounds more like marketing than actual technical data. You should provide a detailed technical of your solution, otherwise this is off-topic on this mailing-list. Also, performance numbers without a detailed description of what's exactly measured (including code for the benchmark) are useless. Finally, Cython does not require any rewriting, and annotations are optional. There is no need to spread misconceptions about your competitors. Regards Antoine. On Sun, 08 Sep 2019 16:51:51 -0000 "Mark @pysoniq" <mark@pysoniq.com> wrote:
Hi, Antoine,
Cython requires the end user to rewrite the module to be compiled in a pseudo-C language. With PysoniQ, there is no need to rewrite Python source code.
Nuitka compiles to a C program, and there is more workflow intrusion. With Nuitka, you will need to install a C compiler and go through a manual compilation step that is not necessary with PysoniQ.
According to Nuitka's website, Nuitka is "somewhat faster than CPython" and its current performance metric is 312% faster -- about 3 times faster. Our slowest metric is more than twice that, and our average unoptimized metric is more than 20 times faster than CPython. The average optimized metric is over 40 times faster.
Nuitka also mentions dependency issues with Windows. PysoniQ has no dependencies.
The metrics PysoniQ has achieved are largely due to Python-specific direct translation to assembly language without intermediate representation. Our goal was to give Python the fastest possible metrics with only minimal work for the end user. See the metrics at https://pysoniq.com/text_2.htm
Finally, with both of these, you need to do more than just point and click.
I will respond to Pythran in a moment.
Mark _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/NB7BHH... Code of Conduct: http://python.org/psf/codeofconduct/
Antoine, In response to your comment: "Finally, Cython does not require any rewriting, and annotations are optional." With Cython the end user does need to modify the code by inserting C type definitions like this:
    def primes(int nb_primes):
        cdef int n, i, len_p
        cdef int p[1000]
Mark
On Sun, 08 Sep 2019 17:20:30 -0000 "Mark @pysoniq" <mark@pysoniq.com> wrote:
Antoine,
In response to your comment: "Finally, Cython does not require any rewriting, and annotations are optional."
With Cython the end user does need to modify the code by inserting C type definitions like this:
No. Regards Antoine.
Most of the examples I have seen in the Cython documentation have a cdef header and many also have cdefs for typing. Although optional, here is a quote from "https://www.quora.com/How-fast-is-Cython" which indicates that typing will make a big difference in Cython performance: "Just simply converting to Cython is rarely much of a win. Some tight loops with little type checking might get a boost of tens of percents, but don't bet on it. Declaring the variables as cdefs around a loop can be a big change." Even in cases where cdef is optional, there are still manual compilation steps. You're probably familiar with that already, but here is from one part of the docs: "Now following the steps for the Hello World example we first rename the file to have a .pyx extension, lets say fib.pyx, then we create the setup.py file. Using the file created for the Hello World example, all that you need to change is the name of the Cython filename, and the resulting module name . . . ". The next step is to "build the extension with the same command used for the helloworld.pyx: $ python setup.py build_ext --inplace" PysoniQ does not require these manual steps, and does not need typings or cdef headers to get optimum speed. Thanks, Mark
Antoine, In response to "You should provide a detailed technical of your solution." The automatically created ctypes wrapper is one of the keys of the project. Blog entries 1 & 2 are a very detailed and technical discussion of the ctypes wrapper. If you go to Speed Metrics and read over the first entry -- Complex Calc -- you will find the Python source code as well as the assembly language output (Download pdf of Complex_Calc_asm). So you have the Python source. You have the 575 lines of NASM (assembly language code). In addition, blog entries 3-5 discuss optimization in detail, and precisely how Complex_Calc was sped up to run 63 times faster than Python. It is quite detailed and technical! Also, all of the other metrics have the Python source code. Mark
On Sun, 08 Sep 2019 17:27:27 -0000 "Mark @pysoniq" <mark@pysoniq.com> wrote:
Antoine,
In response to "You should provide a detailed technical of your solution."
The automatically created ctypes wrapper is one of the keys of the project. Blog entries 1 & 2 are a very detailed and technical discussion of the ctypes wrapper.
I don't think the ctypes wrapper in itself is very interesting. What is interesting is the underlying compilation technology, its assumptions and limitations. Also, when you mention blog posts or Web pages, please make it easier for the reader by providing hyperlinks. I don't know where to find those blog entries. Regards Antoine.
"I don't think the ctypes wrapper in itself is very interesting." Well, we disagree on that! I think that automatic generation of a ctypes wrapper to connect Python to assembly is interesting and a huge timesaver. "I don't know where to find those blog entries." The blogs can be reached directly at: https://pysoniq.com/text_16.htm and there is a link "Blog" on the home page. That link should light up when you go to the link I've just provided. If you read the 3rd & 4th entries -- Assembly Optimizations in Complex Calculations — Part 1 of 2 and Assembly Optimizations in Complex Calculations — Part 2 of 2, those blog posts contain a step-by-step breakdown of translating Python source to assembly language. And the NASM source is linked there and is also available at the Resources link, so you can follow along between the Python source and the assembly listing. Your questions are helpful! Mark
On Mon, Sep 9, 2019 at 4:12 AM Mark @pysoniq <mark@pysoniq.com> wrote:
"I don't think the ctypes wrapper in itself is very interesting."
Well, we disagree on that! I think that automatic generation of a ctypes wrapper to connect Python to assembly is interesting and a huge timesaver.
"I don't know where to find those blog entries."
The blogs can be reached directly at: https://pysoniq.com/text_16.htm and there is a link "Blog" on the home page. That link should light up when you go to the link I've just provided.
If you read the 3rd & 4th entries -- Assembly Optimizations in Complex Calculations — Part 1 of 2 and Assembly Optimizations in Complex Calculations — Part 2 of 2, those blog posts contain a step-by-step breakdown of translating Python source to assembly language. And the NASM source is linked there and is also available at the Resources link, so you can follow along between the Python source and the assembly listing.
Your questions are helpful!
Redirecting this to python-list as this is not discussing any actual Python language or interpreter ideas. Please remove python-ideas from cc for future replies. Your blog breaks the browser's Back button. Please don't do that; if you want people to read your content, make it easy for us to do so. How does your assembly translation cope with the possibility that "round(h, 2)" might not call the Python standard library round() function? You go to great lengths to show how you can optimize a specific type of calculation that appears to be specifically relating to floating-point math, including a number of steps that are basically the same as a peephole optimizer could do - or, for that matter, any programmer could easily do manually (I don't see a lot of loops in the real world that go "for a in n" and then "v0 = a" with no other use of a). What would happen if integers were involved? Remember, Python's integers can happily grow beyond any CPU register - will your transformation maintain the semantics of Python, or will it assume that everything is floating-point?
This line is self-explanatory. It takes the input array "n" and loops through each data point. The input array "n" is 64-bit double-precision floating point.
How do you ensure that this is an array of 64-bit floating-point values? ChrisA
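Both questions above can be made concrete with a short sketch. This is not PysoniQ code; `calc` and `shadow_round` are hypothetical names chosen for illustration. It shows that `round` is looked up by name each time the call runs, and that Python integers stay exact past the range of a signed 64-bit register, so a translator that hard-wires either assumption would presumably need a run-time check or a fallback path.

```python
import builtins

def calc(h):
    # 'round' is resolved when this call executes, not when calc is defined.
    return round(h, 2)

normal = calc(3.14159)

def shadow_round(value, ndigits=None):
    # Hypothetical replacement standing in for any user rebinding of round().
    return "shadowed"

original_round = builtins.round
builtins.round = shadow_round
try:
    # Same source line inside calc, different behaviour.
    shadowed = calc(3.14159)
finally:
    builtins.round = original_round

# Python integers grow as needed instead of wrapping around.
big = (2**63 - 1) + 1
fits_in_signed_64 = -(2**63) <= big < 2**63
```

Here `normal` is the usual rounded float, `shadowed` is whatever the replacement returns, and `fits_in_signed_64` is false because the sum has already left the signed 64-bit range.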
Chris, In the ctypes wrapper we perform this test on entry:
    if (type(main_loop[0]) != float):
        # return an error message
That's an imperfect test because lists can contain mixed types. To test an entire array would be a large performance penalty, so the developer must be sure that data passed in are all 64-bit float. Typically data passed into a ctypes wrapper are read from a data store because the arrays are large. When read from a data store, the developer specifies the data type like this:
    newarray = (ctypes.c_double * len(arr))(*arr)
So that ensures all data are of the same type. Mark
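The trade-off described in this message can be sketched with the standard `ctypes` module alone. The helper names below (`first_element_is_float`, `all_floats`, `to_c_double_array`) are hypothetical and are not taken from the generated wrapper; the sketch only contrasts a first-element check, a full scan, and conversion to a `c_double` array.

```python
import ctypes

def first_element_is_float(values):
    # Mirrors the entry test quoted above: only element 0 is inspected.
    return type(values[0]) is float

def all_floats(values):
    # Full scan: catches mixed lists, at the cost of touching every element.
    return all(type(v) is float for v in values)

def to_c_double_array(values):
    # Builds a contiguous array of C doubles; ints are converted,
    # and non-numeric items raise TypeError during construction.
    return (ctypes.c_double * len(values))(*values)

mixed = [1.0, 2, "three"]
passes_quick_check = first_element_is_float(mixed)
passes_full_check = all_floats(mixed)

converted = list(to_c_double_array([1.0, 2, 3.5]))

try:
    to_c_double_array(mixed)
    conversion_failed = False
except TypeError:
    conversion_failed = True
```

The mixed list passes the quick check but fails the full scan, and the `c_double` conversion rejects it outright, which is why converting on the Python side gives a uniform element type before the extension is called.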
I read those two blog posts, and found very few technical details. On Sun, Sep 8, 2019, 1:30 PM Mark @pysoniq <mark@pysoniq.com> wrote:
Antoine,
In response to "You should provide a detailed technical of your solution."
The automatically created ctypes wrapper is one of the keys of the project. Blog entries 1 & 2 are a very detailed and technical discussion of the ctypes wrapper.
If you go to Speed Metrics and read over the first entry -- Complex Calc -- you will find the Python source code as well as the assembly language output (Download pdf of Complex_Calc_asm). So you have the Python source. You have the 575 lines of NASM (assembly language code). In addition, blog entries 3-5 discuss optimization in detail, and precisely how Complex_Calc was sped up to run 63 times faster than Python. It is quite detailed and technical!
Also, all of the other metrics have the Python source code.
Mark
David, It would be most helpful if you could provide an example of how they contain "few technical details." I ask because I was afraid they were too technical! Thanks, Mark
Mark @pysoniq wrote:
It would be most helpful if you could provide an example of how they contain "few technical details." I ask because I was afraid they were too technical!
Your blog posts about optimisation mostly talk about standard things that have been known for decades. E.g. "keep variables in registers" -- yes, of course, any compiler worth its salt will try to do that. Also, it's all very low-level stuff that applies equally to any language. You've said nothing about how you approach the unique challenges of compiling Python, which stem from the fact that you're trying to bridge the gap between a very high-level programming model and a very low-level one. How do you deduce when the program is dealing with things like ints and floats that are amenable to your low-level optimisations? Have you come up with some clever techniques that do a better job of this than other attempts, such as Nuitka? These are the kinds of details that would actually make people take an interest in your project. -- Greg
With respect to Pythran: Pythran is "for a subset of the Python language, with a focus on scientific computing." PysoniQ is not a subset of the Python language. Currently Pythran is for Python 2.7 and "has decent Python 3 support." PysoniQ is Python 3.x only, and goes through the most recent release of CPython. "Pythran depends on a few Python modules and several C++ libraries" which you will need to install manually. No installation is needed for PysoniQ. Pythran also requires some code rewriting. PysoniQ does not. With Pythran you also have to compile manually, but with PysoniQ that's not needed. That's a very brief summary of some of the differences between PysoniQ and Cython, Nuitka and Pythran. Mark
On Sun, 8 Sep 2019 at 18:14, Mark @pysoniq <mark@pysoniq.com> wrote:
Pythran is "for a subset of the Python language, with a focus on scientific computing." PysoniQ is not a subset of the Python language.
So, to confirm, your product runs the full Python test suite without any errors, and can run the pyperformance benchmark suite (see https://pyperformance.readthedocs.io/), again without errors or limitations? Assuming so, would you post a full log of a run of pyperformance, to provide an independent confirmation of your claims? Better still, maybe you could donate a copy of your product so that it can be hosted on speed.python.org and compared with other Python implementations? Otherwise, I'm afraid your claims sound very much like marketing, and as with any such extravagant claims, my assumption until proved otherwise is that there are some undisclosed limitations that make the product a lot less attractive in reality than the claims imply. Paul
Paul, We are just going into beta, so we are not yet in a position to run the tests you mentioned. When we go to release 1.0, we will be able to do that. You can verify one of the metrics with the posted Complex Calc assembly code, which is a pdf at the Resources link. Shortly I will be blogging on the Six Second Video (which is listed in Speed Metrics link on the left side of the home page), and it will have the assembly listing as well. As far as limitations, the Technical FAQs cover as many limitations as we know of now. For example, under "classes" we mention that we can't do a derived class unless the base class is in the same .py file. We also mention that classes without the "self" word are not supported. We don't currently support print statements or async io. There are other limitations in the Tech FAQs, but hopefully they are not significant. To be clear, this is a project in development and I posted it here to get developer feedback specifically from developers who have a real need to speed up their project. For some projects, speed is not important. PysoniQ is not currently a commercial product, so this was not intended as an advertisement. I wanted to get technical feedback, bearing in mind that for reasons stated earlier it's not open source and the source is not published. As mentioned earlier, we are open to the idea of open source if we can find a large enough community of volunteers with the skills to translate Python directly to assembly language without using an intermediate representation or a third party compiler like LLVM. I apologize for any misunderstanding. Mark
On Mon, Sep 9, 2019 at 5:06 AM Mark @pysoniq <mark@pysoniq.com> wrote:
PysoniQ is not currently a commercial product, so this was not intended as an advertisement. I wanted to get technical feedback, bearing in mind that for reasons stated earlier it's not open source and the source is not published. As mentioned earlier, we are open to the idea of open source if we can find a large enough community of volunteers with the skills to translate Python directly to assembly language without using an intermediate representation or a third party compiler like LLVM.
It sounds like this is off-topic for python-ideas, but an excellent subject for discussion over on python-list. Can we redirect this conversation there? I'd love to discuss more about the optimizations and limitations there. ChrisA
On Sun, 8 Sep 2019 at 20:05, Mark @pysoniq <mark@pysoniq.com> wrote:
Paul,
We are just going into beta, so we are not yet in a position to run the tests you mentioned. When we go to release 1.0, we will be able to do that.
Your claims are (currently) inaccurate, then. I suggest that you avoid making them until you can demonstrate their accuracy. Otherwise, you absolutely give the impression of doing a commercial sales pitch, which is not likely to get a positive reaction here (as you've seen).
As far as limitations, the Technical FAQs cover as many limitations as we know of now. For example, under "classes" we mention that we can't do a derived class unless the base class is in the same .py file. We also mention that classes without the "self" word are not supported. We don't currently support print statements or async io.
Well, not being able to handle print statements seems like a major problem. Ignoring the fact that print is no longer a statement, but simply a builtin function, and assuming that you mean you don't support IO (because there's very little other interpretation I can give to you not supporting the print function), I can't see how I could usefully write a project without being able to use print()...
There are other limitations in the Tech FAQs, but hopefully they are not significant.
That's for the people you're asking to get involved to judge. Listing them in an easy to locate place (and providing a direct link to the listing - not forcing people to navigate your website) is hardly too much to ask, surely?
To be clear, this is a project in development and I posted it here to get developer feedback specifically from developers who have a real need to speed up their project. For some projects, speed is not important.
I have a hobby project where speed is important (Monte Carlo simulation of games of chance). But without print, you're of no use to me. Even with print, I expect end users to code their game rules using Python, so *precisely* what the limitations of your interpreter are directly affects my user interface. I certainly have no intention of testing my code under your implementation if doing so is only likely to help you deliver a closed source product that I (or my friends) would need to license to run my program.
PysoniQ is not currently a commercial product, so this was not intended as an advertisement. I wanted to get technical feedback, bearing in mind that for reasons stated earlier it's not open source and the source is not published. As mentioned earlier, we are open to the idea of open source if we can find a large enough community of volunteers with the skills to translate Python directly to assembly language without using an intermediate representation or a third party compiler like LLVM.
Your current approach doesn't seem well judged if you're interested in attracting a "community of volunteers" :-( From another post:
"Mypyc supports a subset of Python," whereas PysoniQ is not a subset and uses only standard CPython code, without alteration.
You keep saying this. OK. Here's a simple Python program. Do you compile it to correct assembly code that gives the right answer?
assert ((-80538738812075974)**3 + 80435758145817515**3 +12602123297335631**3) == 42
How about this one:
    import math
    import sys
    def f(): return 42
    ...
    sys.modules['math'].sin = f
    assert math.sin() == 42
If you want to claim that you don't support these "yet", when will you? Until you do, please stop claiming that your product "is not a subset". Paul
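For readers who want to try the two challenge programs, here is a minimal sketch that runs them under an ordinary Python 3 interpreter. The variable names and the try/finally restore of `math.sin` are additions for illustration and are not part of the original post. The sketch also shows why the first program is hard for a translator that assumes 64-bit doubles: the same sum evaluated in floating point no longer equals 42.

```python
import math
import sys

terms = (-80538738812075974, 80435758145817515, 12602123297335631)

# Exact arbitrary-precision arithmetic, as Python defines it.
exact = terms[0]**3 + terms[1]**3 + terms[2]**3

# Each cube is far wider than a 64-bit register.
widest = max(abs(t**3).bit_length() for t in terms)

# The same expression with every operand forced to a 64-bit double.
as_floats = float(terms[0])**3 + float(terms[1])**3 + float(terms[2])**3

def f():
    return 42

# Rebind math.sin at run time, then restore it so later code is unaffected.
original_sin = math.sin
sys.modules['math'].sin = f
try:
    patched_result = math.sin()
finally:
    math.sin = original_sin
```

Both programs depend on run-time behaviour, unbounded integers in one case and late binding of module attributes in the other, that cannot be read off the source text alone.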
On Sun, Sep 8, 2019 at 10:47 PM Mark @pysoniq <mark@pysoniq.com> wrote:
Hello to the list,
I have an idea for Python that is non-traditional in that it doesn’t extend or modify existing Python language structure.
The idea uses Python to translate Python, entirely under program control, directly to optimized assembly language .dll or .so files, called “extensions.”
Extensions are called from Python using Python’s ctypes interface. The ctypes wrapper for each extension is created automatically.
The goal of this idea is for Python to perform as fast or faster than C or C++, without leaving Python.
Can you clarify here please: What exactly is the idea you're proposing? That CPython make use of your project for its own acceleration? That something be done in the language and/or interpreter to make your project more effective? If your project does already exist, there doesn't seem to be a proposal here - or else I'm just misreading this. ChrisA
I think the "proposal" is "people should give us money." :-) Yes, ads for commercial software do not belong on this list. On Sun, Sep 8, 2019, 12:08 PM Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Sep 8, 2019 at 10:47 PM Mark @pysoniq <mark@pysoniq.com> wrote:
Hello to the list,
I have an idea for Python that is non-traditional in that it doesn’t extend or modify existing Python language structure.
The idea uses Python to translate Python, entirely under program control, directly to optimized assembly language .dll or .so files, called “extensions.”
Extensions are called from Python using Python’s ctypes interface. The ctypes wrapper for each extension is created automatically.
The goal of this idea is for Python to perform as fast or faster than C or C++, without leaving Python.
Can you clarify here please: What exactly is the idea you're proposing? That CPython make use of your project for its own acceleration? That something be done in the language and/or interpreter to make your project more effective?
If your project does already exist, there doesn't seem to be a proposal here - or else I'm just misreading this.
ChrisA
participants (10)
- Anders Hovmöller
- Antoine Pitrou
- Brendan Barnwell
- Chris Angelico
- Christopher Barker
- David Mertz
- Greg Ewing
- Guido van Rossum
- Mark @pysoniq
- Paul Moore