Performance improvements via static typing

While I am aware of projects like Cython and mypy, it seems to make sense for CPython to allow optional enforcement of type hints, and to apply the compiler optimizations that such enforcement would enable. While this would not yield the same performance benefits as using ctypes directly, there do appear to be gains available here. My main concern is how to make the enforcement optional: for maximum benefit, it shouldn't prevent you from type hinting functions that you don't want treated this way. I don't have an answer for that yet, but I thought I'd toss the idea out there first. If it needs to be seen in action before deciding whether it makes sense to add, I can work on a potential implementation soon, but right now this is just an idea.

On Thu, Jul 19, 2018 at 6:52 AM Michael Hall <python-ideas@michaelhall.tech> wrote:
Just to make sure I understand: in other words, they would no longer be "hints" but "guarantees". This would allow an optimizer pass much greater latitude in code generation, somehow or other.

For purposes of illustration (this is not a proposal, just for clarification):

    @guaranteed_types
    def my_sqrt(x:c_double) -> c_double:
        ...

would tell the compiler that it's now possible to replace the general PyObject marshalling of this function with a pure-C one that only accepts doubles and woe be unto those who use it otherwise.
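To make the "guarantee" idea a little more concrete, here is a minimal pure-Python sketch of what enforcement alone might look like. The decorator name guaranteed_types comes from the illustration above, but this implementation is purely hypothetical: it uses the builtin float instead of c_double, it checks types at call time rather than at compile time, and it performs no optimisation at all.

```python
import functools
import inspect
import math


def guaranteed_types(func):
    """Hypothetical decorator: reject arguments whose type differs from the annotation."""
    sig = inspect.signature(func)
    skipped = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            param = sig.parameters[name]
            # Assumption: *args/**kwargs and unannotated parameters are not checked.
            if param.kind in skipped or param.annotation is inspect.Parameter.empty:
                continue
            # Exact type match only; subclasses are rejected on purpose.
            if type(value) is not param.annotation:
                raise TypeError(
                    f"{name} must be {param.annotation.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@guaranteed_types
def my_sqrt(x: float) -> float:
    return math.sqrt(x)
```

With this sketch, my_sqrt(4.0) returns 2.0, while my_sqrt(4) raises TypeError because an int is not exactly a float. Note that the check itself costs time on every call, so any speed gain would have to come from what a compiler does with the guarantee afterwards.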

You are aware of numba? https://numba.pydata.org/

Stephan

On Thu, 19 Jul 2018 at 16:03, Eric Fahlgren <ericfahlgren@gmail.com> wrote:

Michael Hall wrote on 19.07.2018 at 15:51:
Well, first of all, a C level type check at runtime is quite fast compared to a byte code dispatch with stack argument handling, and can be done regardless of any type hints. There are various code patterns that would suggest a certain type ("x.append()" probably appends to a list, "x.get()" will likely be a dict lookup, etc.), and that can be optimised for without static type declarations and even without any type hints.

Then, the mere fact that user code says "a: List" does not help in any way. Even "a: list" would not help, because any behaviour of that "list" might have been overridden by the list subtype that the function eventually receives.

The same applies to "x: float". Here, in order to gain speed, the compiler would have to generate two separate code paths through the entire function, one for C double computations, and one for Python float subtypes. And then exponentiate that by the number of other typed arguments that may or may not contain subtypes. Quite some overhead. It's unclear if the gain would have any reasonable relation to the effort in the end.

Sure, type hints could be used as a bare trigger for certain optimistic optimisations, but then, it's cheap enough to always apply these optimisations regardless, via a C type check.

Stefan
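The subtype problem described above can be sketched in plain Python. All names here (DoublingList, add_trusting_hint, add_with_type_check) are made up for illustration. The exact-type test stands in for the C level type check, and list.append(a, item) stands in for a specialised fast path; this is not how CPython or Cython actually implement it.

```python
class DoublingList(list):
    """A list subtype that changes what append() does."""

    def append(self, item):
        super().append(item * 2)


def add_trusting_hint(a: list, item):
    # Hypothetical "optimised" version that trusts the annotation and
    # calls the builtin list.append directly, bypassing any override.
    list.append(a, item)
    return a


def add_with_type_check(a: list, item):
    if type(a) is list:
        # Fast path: exactly the builtin list, so builtin semantics are safe.
        list.append(a, item)
    else:
        # Generic path: normal method lookup honours overridden behaviour.
        a.append(item)
    return a
```

Here add_trusting_hint(DoublingList(), 1) yields [1], silently skipping the override, while add_with_type_check(DoublingList(), 1) yields [2] as the subtype intends. The second version needs no annotation at all to pick its fast path, which is the point being made: the runtime check does the work, not the hint.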

A note here: earlier in the conversation about standardizing type hinting, I (among others) was interested in applying it to C-level static typing (e.g. Cython). Guido made it very clear that that was NOT a goal of type hints. Rather, they were to be used for dynamic, Python-style types, so a “list” is guaranteed to act like a list in Python code, but not to have the same underlying binary representation.

We could still use the same syntax for things like Cython, but the annotations would have a different meaning. And a JIT compiler may not benefit from the hints at all, as it would have to check the actual type at run-time anyway.

-CHB
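As a small illustration of why a run-time check would still be needed: CPython records annotations on the function object but does not enforce them when the function is called. The function name double is just an example.

```python
def double(x: int) -> int:
    return x * 2


# The annotations are stored as plain metadata on the function object.
print(double.__annotations__)

# Nothing stops a caller from passing a str; the call simply succeeds.
print(double("ab"))  # prints "abab"
```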

On Fri, Jul 20, 2018, 02:33 Stefan Behnel, <stefan_ml@behnel.de> wrote:
I'll also mention that my master's thesis disproved the benefit of type-specific opcodes for CPython: https://scholar.google.com/scholar?cluster=2007053175643839734&hl=en&as_sdt=0,5&sciodt=0,5

What this means is someone will have to demonstrate actual performance benefits before we talk about semantic changes or modifying the interpreter. IOW don't prematurely optimize for an optimization until it's actually been proven to be an optimization. 😉

participants (6):
- Brett Cannon
- Chris Barker - NOAA Federal
- Eric Fahlgren
- Michael Hall
- Stefan Behnel
- Stephan Houben