[pypy-dev] Differences performance Julia / PyPy on very similar codes
pierre.augier at univ-grenoble-alpes.fr
Sat Dec 26 17:23:14 EST 2020
----- Original Message -----
> From: "Carl Friedrich Bolz-Tereick" <cfbolz at gmx.de>
> To: "PIERRE AUGIER" <pierre.augier at univ-grenoble-alpes.fr>, "pypy-dev" <pypy-dev at python.org>
> Sent: Thursday 24 December 2020 07:06:43
> Subject: Re: [pypy-dev] Differences performance Julia / PyPy on very similar codes
> On 23.12.20 14:42, PIERRE AUGIER wrote:
>> I wrote another very simple benchmark that should not depend on
>> auto-vectorization. The bench function is:
>> def sum_x(positions):
>>     result = 0.0
>>     for i in range(len(positions)):
>>         result += positions[i].x
>>     return result
> This benchmark probably really shows the crux of the problem. In Python,
> the various Points instances (whether with lists, or with direct
> attributes) are vastly more complex beasts than the structs in Julia.
> There, you can declare a struct with a certain number of Float64 fields
> and be done. Thus, reading .x from such a struct is just a pointer read.
> In Python, due to dynamic typing, the ability to add more fields later
> and even the ability to change the class of an instance, the actual
> memory layout of a Point3D type is much more complex with various
> indirections and boxing. Reading .x out of such a thing is done in
> several steps:
> 1) check that positions[i] is an instance
> 2) check that it's an instance of Point3D
> 3) read its x field
> 4) check that the field is a float
> 5) read the float's value
> All of these steps involve a pointer read.
> Improving this situation is probably possible (there's even a paper on how
> to get rid of steps 1 and 2:
> https://www.csl.cornell.edu/~cbatten/pdfs/cheng-type-freezing-cgo2020.pdf but
> the work wasn't merged). But there are problems:
> - basically every single one of these steps needs to be addressed, and
> every one is its own optimization
> - it's extremely delicate to get the balance and the trade-offs right,
> because the object system is so central in getting good performance for
> Python code across a wide variety of areas (not just numerical algorithms).
> Another approach would indeed be (as you say in the other mail) to add
> support for telling PyPy explicitly that some list can contain only
> instances of a specific class and (more importantly) that a class is not
> to be considered to be "dynamic" meaning that its fields are fixed and
> of specific types. So far, we have not really gone in such directions,
> because that is language design and we leave that to the CPython devs ;-).
Thanks a lot Carl for your very interesting answers.
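To make the five steps quoted above concrete, here is a rough Python-level sketch of what reading positions[i].x involves. It is only an illustration under stated assumptions: the `Point3D` class and the `read_x_checked` helper are hypothetical names, and in a real interpreter or JIT these checks are internal guards, not explicit `type()` tests.

```python
class Point3D:
    """Hypothetical stand-in for the benchmark's point class."""

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z


def read_x_checked(positions, i):
    """Spell out, as explicit checks, the steps behind positions[i].x."""
    obj = positions[i]
    # Steps 1 and 2: the element must be an instance of the expected class.
    if type(obj) is not Point3D:
        raise TypeError("expected a Point3D instance")
    # Step 3: read the x field (a reference to a boxed object).
    boxed = obj.x
    # Step 4: check that the field holds a float.
    if type(boxed) is not float:
        raise TypeError("expected x to be a float")
    # Step 5: use the float's value.
    return boxed


def sum_x(positions):
    result = 0.0
    for i in range(len(positions)):
        result += read_x_checked(positions, i)
    return result
```

For example, sum_x([Point3D(1.0, 2.0, 3.0), Point3D(4.0, 5.0, 6.0)]) adds up the two x values, while a point whose x is an int fails at step 4.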
I'm wondering whether it would be possible to write an extension that improves the situation for this kind of numerical code.
I wrote a first description here https://github.com/paugier/nbabel/blob/master/py/vector.md (more about the Python API).
I think that if something like this extension existed and were very efficient with PyPy, it would greatly help in writing very efficient numerical code in "pure Python style". For the NBabel problem, the code would be very nice, and it seems to me that we could reach very good performance compared to Julia and other compiled languages. I would be very interested in getting some feedback on this proposal.
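As a rough illustration of the kind of fixed, typed layout discussed here (this is not the API described in vector.md), the standard-library `array` module already stores values as raw C doubles. The `Points` container below is a hypothetical sketch that keeps one such array per coordinate, so the x values sit in one homogeneous buffer and are not attributes of separate dynamic instances. How fast this actually is depends on the implementation.

```python
from array import array


class Points:
    """Hypothetical container: one array of C doubles per coordinate."""

    def __init__(self):
        self.xs = array("d")
        self.ys = array("d")
        self.zs = array("d")

    def append(self, x, y, z):
        # array("d") accepts only numbers and stores them as raw doubles.
        self.xs.append(x)
        self.ys.append(y)
        self.zs.append(z)

    def __len__(self):
        return len(self.xs)


def sum_x(points):
    result = 0.0
    xs = points.xs  # every element is known to be a double
    for i in range(len(xs)):
        result += xs[i]
    return result
```

The element type is fixed when the array is created, so appending a non-number to `xs` raises a TypeError. That is the "fields are fixed and of specific types" property, enforced by the container and not by the language.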
Do you think that HPy could be used to implement such an extension? Could such an extension be fully compatible with the PyPy JIT without any modification to PyPy?