[pypy-dev] New Python/PyPy extension for object oriented numerically intensive codes ?

Yury V. Zaytsev yury at shurup.com
Wed Jan 6 16:56:39 EST 2021

On Wed, 6 Jan 2021, PIERRE AUGIER wrote:

> A big issue IMHO with Cython is that Cython code is not compatible with 
> Python and can't be interpreted. So we lose the advantage of an 
> interpreted language in term of development. One small change in this 
> big extension and one needs to recompile everything.

That's a valid point to a certain extent. However, in my experience, I 
was always somehow able to extract individual small functions into 
mini-modules, and then I wrote some Makefile / setuptools glue to 
automate chained recompilation of whatever parts had changed, whenever 
I ran the unit tests or the command line interface, so recompilation 
only kept annoying me until I got the magic to work :-)

> For me, debugging is really harder (maybe because I'm not good at 
> debugging native codes). Moreover, actually one needs to know (a bit of) 
> C to write efficient Cython code so that it's difficult for some 
> contributors to understand/develop Cython extensions.

I must admit that I never needed to debug anything because I was doing 
TDD in the first place, but you are probably right - debugging generated 
monster code must be quite scary compared to pure Python code with full 
IDE support from something like PyCharm.

Anyways, call me a chauvinist, but I'd say it's just a sad fact of life 
that you need to know a thing or two about writing correct low-level, 
performance-oriented numeric code.

I assume you know this anyway, and I'm sure that your worked-up 
summation example below was just meant to make a completely different 
point, but as a matter of fact, in your code the worst-case error grows 
proportionally to the number of elements in the vector (N), and the RMS 
error grows proportionally to the square root of N for random inputs, 
so the results of your computations are going to be accordingly pretty 
random in the general case ;-)
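To make the error-growth point concrete, here is a minimal sketch (not 
from the original thread): a naive left-to-right loop accumulates 
rounding error that grows with N, whereas compensated (Kahan) summation 
keeps it essentially constant, and math.fsum provides a correctly 
rounded reference to measure both against.

```python
import math
import random

def naive_sum(xs):
    # Plain left-to-right accumulation: each addition can drop
    # low-order bits, and the errors can pile up proportionally to N.
    total = 0.0
    for x in xs:
        total += x
    return total

def kahan_sum(xs):
    # Compensated summation: carry the rounding error of each step in
    # a separate correction term, so the total error stays O(1) in N.
    total = 0.0
    c = 0.0  # running compensation for lost low-order bits
    for x in xs:
        y = x - c
        t = total + y
        c = (t - total) - y
        total = t
    return total

random.seed(42)
xs = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

exact = math.fsum(xs)  # correctly rounded reference sum
naive_err = abs(naive_sum(xs) - exact)
kahan_err = abs(kahan_sum(xs) - exact)
```

On typical runs the compensated sum matches math.fsum to the last bit, 
while the naive loop drifts by a small but measurable amount.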

Where I'm going with this is that people who do this kind of stuff are 
somehow not bothered by Cython's problems, while people who don't are 
rightfully bothered by valid issues - but if they are going to be 
helped, will it help their cause :-) ? Who knows...

On top of that, again, there is the whole MPI story. I used to write 
Python stuff that scaled to hundreds of thousands of cores. I still 
did SIMD inside OpenMP threads on the local nodes on top of that just for 
kicks, but actually I could have achieved a factor of 4x speedup just by 
scheduling my jobs overnight with 4x cores instead and saved myself the 
trouble. But I wanted trouble, because it was fun :-)

Cython and mpi4py make MPI almost criminally easy in Python, so once 
you get this far, the question arises: does 2x or 4x on the local node 
actually matter at all?

> So my questions are: Is it technically possible to extend Python and 
> PyPy to develop such extension and make it very efficient? Which tools 
> should be used? How should it be written?

It is absolutely technically possible and, as far as I'm concerned, a 
good idea, but I think that the challenge lies in developing conventions 
for the semantics and getting people to accept them. I think that the 
zoo of various accelerators / compilers / boosters for Python only 
proves the point that this must be the hard part.

As for a backing buffer access mechanism, cffi is definitely the right 
tool - PyPy can already "see through" it, as you've proven with your 
small example.
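As a minimal sketch of what "seeing through" cffi means (assuming cffi 
is installed, which it is by default on PyPy): the backing buffer is a 
raw C array allocated through ffi.new, and element accesses on it are 
plain memory loads/stores that PyPy's JIT can trace straight through 
into tight native code, instead of operating on boxed Python floats.

```python
from cffi import FFI

ffi = FFI()

# Allocate a C double[1000] as the backing buffer; the memory is owned
# by the returned cdata object and freed when it is garbage-collected.
n = 1000
buf = ffi.new("double[]", n)

# Fill and reduce the buffer with ordinary indexing; on PyPy these
# accesses JIT-compile to direct loads/stores on the raw C array.
for i in range(n):
    buf[i] = i * 0.5

total = 0.0
for i in range(n):
    total += buf[i]
```

Only the multiples of 0.5 involved here are exactly representable, so 
total comes out to exactly 0.5 * (999 * 1000 / 2) = 249750.0.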

Sincerely yours,
Yury V. Zaytsev
