[pypy-dev] Questions on the pypy+numpy project

Tue Oct 18 11:20:04 CEST 2011

On Mon, Oct 17, 2011 at 10:18 PM, David Cournapeau <cournape at gmail.com> wrote:
> On Mon, Oct 17, 2011 at 8:40 PM, Maciej Fijalkowski <fijall at gmail.com> wrote:
>> On Mon, Oct 17, 2011 at 7:20 PM, David Cournapeau <cournape at gmail.com> wrote:
>>> On Mon, Oct 17, 2011 at 2:22 PM, Michael Foord <fuzzyman at gmail.com> wrote:
>>>
>>>>
>>>> Travis' post seems to suggest that it is the responsibility of the *pypy*
>>>> dev team to do the work necessary to integrate the numpy refactor (initially
>>>> sponsored by Microsoft). That refactoring (smaller numpy core) seems like a
>>>> great way forward for numpy - particularly if *it* wants to play well with
>>>> multiple implementations, but it is unreasonable to expect the pypy team to
>>>> pick that up!
>>>
>>> I am pretty sure Travis did not intend to suggest that (I did not
>>> understand that from his wording, but maybe that's because we have
>>> discussed that topic in person several times already).
>>>
>>> There are a lot of reasons to do that refactor that have nothing to do
>>> with pypy, so the idea is more: let's talk about what pypy would need
>>> to make this refactor beneficial for pypy *as well*. I (and others)
>>> have advocated using more Cython inside numpy and scipy. We could
>>> share resources to do that.
>>
>> I think alex's question was whether the refactoring is going to be
>> merged upstream or not (and what's the plan).
>
> I don't know if the refactoring will be merged as is, but at least I
> think the refactoring needs to happen, independently of pypy. There is
> no denying that parts of numpy's code are crufty, some stuff not
> clearly separated, etc...
>
>> I don't think you understand our point.
>
> I really do. I understand that pypy is a much better platform than
> cpython for lazy evaluation and fast pure-Python ufuncs. Nobody denies
> that. To be even clearer: if the goal is to have some concept of array
> which looks like numpy, then yes, using numpy's code is useless.
>
>> Reusing the current numpy
>> implementation is not giving us much *even* if it was all Cython and
>> no C API.
>
> This seems to be the source of the disagreement: I think reusing numpy
> means that you are much more likely to be able to run the existing
> scripts using numpy on top of pypy. So my question is whether the
> disagreement is on the value of that, or whether the pypy community
> generally thinks they can rewrite a "numpypy" which is a drop-in
> replacement for numpy on cpython without using the original numpy code.
>

Ok

Reusing numpy may indeed make it more likely that existing code runs,
but we'll take care to be compatible (same as with Python as a language,
actually).

Reusing the CPython C API parts of numpy, however, does mean that we
nullify all the good parts of pypy - this is entirely pointless from
my perspective. I can't see how you can both get the JIT running nicely
and reuse most of numpy. You have to sacrifice something, and I would
be willing to sacrifice code reuse.

Indeed, you would end up with two numpy implementations, but it's not
like numpy is changing that much after all. We can provide a Cython or
some other sort of API to integrate with the existing legacy code later,
but the point stays - I can't see a plan for using the cool parts of pypy
and numpy together. This is a question of what is harder - writing a
reasonable JIT or writing numpy. I would say numpy, and you guys seem
to say the JIT.

Cheers,
fijal