[pypy-dev] Questions on the pypy+numpy project

Mon Oct 17 22:18:32 CEST 2011

On Mon, Oct 17, 2011 at 8:40 PM, Maciej Fijalkowski <fijall at gmail.com> wrote:
> On Mon, Oct 17, 2011 at 7:20 PM, David Cournapeau <cournape at gmail.com> wrote:
>> On Mon, Oct 17, 2011 at 2:22 PM, Michael Foord <fuzzyman at gmail.com> wrote:
>>
>>>
>>> Travis' post seems to suggest that it is the responsibility of the *pypy*
>>> dev team to do the work necessary to integrate the numpy refactor (initially
>>> sponsored by Microsoft). That refactoring (smaller numpy core) seems like a
>>> great way forward for numpy - particularly if *it* wants to play well with
>>> multiple implementations, but it is unreasonable to expect the pypy team to
>>> pick that up!
>>
>> I am pretty sure Travis did not intend to suggest that (I did not
>> understand that from his wording, but maybe that's because we have
>> discussed that topic in person several times already).
>>
>> There are a lot of reasons to do that refactor that have nothing to do
>> with pypy, so the idea is more: let's talk about what pypy would need
>> to make this refactor beneficial for pypy *as well*. I (and others)
>> have advocated using more cython inside numpy and scipy. We could
>> share resources to do that.
>
> I think alex's question was whether the refactoring is going to be
> merged upstream or not (and what's the plan).

I don't know if the refactoring will be merged as is, but at least I
think the refactoring needs to happen, independently of pypy. There is
no denying that parts of numpy's code are crufty, some stuff not
clearly separated, etc...

> I don't think you understand our point.

I really do. I understand that pypy is a much better platform than
cpython for lazy evaluation and fast pure-python ufuncs. Nobody denies
that. To be even clearer: if the goal is to have some concept of array
which looks like numpy, then yes, using numpy's code is useless.

> Reusing the current numpy
> implementation is not giving us much *even* if it was all Cython and
> no C API.

This seems to be the source of the disagreement: I think reusing numpy
means that you are much more likely to be able to run existing
numpy-based scripts on top of pypy. So my question is whether the
disagreement is about the value of that, or whether the pypy community
generally thinks it can write a "numpypy" that is a drop-in
replacement for numpy on cpython without using the original numpy
code.

> So, you're saying that giving people the ability to run numpy code
> faster if they refrain from using scipy and matplotlib (for now) is
> producing the community split? How does it? My interpretation is that
> we want to give people powerful tools that can be used to achieve
> things not possible before - like not using cython but instead
> implementing it in python. I can imagine how someone might not get value
> from that, but how does that decrease the value?

It is not my place to question what anyone finds valuable; we all have
our different uses. But the split is obvious: you may have scientific
code which works on numpy+pypy and does not on numpy+python, and vice
versa.

> I think our priority right now is to provide a working numpy. Next
> point is to make it use SSE. Does that fit somehow with your plan?

I guess there is an ambiguity in the exact meaning of "working numpy".
Is it something that looks like numpy with cool features from pypy, or
something that can be used as a drop-in replacement for numpy (any
script using numpy will work with numpy+pypy)? If it is the former,
then again, I would agree that there is not much point in reusing
numpy's code. But then, I think calling it numpy is a bit confusing.

cheers,

David