[Numpy-discussion] My GSoC Proposal to Implement a Subset of NumPy for PyPy

Dag Sverre Seljebotn dagss at student.matnat.uio.no
Sat Apr 17 03:15:08 EDT 2010


Dag Sverre Seljebotn wrote:
> Dan Roberts wrote:
>> Hello NumPy Users,
>>     Hi everybody, my name is Dan Roberts, and my Google Summer of Code 
>> proposal was categorized under NumPy rather than PyPy, so it will end up 
>> being reviewed by mentors for the NumPy project.  I'd like to take this 
>> chance to introduce myself and my proposal.
>>     I hadn't prepared for review by the NumPy mentors, but this can make 
>> my proposal stronger than before.  With a bit of help from all of you, I 
>> can dedicate my summer to creating more useful code than I would have 
>> previously. I realize that from the perspective of NumPy, my proposal 
>> might seem lacking, so I'd like to also invite the scrutiny of all of 
>> the readers of this list.
>>     Why should we bother reimplementing anything?  PyPy, for those who 
>> are unfamiliar, has the ability to Just-in-Time compile itself and 
>> programs that it's running.  One of the major advantages of this is that 
>> code operating on NumPy arrays could potentially be written in 
>> pure Python, with normal looping constructs, and be nearly as fast as a 
>> ufunc painstakingly crafted in C.  I'd love to see as much Python and as 
>> little C as possible, and I'm sure I'm not alone in that wish.
>>     A short introduction: I've been coding in Python for the past few 
>> years, and have increasingly become interested in speeding up what has 
>> become my favorite language. To that end I've become interested in both 
>> the PyPy project and the NumPy projects. I've spent a fair amount of 
>> time frustrating the PyPy developers with silly questions, written a bit 
>> of code for them, and now my GSoC proposal involves both them and 
>> NumPy.
>>     Finally, I'd like to ask all of you: what features are most 
>> important to you? It's not practical, wise, or even possible for me to 
>> reimplement more than a small portion of NumPy, but if I can address the 
>> most important parts, maybe I can make this project useful enough for 
>> some of you to use, and close enough for the rest of you that I can drum 
>> up some support for more development in the future.
>>     My proposal lives at http://codespeak.net/~dan/gsoc/micronumpy.html. 
>> Thanks for making it this far through my long-winded introduction!  I 
>> welcome all constructive criticism and thoughts.
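
For readers unfamiliar with the idea in the quoted proposal, here is a
minimal sketch of what "pure Python, with normal looping constructs" could
look like. The function name scaled_add and the use of plain lists are
assumptions made for illustration; they are not taken from the proposal,
and nothing here measures how fast PyPy would actually run it.

```python
def scaled_add(a, b, scale):
    """Return a[i] + scale * b[i] for each i, using an explicit loop.

    Hypothetical example: in NumPy this would be the vectorized
    expression a + scale * b; here it is written as the kind of plain
    loop a JIT compiler would have to optimize.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = [0.0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] + scale * b[i]
    return out


result = scaled_add([1.0, 2.0, 3.0], [10.0, 20.0, 30.0], 0.5)
print(result)  # [6.0, 12.0, 18.0]
```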

WHOOPS!!

Looks like I'm making a fool of myself. I foolishly based my comments on 
an earlier reading of your proposal (I'm a PSF mentor), and didn't see 
(soon enough) that you had updated the proposal to answer just this 
question.

So please just ignore everything I've written :-)

Dag Sverre


> 
> I'm curious about what role natively compiled code in C would play in 
> your project. Would you use BLAS, or would you reimplement e.g. matrix 
> multiplication in RPython and hope that PyPy optimizes it? (Hint: It 
> stands no chance of even coming close. A BLAS implementation is easily 
> 4-5 times faster (or more) than simple hand-written C code for matrix 
> multiplication, which I assume is the lower bound for any RPython code 
> it is realistic to write. BLAS implementations use CPU-specific, 
> cache-aware algorithms which you really can't hope to implement 
> yourself.)
> 
> Eventually, for this to be at all useful for the NumPy crowd, one has to 
> make available eigenvalue finders, FFTs, and so on as well. This is a 
> massive amount of work unless one is willing to connect to existing C 
> implementations.
> 
> So even if all of this doesn't happen in the GSoC project, it would be 
> useful to know whether it is possible long-term to connect with BLAS and 
> LAPACK, or whether you intend everything to be done in RPython.
> 
> In my opinion, the *primary* reason Python is used for scientific 
> programming rather than some other language is how easy it is to connect 
> with C, C++ and Fortran code in CPython. That's something to keep in mind.
> 
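
To make the "simple hand-written" versus "cache-aware" distinction
concrete, here is a rough sketch in plain Python. matmul_naive is the
textbook triple loop; matmul_blocked walks the same data in small tiles,
which is the basic idea behind cache-aware kernels. Both function names
and the block parameter are made up for this illustration. This is a toy:
it is not how any particular BLAS is implemented, and it says nothing
about actual timings.

```python
def matmul_naive(a, b):
    """Textbook triple-loop product of two matrices given as lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0.0
            for k in range(m):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def matmul_blocked(a, b, block=2):
    """Same product, computed tile by tile.

    Working on block x block tiles keeps the operands of the inner loops
    close together in memory; real libraries tune the tile sizes to the
    CPU's caches. The default block=2 is an arbitrary illustrative value.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i0 in range(0, n, block):
        for k0 in range(0, m, block):
            for j0 in range(0, p, block):
                for i in range(i0, min(i0 + block, n)):
                    row_c = c[i]
                    for k in range(k0, min(k0 + block, m)):
                        a_ik = a[i][k]
                        row_b = b[k]
                        for j in range(j0, min(j0 + block, p)):
                            row_c[j] += a_ik * row_b[j]
    return c


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
print(matmul_naive(a, b))    # [[58.0, 64.0], [139.0, 154.0]]
print(matmul_blocked(a, b))  # [[58.0, 64.0], [139.0, 154.0]]
```

Both functions return the same result; only the order in which the
elements are visited differs.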


-- 
Dag Sverre


