[Numpy-discussion] My GSoC Proposal to Implement a Subset of NumPy for PyPy
Dag Sverre Seljebotn
dagss at student.matnat.uio.no
Sat Apr 17 03:15:08 EDT 2010
Dag Sverre Seljebotn wrote:
> Dan Roberts wrote:
>> Hello NumPy Users,
>> Hi everybody, my name is Dan Roberts, and my Google Summer of Code
>> proposal was categorized under NumPy rather than PyPy, so it will end up
>> being reviewed by mentors for the NumPy project. I'd like to take this
>> chance to introduce myself and my proposal.
>> I hadn't prepared for review by the NumPy mentors, but this can make
>> my proposal stronger than before. With a bit of help from all of you, I
>> can dedicate my summer to creating more useful code than I would have
>> previously. I realize that from the perspective of NumPy, my proposal
>> might seem lacking, so I'd like to also invite the scrutiny of all of
>> the readers of this list.
>> Why should we bother reimplementing anything? PyPy, for those who
>> are unfamiliar, has the ability to Just-in-Time compile itself and
>> programs that it's running. One of the major advantages of this is that
>> code operating on NumPy arrays could potentially be written in
>> pure-python, with normal looping constructs, and be nearly as fast as a
>> ufunc painstakingly crafted in C. I'd love to see as much Python and as
>> little C as possible, and I'm sure I'm not alone in that wish.
>> A short introduction: I've been coding in Python for the past few
>> years, and have increasingly become interested in speeding up what has
>> become my favorite language. To that end I've become interested in both
>> the PyPy project and the NumPy projects. I've spent a fair amount of
>> time frustrating the PyPy developers with silly questions, written a bit
>> of code for them, and now my GSoC proposal involves both them and
>> NumPy.
>> Finally, I'd like to ask all of you: what features are most
>> important to you? It's not practical, wise, or even possible for me to
>> reimplement more than a small portion of NumPy, but if I can address the
>> most important parts, maybe I can make this project useful enough for
>> some of you to use, and close enough for the rest of you that I can drum
>> up some support for more development in the future.
>> My proposal lives at http://codespeak.net/~dan/gsoc/micronumpy.html
>> Thanks for making it this far through my long-winded introduction! I
>> welcome all constructive criticism and thoughts.
WHOOPS!!
Looks like I'm making a fool of myself. I foolishly based my comments on
an earlier reading of your proposal (I'm a PSF mentor), and didn't see
(soon enough) that you had updated the proposal to answer just this
question.
So please just ignore everything I've written :-)
Dag Sverre
>
> I'm curious about what role natively compiled code in C would play in
> your project. Would you use BLAS, or would you reimplement e.g. matrix
> multiplication in RPython and hope that PyPy optimizes it? (Hint: It
> stands no chance of even coming close. A BLAS implementation is easily
> 4-5 times faster (or more) than simple hand-written C code for matrix
> multiplication, which I assume is the lower bound for any RPython code
> it is realistic to write. They use CPU-specific cache-aware algorithms
> which you really can't hope to implement yourself.)
>
> Eventually, for this to be at all useful for the NumPy crowd, one has to
> make available eigenvalue finders, FFTs, and so on as well. This is a
> massive amount of work unless one is willing to connect to existing C
> implementations.
>
> So even if all of this doesn't happen in the GSoC project, it would be
> useful to know whether it is possible long-term to connect with BLAS and
> LAPACK, or whether you intend everything to be done in RPython.
>
> In my opinion, the *primary* reason Python is used for scientific
> programming rather than some other language is how easy it is to connect
> with C, C++ and Fortran code in CPython. That's something to keep in mind.
>
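To make the quoted comparison concrete, here is a rough sketch of the two
styles of matrix multiplication being contrasted: the simple hand-written
triple loop, and a tiled loop of the kind that cache-aware libraries build
on. The function names and the block size are assumptions chosen for
illustration; nothing here is taken from the proposal or from any actual
BLAS. In pure Python the tiling will not show a speedup, so this only
demonstrates the loop structure, not the performance claim.

```python
# Illustrative sketch only: the names and the block size are assumptions,
# not taken from the proposal or from any BLAS implementation.

def matmul_naive(a, b):
    """Textbook triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0.0
            for k in range(m):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def matmul_blocked(a, b, block=2):
    """Same product, accumulated tile by tile (loop tiling)."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # Outer three loops walk over tiles; inner three loops work inside one
    # tile, so the same small pieces of a, b and c are reused while hot.
    for i0 in range(0, n, block):
        for k0 in range(0, m, block):
            for j0 in range(0, p, block):
                for i in range(i0, min(i0 + block, n)):
                    row_c = c[i]
                    for k in range(k0, min(k0 + block, m)):
                        a_ik = a[i][k]
                        row_b = b[k]
                        for j in range(j0, min(j0 + block, p)):
                            row_c[j] += a_ik * row_b[j]
    return c


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
print(matmul_naive(a, b))    # [[58.0, 64.0], [139.0, 154.0]]
print(matmul_blocked(a, b))  # same result, different loop order
```

A tuned library would additionally pick the tile size per CPU and use
vector instructions, which is the part that is hard to reproduce by hand.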
--
Dag Sverre