On Fri, Mar 27, 2015 at 3:04 PM, Jaime Fernández del Río <jaime.frio@gmail.com> wrote:
On Fri, Mar 27, 2015 at 2:27 AM, Ralf Gommers <ralf.gommers@gmail.com> wrote:


On Thu, Mar 26, 2015 at 8:40 PM, AMAN singh <ug201310004@iitj.ac.in> wrote:
Thank you everyone for your insightful comments.
I have tried to incorporate your suggestions in the proposal. Kindly have a look at the new proposal here and suggest further improvements.

Hi Aman, this looks quite good to me. For the timeline, I think it will take longer to get the iterators right and less time to port the last functions at the end - once you get the hang of it, you'll be able to do the last ones quickly, I expect.

That sounds about right. I think that breaking the schedule down to which function will be ported in which week is little more than wishful thinking, and that keeping things at the file level would make more sense. But I think your proposal is getting there.

One idea that just crossed my mind: checking your implementation of the iterators and other objects in support.c for correctness and performance is going to be an important part of the project. Perhaps it would be a good idea to identify, either now or very early in the project, a few current ndimage top-level functions, one exercising each of those objects, if possible without interaction with the others, and build a sequence that could look something like this (I am making this up in a hurry, so don't take the proposed function names too seriously, although they may actually make sense):

Port NI_PointIterator -> Port NI_CenterOfMass, benchmark and test
Port NI_LineBuffer -> Port NI_UniformFilter1D, benchmark and test
...

This would very likely extend the time you will need to implement all the items in support.c. But by the time you were finished with that, we would both have high confidence that things were going well, plus a "Rosetta Stone" that should make it a breeze to finish the job, both for you and anyone else. We would also have an intermediate milestone (everything in support.c ported, plus a working example of each object being used, with correctness and performance verified), which would be a worthy deliverable on its own: if we are terribly miscalculating task duration, and everything slips and is delayed, getting there could still be considered a success, since it would make finishing the job much, much simpler for others.
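
To make the benchmark-and-test step a bit more concrete, a rough sketch of what it could look like for the first pairing (just an illustration, not code from the proposal; the _ni_support_cython module and its center_of_mass function are made-up placeholders for the ported code):

    import numpy as np
    import timeit
    from numpy.testing import assert_allclose
    from scipy import ndimage
    # from scipy.ndimage import _ni_support_cython  # hypothetical ported module

    x = np.random.rand(512, 512)

    # Correctness: the ported function must reproduce the C implementation.
    expected = ndimage.center_of_mass(x)
    # assert_allclose(_ni_support_cython.center_of_mass(x), expected)

    # Performance: compare wall-clock time of both implementations.
    t_c = timeit.timeit(lambda: ndimage.center_of_mass(x), number=20)
    # t_cy = timeit.timeit(lambda: _ni_support_cython.center_of_mass(x), number=20)
    print("current C implementation: %.4f s" % t_c)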

That sounds like an excellent idea to me.
 
One little concern of mine, and these questions go not really to Aman but to the scipy devs: the Cython docs on fused types have a big fat warning at the top about support still being experimental. Also, this is going to bump the minimum Cython version requirement to a very recent one. Are we OK with this?

We're using fused types in more places in Scipy now. They've been around for a while, and apart from the fact that you have to be careful about using multiple fused types in a single function (which makes the generated code and binary size explode), I don't remember many problems with them. Maybe worth asking the Cython devs why they haven't removed that warning yet?
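
For reference, a generic sketch of the pattern (not ndimage code): one fused type generates one specialization per listed dtype, while combining several distinct fused types in one signature multiplies the number of generated functions, which is where the code and binary size blow-up comes from.

    ctypedef fused input_t:
        float
        double

    ctypedef fused output_t:
        float
        double

    # One fused type -> 2 specializations; two independent fused types in the
    # same signature -> 2 x 2 = 4 generated functions, and it grows from there.
    def copy_cast(input_t[:] src, output_t[:] dst):
        cdef Py_ssize_t i
        for i in range(src.shape[0]):
            dst[i] = <output_t> src[i]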
 
Similarly, you suggest using Cython's prange to parallelize computations. I haven't seen OpenMP used anywhere in NumPy or SciPy, and have the feeling that parallel implementations are left out on purpose. Am I right, or would parallelizing where possible be OK?
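
(For context, the kind of prange loop being suggested would look roughly like the generic sketch below - not actual ndimage code, and it requires building the extension with OpenMP flags, e.g. -fopenmp with gcc:)

    from cython.parallel import prange

    def row_sums(double[:, :] img, double[:] out):
        # Each row is summed by whichever thread picks up that iteration;
        # writes go to distinct out[i] slots, so no reduction is needed.
        cdef Py_ssize_t i, j
        for i in prange(img.shape[0], nogil=True):
            out[i] = 0
            for j in range(img.shape[1]):
                out[i] += img[i, j]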

Yep, that has been on purpose so far. That could change of course, but it would need significant discussion and an overall strategy first. OpenMP proposals for individual functions have always been rejected before, so it would be better to remove it from this GSoC proposal.

Ralf