Re: [Numpy-discussion] Asking proposal review/feedback for GSOC 15
Hi, Sorry for the somewhat late reply. I will respond to each of Ralf's suggestions in turn.
Regarding your schedule: - I would remove the parts related to benchmarks. There's no nice benchmark infrastructure in numpy itself at the moment (that's a separate GSoC idea), so the two times 1 week that you have are likely not enough to get something off the ground there.
- I think we can do a sample/demo benchmark based only on a library's speed over a basic set of data sets. Couldn't we? Instead of speed, it could be any other performance parameter; we can decide together.
- The "implement a flexible interface" part will need some discussion, probably it makes sense to first draft a document (call it a NEP - Numpy Enhancement Proposal) that lays out the options and makes a proposal.
To be realistic, I don't think I have enough time to complete an enhancement proposal. Maybe we can talk about it in the first half of April?
- I wouldn't put "investigate accuracy differences" at the end. What if you find out there that you've been working on something for the whole summer that's not accurate enough?
However, we can't examine possible accuracy differences without having seen the libraries' real performance (in my case, that means 'implementing an interface to libraries'). Isn't investigating possible libraries for numpy the fountainhead of this project? Integrating the chosen library should be possible with a small set of wrapper functions.
- The "researching possible options" I would do in the community bonding period - when the coding period starts you should have a fairly well-defined plan.
I agree with you on this point. After moving this to the community bonding period, I can add a milestone like 'integrating chosen library to numpy' for 2 weeks. And if we decide it would be better to remove the benchmark part, then I would probably use that time for the interface.
- 3 weeks for implementing the interface looks optimistic.
It was an estimated time; I asked for Julian's opinion about it and am waiting for his answer. You could be right, as I am not familiar with the codebase or the exact set of functions to be improved. Since I prepared my schedule to serve as a basis, I think it is understandable if something takes a bit longer or shorter than what is written in the schedule.
Cheers, Ralf
Your suggestions helped me think about the project better. Thank you, Ralf. I would appreciate it if you could share your opinion on my thoughts as well. My proposal is at https://gist.github.com/oguzhanunlu/1f8bf3ffc6ac5c420dd1 Cheers, Oguzhan
On 03/26/2015 12:58 PM, Oğuzhan Ünlü wrote:
Hi,
Sorry for the somewhat late reply. I will respond to each of Ralf's suggestions in turn.
Regarding your schedule: - I would remove the parts related to benchmarks. There's no nice benchmark infrastructure in numpy itself at the moment (that's a separate GSoC idea), so the two times 1 week that you have are likely not enough to get something off the ground there.
- I think we can do a sample/demo benchmark based only on a library's speed over a basic set of data sets. Couldn't we? Instead of speed, it could be any other performance parameter; we can decide together.
Creating benchmark and performance tracking tools should not be part of this project, but benchmarking is still important. You may have to research and learn how best to benchmark this low-level code and understand what influences its performance, and we need a good set of benchmarks so that we know in the end what we have gained from this project. I think the time allocation for this is good.
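As an illustration of the kind of small, ad hoc benchmark discussed here, the following is a minimal sketch using only Python's standard library. It is not taken from the proposal: `exp_candidate` is a hypothetical stand-in for whichever library routine would actually be evaluated, and the input sizes are arbitrary assumptions. It times a baseline and a candidate on inputs of several sizes and keeps the best of several repeats, since the minimum is usually the least noisy figure.

```python
import math
import timeit


def exp_reference(values):
    """Baseline: the standard library's exp applied element by element."""
    return [math.exp(v) for v in values]


def exp_candidate(values):
    """Hypothetical stand-in for a candidate library routine (assumption)."""
    return [math.e ** v for v in values]


def best_time(func, values, repeat=3, number=10):
    """Return the best per-call time in seconds over several repeats."""
    timings = timeit.repeat(lambda: func(values), repeat=repeat, number=number)
    return min(timings) / number


def run_benchmark(sizes=(10, 1000, 10000)):
    """Time both implementations on inputs of different sizes."""
    results = {}
    for n in sizes:
        # Evenly spaced inputs in [0, 1); real benchmarks would vary the range too.
        values = [i / n for i in range(n)]
        results[n] = {
            "reference": best_time(exp_reference, values),
            "candidate": best_time(exp_candidate, values),
        }
    return results


if __name__ == "__main__":
    for n, times in run_benchmark().items():
        print(n, times)
```

Varying the input size matters because per-call overhead tends to dominate for small arrays while throughput dominates for large ones, so a single size can give a misleading picture.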
- The "implement a flexible interface" part will need some discussion, probably it makes sense to first draft a document (call it a NEP - Numpy Enhancement Proposal) that lays out the options and makes a proposal.
To be realistic, I don't think I have enough time to complete an enhancement proposal. Maybe we can talk about it in the first half of April?
I think he means that writing the NEP should be part of the project; you don't need to have a fleshed-out one ready now. Though if you already have some ideas about what the interface might look like, these should go into your proposal.
- I wouldn't put "investigate accuracy differences" at the end. What if you find out there that you've been working on something for the whole summer that's not accurate enough?
However, we can't examine possible accuracy differences without having seen the libraries' real performance (in my case, that means 'implementing an interface to libraries'). Isn't investigating possible libraries for numpy the fountainhead of this project? Integrating the chosen library should be possible with a small set of wrapper functions.
The accuracy of the libraries can be investigated prior to their integration into numpy, and it should be done early to rule out or de-prioritize bad options. Documenting the trade-offs between performance and accuracy is one of the most important tasks. This also involves researching what kinds of inputs to functions may be numerically problematic, which, depending on your prior numerics knowledge, may take some time and should be accounted for.
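To illustrate how accuracy can be checked before any integration work, here is a minimal sketch that measures the error of a candidate function against a reference in units in the last place (ULPs). It is an assumption-laden example, not the project's method: `exp_candidate` is a hypothetical stand-in for a library routine, `math.exp` is treated as the reference even though it is not exact itself, the input list is only a small sample of potentially problematic values, and `math.ulp` requires Python 3.9 or later.

```python
import math


def exp_candidate(x):
    """Hypothetical stand-in for a candidate library routine (assumption)."""
    return math.e ** x


def ulp_error(approx, reference):
    """Distance between approx and reference in units of the reference's ULP."""
    if approx == reference:
        return 0.0
    return abs(approx - reference) / math.ulp(reference)


def max_ulp_error(candidate, reference, inputs):
    """Return (worst error in ULPs, input that produced it)."""
    worst, worst_x = 0.0, None
    for x in inputs:
        err = ulp_error(candidate(x), reference(x))
        if err > worst:
            worst, worst_x = err, x
    return worst, worst_x


# Inputs chosen to probe ordinary values, tiny arguments, and large magnitudes.
test_inputs = [0.0, 1e-10, -1e-10, 0.5, 1.0, -1.0, 10.0, -10.0, 700.0, -700.0]

if __name__ == "__main__":
    worst, worst_x = max_ulp_error(exp_candidate, math.exp, test_inputs)
    print("worst error (ULPs):", worst, "at input:", worst_x)
```

Reporting the worst case together with the input that caused it makes it easier to document the performance and accuracy trade-off per function and to spot input ranges that deserve closer study.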
- The "researching possible options" I would do in the community bonding period - when the coding period starts you should have a fairly well-defined plan.
I agree with you on this point. After moving this to the community bonding period, I can add a milestone like 'integrating chosen library to numpy' for 2 weeks. And if we decide it would be better to remove the benchmark part, then I would probably use that time for the interface.
- 3 weeks for implementing the interface looks optimistic.
It was an estimated time; I asked for Julian's opinion about it and am waiting for his answer. You could be right, as I am not familiar with the codebase or the exact set of functions to be improved. Since I prepared my schedule to serve as a basis, I think it is understandable if something takes a bit longer or shorter than what is written in the schedule.
I am pretty bad at estimating times, but I think the implementation of the interfaces can be done in three weeks if you are confident enough in your C coding abilities and are somewhat experienced in maneuvering foreign code bases.
participants (2)
- Julian Taylor
- Oğuzhan Ünlü