[Numpy-discussion] Asking proposal review/feedback for GSOC 15

Julian Taylor jtaylor.debian at googlemail.com
Thu Mar 26 14:27:33 EDT 2015

On 03/26/2015 12:58 PM, Oğuzhan Ünlü wrote:
> Hi,
> Sorry for a bit late reply. I will express my thoughts for Ralf's
> suggestions, respectively.
>     Regarding your schedule:
>     - I would remove the parts related to benchmarks. There's no nice
>     benchmark
>     infrastructure in numpy itself at the moment (that's a separate GSoC
>     idea),
>     so the two times 1 week that you have are likely not enough to get
>     something off the ground there.
> - I think we can do a sample/demo benchmark based only on a library's
> speed over some basic set of data sets. Couldn't we? Instead
> of speed, it could be any other performance parameter; we can decide
> together.

Creating benchmark and performance tracking tools should not be part of
this project, but benchmarking is still important.

You may have to research how best to benchmark this low-level
code and understand what influences its performance. We also need a good
set of benchmarks so that we know in the end what we have gained from this project.
I think the time allocation for this is good.
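
As a rough illustration of the kind of benchmark meant here, the sketch below times a function over several input sizes and value ranges and keeps the best of several repeats. It is only a minimal sketch using the Python standard library: `exp_all`, `make_inputs` and `bench` are hypothetical names, and `exp_all` is merely a stand-in for whatever vectorized routine would actually be under test (the same pattern would apply to a call such as `np.exp(arr)`).

```python
import math
import timeit


def bench(func, data, repeat=5, number=20):
    """Return the best observed time per element, in seconds."""
    timer = timeit.Timer(lambda: func(data))
    # Taking the minimum over repeats reduces noise from other processes.
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / (number * len(data))


def exp_all(values):
    # Hypothetical stand-in for the routine being benchmarked.
    return [math.exp(v) for v in values]


def make_inputs(n, lo, hi):
    # Evenly spaced values in [lo, hi]. The value range is varied because
    # math routines may take different code paths for different magnitudes.
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


results = {}
for n in (100, 10_000):  # small and larger inputs
    for lo, hi in ((-1.0, 1.0), (-700.0, 700.0)):
        results[(n, lo, hi)] = bench(exp_all, make_inputs(n, lo, hi))

for key, per_elem in sorted(results.items()):
    print(key, f"{per_elem * 1e9:.1f} ns/element")
```

Reporting time per element for several sizes and ranges, rather than a single number, makes it easier to see which factors influence the result.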

>     - The "implement a flexible interface" part will need some discussion,
>     probably it makes sense to first draft a document (call it a NEP - Numpy
>     Enhancement Proposal) that lays out the options and makes a proposal.
> To be realistic, I don't think I have enough time to complete an
> enhancement proposal. Maybe we can talk about it in the first half of
> April?  

I think he means that writing the NEP should be part of the project;
you don't need to have a fleshed-out one ready now. Though if you
already have some ideas about what the interface might look like, they should
go into your proposal.

>     - I wouldn't put "investigate accuracy differences" at the end. What
>     if you
>     find out there that you've been working on something for the whole
>     summer
>     that's not accurate enough?
> However, we can't examine possible accuracy differences without having
> seen their real performance (in my case it is 'implementing an interface
> to libraries'). Isn't investigating possible libraries for numpy the
> fountainhead of this project? Integrating the chosen library should be
> possible with a small set of wrapper functions.

The accuracy of the libraries can be investigated prior to their
integration into numpy, and it should be done early to rule out or
de-prioritize bad options.
Documenting the trade-offs between performance and accuracy is one of
the most important tasks.

This also involves researching what kinds of function inputs may be
numerically problematic. Depending on your prior knowledge of numerics, this
may take some time and should be accounted for.
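
To make the accuracy point concrete, here is a minimal sketch of one way to measure the error of a candidate function in ULPs against a high-precision reference. It uses only the Python standard library (`math.ulp` needs Python 3.9 or later). The helper names `ulp_error`, `ref_expm1` and `naive_expm1` are hypothetical, and `exp(x) - 1` is chosen purely as a well-known example of a computation that loses accuracy for small inputs; it is not taken from any of the libraries under discussion.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50  # reference precision, far beyond double


def ulp_error(candidate, x, reference):
    """Error of candidate(x) in units of the last place of the true result."""
    ref = reference(Decimal(x))        # Decimal(float) is an exact conversion
    got = Decimal(candidate(x))
    ulp = Decimal(math.ulp(float(ref)))
    return float(abs(got - ref) / ulp)


def ref_expm1(d):
    # High-precision reference for exp(x) - 1.
    return d.exp() - 1


def naive_expm1(x):
    # Suffers from cancellation when x is close to zero.
    return math.exp(x) - 1.0


# Include tiny values on purpose: they are the problematic inputs here.
inputs = [1e-10, 1e-5, 0.5, 1.0, 10.0]
worst_naive = max(ulp_error(naive_expm1, x, ref_expm1) for x in inputs)
worst_libm = max(ulp_error(math.expm1, x, ref_expm1) for x in inputs)

print(f"naive exp(x)-1: worst error {worst_naive:.3g} ULP")
print(f"math.expm1:     worst error {worst_libm:.3g} ULP")
```

The same harness could be pointed at any candidate library function, with the input list extended to cover the ranges suspected to be troublesome.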

>     - The "researching possible options" I would do in the community bonding
>     period - when the coding period starts you should have a fairly
>     well-defined plan.
> I agree with you on this point. After moving this to the community bonding
> period, I can put in a milestone like 'integrating the chosen library into numpy'
> for 2 weeks. And if we decide it would be better to remove the benchmark part,
> then I would probably use that time for the interface.
>     - 3 weeks for implementing the interface looks optimistic.
> It was an estimate; I asked Julian's opinion about it and am waiting for
> his answer. You could be right, I am not familiar with the codebase or the
> exact set of functions to be improved. Since I prepared my schedule to
> serve as a basis, I think it is understandable if something takes a bit
> longer or shorter than what is written in the schedule.

I am pretty bad at estimating times, but I think the implementation of
the interfaces can be done in three weeks if you are confident enough in
your C coding abilities and are somewhat experienced in navigating foreign
code bases.
