On Mar 1, 2018 06:20, "Lars G." <lagru@mailbox.org> wrote:
On 28.02.2018 16:04, Eric Larson wrote:
> For GSoC we need to ensure (at least) that the project fits 1) the needs
> of SciPy, 2) the GSoC program scope / timeline, 3) possible mentors, and
> 4) your goals. My sense is that a proposal based on code Cythonizing
> (with proper benchmark testing and regression protection) would be good
> for SciPy maintainability and could be crafted to have a reasonable
> scope. In terms of mentors, I feel comfortable mentoring changes to the
> `signal` module but not `ndimage`, so we'd need to find a qualified
> primary volunteer mentor if that ends up being the primary proposal
> direction.
Actually, considering that my background lies in electrical engineering
I'd be more than happy to focus on the `signal` module. And from the
other response it seems like cythonizing `ndimage` wouldn't be a good idea.

> Another thing to keep in mind is that the list of GSoC ideas is not
> meant to be exhaustive. So if you have some other ideas for SciPy
> functionality, feel free to throw those out for discussion as well. In
> my experience, genuine intrinsic enthusiasm for a project -- finding
> something you'd enjoy working on in your free time even if you weren't
> getting paid to do so -- can help make for successful GSoC applications
> and experiences.

So there would be enough candidates for Cythonization in `scipy.signal`
to fit the scope of GSoC? I myself can only guess where this would be
wanted and useful.

It doesn't have to be Cythonizing either. I'd be happy to add missing
functionality to the `signal` module or rework stuff that needs it.
The content in
https://docs.scipy.org/doc/scipy-1.0.0/reference/roadmap.html#signal
doesn't seem to be a good fit for a GSoC project. The only thing I can
think of right now is to extend the API for adaptive filters and add more of them:
https://en.wikipedia.org/wiki/Adaptive_filter
Again, I'm not sure this is wanted or if I'm judging the need correctly.
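For concreteness, the simplest adaptive filters are only a few lines of NumPy. Here is an illustrative LMS (least-mean-squares) sketch for system identification; the function name and parameters are mine, not a proposed API:

```python
import numpy as np

def lms_filter(x, d, n_taps=8, mu=0.05):
    """Illustrative LMS adaptive FIR filter.

    x  -- input signal
    d  -- desired signal
    mu -- step size (must be small enough for stability)
    Returns the filter output and the final tap weights.
    """
    w = np.zeros(n_taps)
    y = np.zeros(len(x))
    for n in range(n_taps - 1, len(x)):
        u = x[n - n_taps + 1:n + 1][::-1]  # most recent sample first
        y[n] = w @ u
        e = d[n] - y[n]                    # error drives the weight update
        w += 2 * mu * e * u
    return y, w

# identify an unknown FIR system from input/output data
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
h = np.array([0.5, -0.3, 0.2, 0.1])        # the "unknown" system
d = np.convolve(x, h)[:len(x)]             # its (noise-free) output
y, w = lms_filter(x, d, n_taps=4, mu=0.02)
```

In the noise-free case the weights converge to the true system taps, so one can check `w` against `h` directly.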

If you guys have any ideas or wishes in that direction I'd be happy to
hear them.

Best regards,
Lars

The first item listed in the roadmap, convolution, is a much more complicated issue than that description makes out.  There are a few distinct problems, with some overlap behind the scenes:

   1. As discussed, there are several different implementations using different algorithms, each of which works better in different scenarios.  Ideally there would be one "master" function that picks the best algorithm for a given set of parameters.  The choice depends on the number of dimensions to be convolved over, the size of the first signal, and the size of the second signal.  Changing any one of these can change which implementation is optimal, or even usable.  For vectors alone, a different algorithm is preferable when one vector is short, when both vectors are long but one is much longer, and when both vectors are long and of similar length.
   2. We don't have the best algorithms implemented for all of these scenarios.  For example, the "both vectors are long but one is much longer" scenario is best served by the overlap-add algorithm, which scipy doesn't have.  Similarly, there is an fft-based version of correlation, equivalent to fftconvolve, that isn't implemented; 2D and n-d versions of fft convolution and correlation aren't implemented either, etc.
   3. The implementations only work over exactly the number of dimensions they apply to: the 1D implementations can only take vectors, the 2D implementations can only take 2D arrays, and so on.  There is no way to, say, apply a filter along the second dimension of a 3D signal.  In order to implement the "master" function, at least one implementation (and ideally all of them) should be able to be applied across additional dimensions.
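For what it's worth, scipy.signal already exposes a small piece of this dispatch logic: choose_conv_method (and convolve's method='auto') picks between direct and FFT convolution based on the input sizes, though only between those two algorithms:

```python
import numpy as np
from scipy.signal import choose_conv_method, convolve

rng = np.random.default_rng(0)
long_sig = rng.standard_normal(100_000)
short_kernel = rng.standard_normal(16)
long_kernel = rng.standard_normal(50_000)

# returns 'direct' or 'fft' depending on scipy's fitted cost model;
# a short kernel vs. a long one against the same signal can flip the choice
method_short = choose_conv_method(long_sig, short_kernel)
method_long = choose_conv_method(long_sig, long_kernel)

# convolve(..., method='auto') applies the same heuristic internally
y = convolve(long_sig, short_kernel, method='auto')
```

A full "master" function would extend this kind of dispatch to cover overlap-add and the multidimensional cases as well.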

And there is overlap between these points.  For example, the overlap-add method from point 2 would most likely be implemented in part by applying a convolution across additional dimensions, as described in point 3.
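To make the overlap-add point concrete, the method is only a few lines on top of fftconvolve: FFT-convolve fixed-size blocks of the long signal with the short filter and sum the overlapping tails.  A minimal sketch (the function name and block size are mine):

```python
import numpy as np
from scipy.signal import fftconvolve

def overlap_add(x, h, block=4096):
    """Convolve a long signal x with a short filter h, block by block."""
    out = np.zeros(len(x) + len(h) - 1)
    for start in range(0, len(x), block):
        # each block's full convolution is block + len(h) - 1 samples long,
        # so consecutive segments overlap by len(h) - 1 samples
        seg = fftconvolve(x[start:start + block], h)
        out[start:start + len(seg)] += seg  # tails overlap and add
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal(50_000)
h = rng.standard_normal(64)
y = overlap_add(x, h)
```

By linearity this matches the full convolution exactly, while each FFT stays small regardless of how long x is.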

Many of these issues apply elsewhere in scipy.signal.  For example, stft/spectrogram uses a slow, naive implementation, and a lot of the functions don't support being applied across multidimensional arrays (for example, to create a filter bank).
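As a stopgap today, a 1D routine can be looped over one axis of an n-d array with np.apply_along_axis, which is exactly the kind of slow Python-level loop a native axis argument would replace.  The array shapes here are just an illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# e.g. 3 trials x 1000 time samples x 5 channels (hypothetical shapes)
signals = rng.standard_normal((3, 1000, 5))
taps = rng.standard_normal(31)

# convolve along the time axis (axis 1); apply_along_axis loops in
# Python under the hood, so this is a workaround, not a fast path
filtered = np.apply_along_axis(np.convolve, 1, signals, taps, mode='same')
```

Each 1D slice comes out identical to calling np.convolve on it directly; the cost is the per-slice Python overhead.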