Dear SciPy devs,

I'm currently thinking about an application for this year's GSoC as well. As there already seems to be a large interest in the rotation formalism I'm trying to find another area that matches my interest and skill.

I've dug up this proposal in scikit-image from GSoC 2015
https://github.com/scikit-image/scikit-image/wiki/GSoC-2015#rewriting-scipyn...
and judging by the state of scipy/ndimage/src/ nobody has worked on this proposal yet (feel free to correct me). Alternatively I could imagine something similar for other sub-packages, e.g. scipy/signal which features many source files in C as well.

So basically if there is an interest I could try to port C / Python code to Cython. What I would like to know:

- Is there an interest? ;)
- Is the original proposal in scikit-image still unfinished and are the potential mentors still interested in mentoring?
- If there is a general interest to cythonize C or Python code during a GSoC project, which parts / sub-packages of SciPy would you prioritize?

As for my current involvement with SciPy:

- I've already added a small function written in Cython https://github.com/scipy/scipy/pull/8350
- as part of a larger PR extending the signal module https://github.com/scipy/scipy/pull/8264 which will possibly be merged this week.
- I already cythonized slow parts of the above PR and plan to add these with new PRs after #8264 is merged.

If this receives positive feedback I'd be happy to draft a more complete proposal / application based on the discussion around this.

Best regards,
Lars
> - Is there an interest? ;)

I'd be happy to co-mentor on any relevant `scipy.signal` bits at least, as I've done some Cython work there.

> - Is the original proposal in scikit-image still unfinished and are the
> potential mentors still interested in mentoring?
> - If there is a general interest to cythonize C or Python code during a
> GSoC project, which parts / sub-packages of SciPy would you prioritize?

I don't know the answer to either of these questions -- I'll let others respond.

> If this receives positive feedback I'd be happy to draft a more complete
> proposal / application based on the discussion around this.

For GSoC we need to ensure (at least) that the project fits 1) the needs of SciPy, 2) the GSoC program scope / timeline, 3) possible mentors, and 4) your goals. My sense is that a proposal based on code Cythonizing (with proper benchmark testing and regression protection) would be good for SciPy maintainability and could be crafted to have a reasonable scope.

In terms of mentors, I feel comfortable mentoring changes to the `signal` module but not `ndimage`, so we'd need to find a qualified primary volunteer mentor if that ends up being the primary proposal direction.

Another thing to keep in mind is that the list of GSoC ideas is not meant to be exhaustive. So if you have some other ideas for SciPy functionality, feel free to throw those out for discussion as well. In my experience, genuine intrinsic enthusiasm for a project -- finding something you'd enjoy working on in your free time even if you weren't getting paid to do so -- can help make for successful GSoC applications and experiences.

Cheers,
Eric
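As a rough illustration of what "benchmark testing and regression protection" could look like for a port, here is a minimal sketch. The two `moving_sum_*` functions are hypothetical stand-ins and not SciPy code: one plays the role of the existing implementation, the other the rewritten one. The only point is that the old code stays around as a reference so the new code can be checked and timed against it.

```python
import timeit

import numpy as np


def moving_sum_reference(x, width):
    """Slow but obviously correct version (stand-in for the existing code)."""
    return np.array([x[i:i + width].sum() for i in range(len(x) - width + 1)])


def moving_sum_fast(x, width):
    """Faster candidate (stand-in for the ported implementation)."""
    c = np.concatenate(([0.0], np.cumsum(x)))
    return c[width:] - c[:-width]


rng = np.random.default_rng(0)
x = rng.standard_normal(2000)

# Regression protection: the new code must reproduce the reference output.
np.testing.assert_allclose(
    moving_sum_fast(x, 16), moving_sum_reference(x, 16), rtol=1e-10, atol=1e-10
)

# Benchmark: time both versions on the same input.
t_ref = timeit.timeit(lambda: moving_sum_reference(x, 16), number=5)
t_fast = timeit.timeit(lambda: moving_sum_fast(x, 16), number=5)
print(f"reference: {t_ref:.4f} s, candidate: {t_fast:.4f} s")
```

In a real port the reference would be the current C or Python routine, and the comparison would live in the test suite so that later changes cannot silently alter results.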
On 28.02.2018 16:04, Eric Larson wrote:
> For GSoC we need to ensure (at least) that the project fits 1) the needs of SciPy, 2) the GSoC program scope / timeline, 3) possible mentors, and 4) your goals. My sense is that a proposal based on code Cythonizing (with proper benchmark testing and regression protection) would be good for SciPy maintainability and could be crafted to have a reasonable scope. In terms of mentors, I feel comfortable mentoring changes to the `signal` module but not `ndimage`, so we'd need to find a qualified primary volunteer mentor if that ends up being the primary proposal direction.

Actually, considering that my background lies in electrical engineering I'd be more than happy to focus on the `signal` module. And from the other response it seems like cythonizing `ndimage` wouldn't be a good idea.

> Another thing to keep in mind is that the list of GSoC ideas is not meant to be exhaustive. So if you have some other ideas for SciPy functionality, feel free to throw those out for discussion as well. In my experience, genuine intrinsic enthusiasm for a project -- finding something you'd enjoy working on in your free time even if you weren't getting paid to do so -- can help make for successful GSoC applications and experiences.

So there would be enough candidates for Cythonization in `scipy.signal` to fit the scope of GSoC? I myself can only guess where this would be wanted and useful.

It doesn't have to be Cythonizing either. I'd be happy to add missing functionality to the `signal` module or rework stuff that needs it. The content in https://docs.scipy.org/doc/scipy-1.0.0/reference/roadmap.html#signal doesn't seem to be a good fit for a GSoC project.

The only thing I can think of right now is to extend the API for and add more adaptive filters: https://en.wikipedia.org/wiki/Adaptive_filter Again, I'm not sure this is wanted or if I'm judging the need correctly.

If you guys have any ideas or wishes in that direction I'd be happy to hear them.

Best regards,
Lars
Hey, Lars!

I don't want to rob other potential ideas or mentors, but you mentioned that you think that rotation formalism idea is already for someone else. It is absolutely not the case, nothing is settled, and if you are interested in this subject --- I'm interested to see your ideas or a proposal (and Eric Larson likely as well).

Best,
Nikolay

_______________________________________________
SciPy-Dev mailing list
SciPy-Dev@python.org
https://mail.python.org/mailman/listinfo/scipy-dev
On 01.03.2018 14:13, Nikolay Mayorov wrote:
> Hey, Lars!
>
> I don't want to rob other potential ideas or mentors, but you mentioned that you think that rotation formalism idea is already for someone else. It is absolutely not the case, nothing is settled, and if you are interested in this subject --- I'm interested to see your ideas or a proposal (and Eric Larson likely as well).
>
> Best, Nikolay
The topic does indeed sound interesting and from what it looks like you already have a pretty clear description of the scope, structure and goals. However I have never done any relevant programming in that area so I currently don't feel very confident that I'll be able to come up with a sensible API for that or make informed decisions. First, I'll see what comes of my first suggestions. In the meantime I will look through the linked references and see if I feel more confident afterwards. Best regards, Lars
On Mar 1, 2018 06:20, "Lars G." <lagru@mailbox.org> wrote:
<snip>
> It doesn't have to be Cythonizing either. I'd be happy to add missing functionality to the `signal` module or rework stuff that needs it. The content in https://docs.scipy.org/doc/scipy-1.0.0/reference/roadmap.html#signal doesn't seem to be a good fit for a GSoC project.
<snip>
The first issue listed in the roadmap, convolution, is a much more complicated issue than that description makes out. There are a few issues, some of which overlap behind the scenes:

1. As discussed, there are a bunch of different implementations that use different algorithms that work better in different scenarios. Ideally there would be one "master" function that would pick the best algorithm for a given set of parameters. This will depend on the number of dimensions to be convolved over, the size of the first signal to be convolved, and the size of the second signal to be convolved. Changing any one of these can change which implementation is optimal, or even useful. So, for vectors, it is better to use a different algorithm if one vector is short, if both vectors are long but one is much longer, and if both vectors are long and of similar length.
2. We don't have the best algorithms implemented for all of these scenarios. For example the "both vectors are long but one is much longer" scenario is best with the overlap-add algorithm, which scipy doesn't have. Similarly, there is an fft-based version of correlation equivalent to fftconvolve that isn't implemented, 2D and n-d versions of fft convolution and correlation that aren't implemented, etc.
3. The implementations only work over the number of dimensions they apply to. So the 1D implementations can only take vectors, the 2D implementations can only take 2D arrays, etc. There is no way to, say, apply a filter along the second dimension of a 3D signal. In order to implement the "master" function, at least one implementation (and ideally all implementations) should be able to be applied across additional dimensions.

And there is overlap between these. For example I mention the overlap-add method in point 2, but that would most likely be implemented in part by applying across dimensions as mentioned in point 3.

A lot of these issues apply elsewhere in scipy.signal. For example the stft/spectrogram uses a slow, naive implementation. A lot of the functions don't support applying across multidimensional arrays (for example to create a filter bank).
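To make the link between points 2 and 3 concrete, here is a minimal sketch of overlap-add for the "one long, one short vector" case. It is an illustration under stated assumptions, not an existing SciPy function: the name `overlap_add_convolve` and the `block_len` parameter are made up for this example, and only NumPy's FFT routines are used. The long signal is cut into blocks stored as rows of a 2-D array, and all blocks are transformed along the last axis in one call, which is the "applying across dimensions" step mentioned above.

```python
import numpy as np


def overlap_add_convolve(x, h, block_len=256):
    """Full linear convolution of a long 1-D signal `x` with a short kernel `h`.

    Hypothetical helper for illustration only; not part of SciPy.
    """
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    n_out = x.size + h.size - 1

    # Zero-pad x so it splits evenly into rows of length block_len.
    n_blocks = -(-x.size // block_len)  # ceiling division
    padded = np.zeros(n_blocks * block_len)
    padded[:x.size] = x
    blocks = padded.reshape(n_blocks, block_len)

    # One block convolved with h has this length; using it as the FFT size
    # makes the circular convolution equal to the linear one.
    seg_len = block_len + h.size - 1
    H = np.fft.rfft(h, n=seg_len)

    # Convolve all blocks at once by transforming along the last axis.
    segs = np.fft.irfft(np.fft.rfft(blocks, n=seg_len, axis=-1) * H,
                        n=seg_len, axis=-1)

    # Add each block's result back at its offset; neighbouring tails overlap.
    out = np.zeros(n_blocks * block_len + h.size - 1)
    for i in range(n_blocks):
        start = i * block_len
        out[start:start + seg_len] += segs[i]
    return out[:n_out]


rng = np.random.default_rng(0)
signal = rng.standard_normal(10_000)  # long input
kernel = rng.standard_normal(31)      # short input
result = overlap_add_convolve(signal, kernel)
reference = np.convolve(signal, kernel, mode="full")
print(np.allclose(result, reference))
```

The block length is a tuning knob: a "master" function of the kind described in point 1 would have to choose it, and choose between this and a direct or single-FFT method, from the two input sizes.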
On 01.03.2018 16:40, Todd wrote:
<snip>
So you're saying this could be a possible GSoC project? Because this does sound the most interesting to me so far.

To make sure I understand this correctly:

- I would work with the two modules `signal` and `ndimage` as well as NumPy (`numpy.convolve`)?
- I would unify, redesign and extend the parts / API that deal with convolution with the goal to cover the most common use cases and minimize overlap.
- Is somebody willing to mentor this?
- Required knowledge would involve understanding different algorithms to implement convolution as well as optimization, Python, Cython, C, ...?
- How would you judge the size and difficulty of this task?

Thank you all for the feedback so far. :)

Best regards,
Lars
On Sat, Mar 3, 2018 at 12:38 AM, Lars G. <lagru@mailbox.org> wrote:
<snip>
> - How would you judge the size and difficulty of this task?
It would be a difficult project for GSoC, as a lot would depend on identifying and designing the underlying common algorithms, including APIs. That makes it a step beyond just implementing something known in advance; it requires strong background knowledge and familiarity with the relevant SciPy modules. I would be hesitant to propose it unless it could be trimmed down to just one or two functions that are well defined before the project starts.

Just as an example of how the complexity grows, NumPy convolution assumes finite sequences extended to +/- inf with zeros, whereas convolution used for interpolation and filtering will have a number of choices for edge conditions, some of which would be best handled with one of the discrete cosine transforms. I don't think anyone has sat down and figured out how to organize all that, much less proposed a roadmap to implement it.

Chuck
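The edge-condition point can be seen on a four-sample signal. The sketch below is only an illustration: `convolve_with_edges` is a hypothetical helper that pads with `numpy.pad` and then convolves, it assumes an odd-length kernel, and it does not attempt the transform-based handling mentioned above.

```python
import numpy as np


def convolve_with_edges(x, kernel, pad_mode):
    """Same-size 1-D convolution with an explicit edge condition.

    Hypothetical helper; assumes an odd-length kernel. `pad_mode` is passed
    to numpy.pad (for example "constant", "edge" or "wrap").
    """
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    half = kernel.size // 2
    padded = np.pad(x, half, mode=pad_mode)
    # "valid" keeps only positions where the kernel fully overlaps the
    # padded signal, which gives back len(x) samples.
    return np.convolve(padded, kernel, mode="valid")


x = np.array([1.0, 2.0, 3.0, 4.0])
kernel = np.ones(3) / 3.0  # three-point moving average

zeros_outside = convolve_with_edges(x, kernel, "constant")
repeat_edge = convolve_with_edges(x, kernel, "edge")
periodic = convolve_with_edges(x, kernel, "wrap")

# Zero padding reproduces NumPy's own finite-sequence assumption.
print(np.allclose(zeros_outside, np.convolve(x, kernel, mode="same")))
print(zeros_outside, repeat_edge, periodic)
```

The interior samples agree in all three cases and only the first and last differ, so a unified API would need the edge condition as an explicit parameter rather than a fixed assumption.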
On 03.03.2018 16:05, Charles R Harris wrote:
<snip>
Okay, thanks for the warning. That doesn't sound promising. I think I'll try my luck with the other options. This also makes me think that the idea of Cythonizing would suffer from similar problems.

Best regards,
Lars
On Wed, Feb 28, 2018 at 3:40 PM Lars G. <lagru@mailbox.org> wrote:
> Dear SciPy devs,
>
> I'm currently thinking about an application for this year's GSoC as well. As there already seems to be a large interest in the rotation formalism I'm trying to find another area that matches my interest and skill.
>
> I've dug up this proposal in scikit-image from GSoC 2015
> https://github.com/scikit-image/scikit-image/wiki/GSoC-2015#rewriting-scipyn...
> and judging by the state of scipy/ndimage/src/ nobody has worked on this proposal yet (feel free to correct me).
I mentored the not very successful, to put it mildly, GSoC 2015 project about cythonizing ndimage. From that experience, and further work afterwards, I no longer think Cython is the answer to ndimage's problems. The underlying C has a lot of very complicated code making lots of clever (often too clever for everyone's good) uses of pointer magic, which I honestly think are better kept in C. Porting it would at least need someone with a very deep understanding of both C and Cython and would far exceed the scope of a GSoC project, and I don't think I can commit to properly mentoring such a project this summer.

What would be nice is to replace the current nd_image.c file, which implements the Python interface to the underlying C, with a Cython implementation. That is not enough for a GSoC project, and it's not the most exciting thing to work on either. But if you want to put a full project together out of smaller, Cython-related subprojects, this could certainly be a part of it, and I wouldn't mind mentoring that subproject.

Jaime
> Alternatively I could imagine something similar for other sub-packages, e.g. scipy/signal which features many source files in C as well.
>
> So basically if there is an interest I could try to port C / Python code to Cython. What I would like to know:
>
> - Is there an interest? ;)
> - Is the original proposal in scikit-image still unfinished and are the potential mentors still interested in mentoring?
> - If there is a general interest to cythonize C or Python code during a GSoC project, which parts / sub-packages of SciPy would you prioritize?
>
> As for my current involvement with SciPy:
>
> - I've already added a small function written in Cython https://github.com/scipy/scipy/pull/8350
P.S. Have submitted a small enhancement to your code, could you take a look? https://github.com/scipy/scipy/pull/8499
> - as part of a larger PR extending the signal module https://github.com/scipy/scipy/pull/8264 which will possibly be merged this week.
> - I already cythonized slow parts of the above PR and plan to add these with new PRs after #8264 is merged.
>
> If this receives positive feedback I'd be happy to draft a more complete proposal / application based on the discussion around this.
>
> Best regards, Lars
--
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his plans for world domination.
On Wed, Feb 28, 2018 at 9:11 AM, Jaime Fernández del Río < jaime.frio@gmail.com> wrote:
> On Wed, Feb 28, 2018 at 3:40 PM Lars G. <lagru@mailbox.org> wrote:
>> Dear SciPy devs,
>>
>> I'm currently thinking about an application for this year's GSoC as well. As there already seems to be a large interest in the rotation formalism I'm trying to find another area that matches my interest and skill.
>>
>> I've dug up this proposal in scikit-image from GSoC 2015
>> https://github.com/scikit-image/scikit-image/wiki/GSoC-2015#rewriting-scipyndimage-in-cython
>> and judging by the state of scipy/ndimage/src/ nobody has worked on this proposal yet (feel free to correct me).
> I mentored the not very successful, to put it mildly, GSoC 2015 project about cythonizing ndimage. From that experience, and further work afterwards, I no longer think Cython is the answer to ndimage's problems. The underlying C has a lot of very complicated code making lots of clever (often too clever for everyone's good) uses of pointer magic, which I honestly think are better kept in C. Porting it would at least need someone with a very deep understanding of both C and Cython and would far exceed the scope of a GSoC project, and I don't think I can commit to properly mentoring such a project this summer.
>
> What would be nice is to replace the current nd_image.c file, which implements the Python interface to the underlying C, with a Cython implementation. That is not enough for a GSoC project, and it's not the most exciting thing to work on either. But if you want to put a full project together out of smaller, Cython-related subprojects, this could certainly be a part of it, and I wouldn't mind mentoring that subproject.
I think the spline bits could be vectorized and rewritten in Python without too much loss of speed.

<snip>

Chuck
On 28.02.2018 15:32, Lars G. wrote:
<snip>
Actually, considering that GSoC should be treated as a full-time job during the coding period, I must sadly pass on this. However, I want to thank you all for the feedback already given. I hope it's still useful for other potential applicants.

Best regards,
Lars
On Mon, Mar 5, 2018 at 3:02 AM, Lars G. <lagru@mailbox.org> wrote:
<snip>
> Actually, considering that GSoC should be treated as a full-time job during the coding period, I must sadly pass on this. However, I want to thank you all for the feedback already given. I hope it's still useful for other potential applicants.
Thanks Lars. I hope you do stick around part-time! Cheers, Ralf
participants (7)
- Charles R Harris
- Eric Larson
- Jaime Fernández del Río
- Lars G.
- Nikolay Mayorov
- Ralf Gommers
- Todd