On Mon, Jun 26, 2017 at 9:45 AM, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Mon, Jun 26, 2017 at 4:55 AM, Jaime Fernández del Río < jaime.frio@gmail.com> wrote:
Hi all,
I have started sending PRs for ndimage. It is my intention to keep at it through the summer, and would like to provide some context on what I am trying to achieve.
A couple of years ago I mentored a GSoC to port ndimage to Cython that didn't do much progress. If nothing else, I think I did learn quite a bit about ndimage and what makes it hard to maintain. And I think one of the big ones is lack of encapsulation in the C code: ndimage defines four "classes" in ni_support.h <https://github.com/scipy/scipy/blob/master/scipy/ndimage/src/ni_support.h> that get used throughout, namely NI_Iterator, NI_LineBuffer, NI_FilterIterator and NI_CoordinateList.
Unfortunately they are not very self contained, and e.g. to instantiate a NI_FilterIterator you typically have to:
- declare a NI_FilterIterator variable, - declare two NI_Iterator variables, one for the input array, another for the output, - declare a npy_intp pointer of offsets and assign NULL to it, - call NI_InitFilterOffsets to initialize the offsets pointer, - call NI_InitFilterIterator to initialize the filter iterator, - call NI_InitPointIterator twice, once for the input, another for the output, - after each iteration call NI_FILTER_NEXT2 to advance all three iterators, - after iteration is finished, call free on the pointer of offsets.
There is no good reason why we couldn't refactor this to being more like:
- call NI_FilterIterator_New and assign its return to a NI_FilterIterator pointer, - after each iteration call NI_FilterIterator_Next to advance all involved pointers, - after iteration is finished, call NI_FilterIterator_Delete to release any memory.
Proper encapsulation would have many benefits:
- high level functions would not be cluttered with boilerplate, and would be easier to understand and follow, - chances for memory leaks and the like would be minimized, - we could wrap those "classes" in Python and test them thoroughly, - it would make the transition to Cython for the higher level functions, much simpler.
As an example, the lowest hanging fruit in this would be #7527 <https://github.com/scipy/scipy/pull/7527>.
So open source wise this is what I would like to spend my summer on.
Awesome! ndimage definitely could use it!
Any feedback is welcome, but I would especially like to hear about:
- thoughts on the overall plan,
Sounds like a great plan. I do think that the current test suite is a bit marginal for this exercise and review may not catch subtle issues, so it would be useful to: - use scikit-image as an extra testsuite regularly - ideally find a few heavy users to test the master branch once in a while
Good points, I have e-mailed the scikit-image list to make them aware of this and to ask them for help.
- use tools like a static code analyzer and Valgrind where it makes sense.
I may need help with setting up Valgrind: how hard is it ti make it work under OSX? We should also work on having a more complete test suite for the low level iterators, probably through Python wrappers.
- reviewer availability: I would like to make this as incremental as possible, but that means many smaller interdependent PRs, which require reviewer availability,
For the next 4 weeks or so my availability will be pretty good. I'm pretty sure that I don't know as much about ndimage as you do, but that's likely true for all other current devs as well:) I think it's important to keep up with your PRs; once we start getting too far behind in reviewing the effort only goes up. I suggest not being too modest in pinging for reviews of PRs that are going to be a bottleneck or result in conflicts later on.
Ralf
- if anyone wants to join in the fun, I'm more than happy to mentor!
Jaime
-- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
-- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial. On Mon, Jun 26, 2017 at 9:45 AM, Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Mon, Jun 26, 2017 at 4:55 AM, Jaime Fernández del Río < jaime.frio@gmail.com> wrote:
Hi all,
I have started sending PRs for ndimage. It is my intention to keep at it through the summer, and would like to provide some context on what I am trying to achieve.
A couple of years ago I mentored a GSoC to port ndimage to Cython that didn't do much progress. If nothing else, I think I did learn quite a bit about ndimage and what makes it hard to maintain. And I think one of the big ones is lack of encapsulation in the C code: ndimage defines four "classes" in ni_support.h <https://github.com/scipy/scipy/blob/master/scipy/ndimage/src/ni_support.h> that get used throughout, namely NI_Iterator, NI_LineBuffer, NI_FilterIterator and NI_CoordinateList.
Unfortunately they are not very self contained, and e.g. to instantiate a NI_FilterIterator you typically have to:
- declare a NI_FilterIterator variable, - declare two NI_Iterator variables, one for the input array, another for the output, - declare a npy_intp pointer of offsets and assign NULL to it, - call NI_InitFilterOffsets to initialize the offsets pointer, - call NI_InitFilterIterator to initialize the filter iterator, - call NI_InitPointIterator twice, once for the input, another for the output, - after each iteration call NI_FILTER_NEXT2 to advance all three iterators, - after iteration is finished, call free on the pointer of offsets.
There is no good reason why we couldn't refactor this to being more like:
- call NI_FilterIterator_New and assign its return to a NI_FilterIterator pointer, - after each iteration call NI_FilterIterator_Next to advance all involved pointers, - after iteration is finished, call NI_FilterIterator_Delete to release any memory.
Proper encapsulation would have many benefits:
- high level functions would not be cluttered with boilerplate, and would be easier to understand and follow, - chances for memory leaks and the like would be minimized, - we could wrap those "classes" in Python and test them thoroughly, - it would make the transition to Cython for the higher level functions, much simpler.
As an example, the lowest hanging fruit in this would be #7527 <https://github.com/scipy/scipy/pull/7527>.
So open source wise this is what I would like to spend my summer on.
Awesome! ndimage definitely could use it!
Any feedback is welcome, but I would especially like to hear about:
- thoughts on the overall plan,
Sounds like a great plan. I do think that the current test suite is a bit marginal for this exercise and review may not catch subtle issues, so it would be useful to: - use scikit-image as an extra testsuite regularly - ideally find a few heavy users to test the master branch once in a while - use tools like a static code analyzer and Valgrind where it makes sense.
- reviewer availability: I would like to make this as incremental as possible, but that means many smaller interdependent PRs, which require reviewer availability,
For the next 4 weeks or so my availability will be pretty good. I'm
pretty sure that I don't know as much about ndimage as you do, but that's likely true for all other current devs as well:) I think it's important to keep up with your PRs; once we start getting too far behind in reviewing the effort only goes up. I suggest not being too modest in pinging for reviews of PRs that are going to be a bottleneck or result in conflicts later on.
Ralf
- if anyone wants to join in the fun, I'm more than happy to mentor!
Jaime
-- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@python.org https://mail.python.org/mailman/listinfo/scipy-dev
-- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.