[SciPy-Dev] Refactoring ndimage

Jaime Fernández del Río jaime.frio at gmail.com
Mon Jun 26 05:36:27 EDT 2017


On Mon, Jun 26, 2017 at 9:45 AM, Ralf Gommers <ralf.gommers at gmail.com>
wrote:

>
>
> On Mon, Jun 26, 2017 at 4:55 AM, Jaime Fernández del Río <
> jaime.frio at gmail.com> wrote:
>
>> Hi all,
>>
>> I have started sending PRs for ndimage. It is my intention to keep at it
>> through the summer, and would like to provide some context on what I am
>> trying to achieve.
>>
>> A couple of years ago I mentored a GSoC to port ndimage to Cython that
>> didn't do much progress. If nothing else, I think I did learn quite a bit
>> about ndimage and what makes it hard to maintain. And I think one of the
>> big ones is lack of encapsulation in the C code: ndimage defines four
>> "classes" in ni_support.h
>> <https://github.com/scipy/scipy/blob/master/scipy/ndimage/src/ni_support.h> that
>> get used throughout, namely NI_Iterator, NI_LineBuffer, NI_FilterIterator
>> and NI_CoordinateList.
>>
>> Unfortunately they are not very self contained, and e.g. to instantiate a
>> NI_FilterIterator you typically have to:
>>
>>    - declare a NI_FilterIterator variable,
>>    - declare two NI_Iterator variables, one for the input array, another
>>    for the output,
>>    - declare a npy_intp pointer of offsets and assign NULL to it,
>>    - call NI_InitFilterOffsets to initialize the offsets pointer,
>>    - call NI_InitFilterIterator to initialize the filter iterator,
>>    - call NI_InitPointIterator twice, once for the input, another for
>>    the output,
>>    - after each iteration call NI_FILTER_NEXT2 to advance all three
>>    iterators,
>>    - after iteration is finished, call free on the pointer of offsets.
>>
>> There is no good reason why we couldn't refactor this to being more like:
>>
>>    - call NI_FilterIterator_New and assign its return to a
>>    NI_FilterIterator pointer,
>>    - after each iteration call NI_FilterIterator_Next to advance all
>>    involved pointers,
>>    - after iteration is finished, call NI_FilterIterator_Delete to
>>    release any memory.
>>
>> Proper encapsulation would have many benefits:
>>
>>    - high level functions would not be cluttered with boilerplate, and
>>    would be easier to understand and follow,
>>    - chances for memory leaks and the like would be minimized,
>>    - we could wrap those "classes" in Python and test them thoroughly,
>>    - it would make the transition to Cython for the higher level
>>    functions, much simpler.
>>
>> As an example, the lowest hanging fruit in this would be #7527
>> <https://github.com/scipy/scipy/pull/7527>.
>>
>> So open source wise this is what I would like to spend my summer on.
>>
>
> Awesome! ndimage definitely could use it!
>
>
>> Any feedback is welcome, but I would especially like to hear about:
>>
>>    - thoughts on the overall plan,
>>
>>
> Sounds like a great plan. I do think that the current test suite is a bit
> marginal for this exercise and review may not catch subtle issues, so it
> would be useful to:
> - use scikit-image as an extra testsuite regularly
> - ideally find a few heavy users to test the master branch once in a while
>

Good points, I have e-mailed the scikit-image list to make them aware of
this and to ask them for help.


> - use tools like a static code analyzer and Valgrind where it makes sense.
>

I may need help with setting up Valgrind: how hard is it ti make it work
under OSX?

We should also work on having a more complete test suite for the low level
iterators, probably through Python wrappers.


>
>>    - reviewer availability: I would like to make this as incremental as
>>    possible, but that means many smaller interdependent PRs, which require
>>    reviewer availability,
>>
>> For the next 4 weeks or so my availability will be pretty good. I'm
> pretty sure that I don't know as much about ndimage as you do, but that's
> likely true for all other current devs as well:) I think it's important to
> keep up with your PRs; once we start getting too far behind in reviewing
> the effort only goes up. I suggest not being too modest in pinging for
> reviews of PRs that are going to be a bottleneck or result in conflicts
> later on.
>
> Ralf
>
>
>
>>
>>    - if anyone wants to join in the fun, I'm more than happy to mentor!
>>
>> Jaime
>>
>> --
>> (\__/)
>> ( O.o)
>> ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
>> de dominación mundial.
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
>>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
>


-- 
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.

On Mon, Jun 26, 2017 at 9:45 AM, Ralf Gommers <ralf.gommers at gmail.com>
wrote:

>
>
> On Mon, Jun 26, 2017 at 4:55 AM, Jaime Fernández del Río <
> jaime.frio at gmail.com> wrote:
>
>> Hi all,
>>
>> I have started sending PRs for ndimage. It is my intention to keep at it
>> through the summer, and would like to provide some context on what I am
>> trying to achieve.
>>
>> A couple of years ago I mentored a GSoC to port ndimage to Cython that
>> didn't do much progress. If nothing else, I think I did learn quite a bit
>> about ndimage and what makes it hard to maintain. And I think one of the
>> big ones is lack of encapsulation in the C code: ndimage defines four
>> "classes" in ni_support.h
>> <https://github.com/scipy/scipy/blob/master/scipy/ndimage/src/ni_support.h> that
>> get used throughout, namely NI_Iterator, NI_LineBuffer, NI_FilterIterator
>> and NI_CoordinateList.
>>
>> Unfortunately they are not very self contained, and e.g. to instantiate a
>> NI_FilterIterator you typically have to:
>>
>>    - declare a NI_FilterIterator variable,
>>    - declare two NI_Iterator variables, one for the input array, another
>>    for the output,
>>    - declare a npy_intp pointer of offsets and assign NULL to it,
>>    - call NI_InitFilterOffsets to initialize the offsets pointer,
>>    - call NI_InitFilterIterator to initialize the filter iterator,
>>    - call NI_InitPointIterator twice, once for the input, another for
>>    the output,
>>    - after each iteration call NI_FILTER_NEXT2 to advance all three
>>    iterators,
>>    - after iteration is finished, call free on the pointer of offsets.
>>
>> There is no good reason why we couldn't refactor this to being more like:
>>
>>    - call NI_FilterIterator_New and assign its return to a
>>    NI_FilterIterator pointer,
>>    - after each iteration call NI_FilterIterator_Next to advance all
>>    involved pointers,
>>    - after iteration is finished, call NI_FilterIterator_Delete to
>>    release any memory.
>>
>> Proper encapsulation would have many benefits:
>>
>>    - high level functions would not be cluttered with boilerplate, and
>>    would be easier to understand and follow,
>>    - chances for memory leaks and the like would be minimized,
>>    - we could wrap those "classes" in Python and test them thoroughly,
>>    - it would make the transition to Cython for the higher level
>>    functions, much simpler.
>>
>> As an example, the lowest hanging fruit in this would be #7527
>> <https://github.com/scipy/scipy/pull/7527>.
>>
>> So open source wise this is what I would like to spend my summer on.
>>
>
> Awesome! ndimage definitely could use it!
>
>
>> Any feedback is welcome, but I would especially like to hear about:
>>
>>    - thoughts on the overall plan,
>>
>>
> Sounds like a great plan. I do think that the current test suite is a bit
> marginal for this exercise and review may not catch subtle issues, so it
> would be useful to:
> - use scikit-image as an extra testsuite regularly
> - ideally find a few heavy users to test the master branch once in a while
> - use tools like a static code analyzer and Valgrind where it makes sense.
>
>>
>>    - reviewer availability: I would like to make this as incremental as
>>    possible, but that means many smaller interdependent PRs, which require
>>    reviewer availability,
>>
>> For the next 4 weeks or so my availability will be pretty good. I'm
> pretty sure that I don't know as much about ndimage as you do, but that's
> likely true for all other current devs as well:) I think it's important to
> keep up with your PRs; once we start getting too far behind in reviewing
> the effort only goes up. I suggest not being too modest in pinging for
> reviews of PRs that are going to be a bottleneck or result in conflicts
> later on.
>
> Ralf
>
>
>
>>
>>    - if anyone wants to join in the fun, I'm more than happy to mentor!
>>
>> Jaime
>>
>> --
>> (\__/)
>> ( O.o)
>> ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
>> de dominación mundial.
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
>>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
>


-- 
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20170626/f9147fc5/attachment.html>


More information about the SciPy-Dev mailing list