[SciPy-Dev] Refactoring ndimage
Ralf Gommers
ralf.gommers at gmail.com
Mon Jun 26 07:17:54 EDT 2017
On Mon, Jun 26, 2017 at 9:36 PM, Jaime Fernández del Río <
jaime.frio at gmail.com> wrote:
>
> On Mon, Jun 26, 2017 at 9:45 AM, Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
>
>>
>>
>> On Mon, Jun 26, 2017 at 4:55 AM, Jaime Fernández del Río <
>> jaime.frio at gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I have started sending PRs for ndimage. It is my intention to keep at it
>>> through the summer, and would like to provide some context on what I am
>>> trying to achieve.
>>>
>>> A couple of years ago I mentored a GSoC to port ndimage to Cython that
>>> didn't do much progress. If nothing else, I think I did learn quite a bit
>>> about ndimage and what makes it hard to maintain. And I think one of the
>>> big ones is lack of encapsulation in the C code: ndimage defines four
>>> "classes" in ni_support.h
>>> <https://github.com/scipy/scipy/blob/master/scipy/ndimage/src/ni_support.h> that
>>> get used throughout, namely NI_Iterator, NI_LineBuffer, NI_FilterIterator
>>> and NI_CoordinateList.
>>>
>>> Unfortunately they are not very self contained, and e.g. to instantiate
>>> a NI_FilterIterator you typically have to:
>>>
>>> - declare a NI_FilterIterator variable,
>>> - declare two NI_Iterator variables, one for the input array,
>>> another for the output,
>>> - declare a npy_intp pointer of offsets and assign NULL to it,
>>> - call NI_InitFilterOffsets to initialize the offsets pointer,
>>> - call NI_InitFilterIterator to initialize the filter iterator,
>>> - call NI_InitPointIterator twice, once for the input, another for
>>> the output,
>>> - after each iteration call NI_FILTER_NEXT2 to advance all three
>>> iterators,
>>> - after iteration is finished, call free on the pointer of offsets.
>>>
>>> There is no good reason why we couldn't refactor this to being more like:
>>>
>>> - call NI_FilterIterator_New and assign its return to a
>>> NI_FilterIterator pointer,
>>> - after each iteration call NI_FilterIterator_Next to advance all
>>> involved pointers,
>>> - after iteration is finished, call NI_FilterIterator_Delete to
>>> release any memory.
>>>
>>> Proper encapsulation would have many benefits:
>>>
>>> - high level functions would not be cluttered with boilerplate, and
>>> would be easier to understand and follow,
>>> - chances for memory leaks and the like would be minimized,
>>> - we could wrap those "classes" in Python and test them thoroughly,
>>> - it would make the transition to Cython for the higher level
>>> functions, much simpler.
>>>
>>> As an example, the lowest hanging fruit in this would be #7527
>>> <https://github.com/scipy/scipy/pull/7527>.
>>>
>>> So open source wise this is what I would like to spend my summer on.
>>>
>>
>> Awesome! ndimage definitely could use it!
>>
>>
>>> Any feedback is welcome, but I would especially like to hear about:
>>>
>>> - thoughts on the overall plan,
>>>
>>>
>> Sounds like a great plan. I do think that the current test suite is a bit
>> marginal for this exercise and review may not catch subtle issues, so it
>> would be useful to:
>> - use scikit-image as an extra testsuite regularly
>> - ideally find a few heavy users to test the master branch once in a while
>>
>
> Good points, I have e-mailed the scikit-image list to make them aware of
> this and to ask them for help.
>
>
>> - use tools like a static code analyzer and Valgrind where it makes sense.
>>
>
> I may need help with setting up Valgrind: how hard is it ti make it work
> under OSX?
>
I managed to set it up once on OS X, don't remember it being too painful.
Here's a recent recipe for it:
http://julianguyen.org/installing-valgrind-on-os-x-el-capitan/
>
> We should also work on having a more complete test suite for the low level
> iterators, probably through Python wrappers.
>
Makes sense.
Ralf
>
>
>>
>>> - reviewer availability: I would like to make this as incremental as
>>> possible, but that means many smaller interdependent PRs, which require
>>> reviewer availability,
>>>
>>> For the next 4 weeks or so my availability will be pretty good. I'm
>> pretty sure that I don't know as much about ndimage as you do, but that's
>> likely true for all other current devs as well:) I think it's important to
>> keep up with your PRs; once we start getting too far behind in reviewing
>> the effort only goes up. I suggest not being too modest in pinging for
>> reviews of PRs that are going to be a bottleneck or result in conflicts
>> later on.
>>
>> Ralf
>>
>>
>>
>>>
>>> - if anyone wants to join in the fun, I'm more than happy to mentor!
>>>
>>> Jaime
>>>
>>> --
>>> (\__/)
>>> ( O.o)
>>> ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus
>>> planes de dominación mundial.
>>>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>
>>>
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
>>
>
>
> --
> (\__/)
> ( O.o)
> ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
> de dominación mundial.
>
> On Mon, Jun 26, 2017 at 9:45 AM, Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
>
>>
>>
>> On Mon, Jun 26, 2017 at 4:55 AM, Jaime Fernández del Río <
>> jaime.frio at gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I have started sending PRs for ndimage. It is my intention to keep at it
>>> through the summer, and would like to provide some context on what I am
>>> trying to achieve.
>>>
>>> A couple of years ago I mentored a GSoC to port ndimage to Cython that
>>> didn't do much progress. If nothing else, I think I did learn quite a bit
>>> about ndimage and what makes it hard to maintain. And I think one of the
>>> big ones is lack of encapsulation in the C code: ndimage defines four
>>> "classes" in ni_support.h
>>> <https://github.com/scipy/scipy/blob/master/scipy/ndimage/src/ni_support.h> that
>>> get used throughout, namely NI_Iterator, NI_LineBuffer, NI_FilterIterator
>>> and NI_CoordinateList.
>>>
>>> Unfortunately they are not very self contained, and e.g. to instantiate
>>> a NI_FilterIterator you typically have to:
>>>
>>> - declare a NI_FilterIterator variable,
>>> - declare two NI_Iterator variables, one for the input array,
>>> another for the output,
>>> - declare a npy_intp pointer of offsets and assign NULL to it,
>>> - call NI_InitFilterOffsets to initialize the offsets pointer,
>>> - call NI_InitFilterIterator to initialize the filter iterator,
>>> - call NI_InitPointIterator twice, once for the input, another for
>>> the output,
>>> - after each iteration call NI_FILTER_NEXT2 to advance all three
>>> iterators,
>>> - after iteration is finished, call free on the pointer of offsets.
>>>
>>> There is no good reason why we couldn't refactor this to being more like:
>>>
>>> - call NI_FilterIterator_New and assign its return to a
>>> NI_FilterIterator pointer,
>>> - after each iteration call NI_FilterIterator_Next to advance all
>>> involved pointers,
>>> - after iteration is finished, call NI_FilterIterator_Delete to
>>> release any memory.
>>>
>>> Proper encapsulation would have many benefits:
>>>
>>> - high level functions would not be cluttered with boilerplate, and
>>> would be easier to understand and follow,
>>> - chances for memory leaks and the like would be minimized,
>>> - we could wrap those "classes" in Python and test them thoroughly,
>>> - it would make the transition to Cython for the higher level
>>> functions, much simpler.
>>>
>>> As an example, the lowest hanging fruit in this would be #7527
>>> <https://github.com/scipy/scipy/pull/7527>.
>>>
>>> So open source wise this is what I would like to spend my summer on.
>>>
>>
>> Awesome! ndimage definitely could use it!
>>
>>
>>> Any feedback is welcome, but I would especially like to hear about:
>>>
>>> - thoughts on the overall plan,
>>>
>>>
>> Sounds like a great plan. I do think that the current test suite is a bit
>> marginal for this exercise and review may not catch subtle issues, so it
>> would be useful to:
>> - use scikit-image as an extra testsuite regularly
>> - ideally find a few heavy users to test the master branch once in a while
>> - use tools like a static code analyzer and Valgrind where it makes sense.
>>
>>>
>>> - reviewer availability: I would like to make this as incremental as
>>> possible, but that means many smaller interdependent PRs, which require
>>> reviewer availability,
>>>
>>> For the next 4 weeks or so my availability will be pretty good. I'm
>> pretty sure that I don't know as much about ndimage as you do, but that's
>> likely true for all other current devs as well:) I think it's important to
>> keep up with your PRs; once we start getting too far behind in reviewing
>> the effort only goes up. I suggest not being too modest in pinging for
>> reviews of PRs that are going to be a bottleneck or result in conflicts
>> later on.
>>
>> Ralf
>>
>>
>>
>>>
>>> - if anyone wants to join in the fun, I'm more than happy to mentor!
>>>
>>> Jaime
>>>
>>> --
>>> (\__/)
>>> ( O.o)
>>> ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus
>>> planes de dominación mundial.
>>>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>
>>>
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
>>
>
>
> --
> (\__/)
> ( O.o)
> ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
> de dominación mundial.
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20170626/3680b654/attachment-0001.html>
More information about the SciPy-Dev
mailing list