Hey everyone, here are my thoughts on some of Mark's and Lars's points.
Personally, I think that the ND requirement is too big to impose on contributions and on reviewers.
I agree with you here, Mark. If we always think "this code will have to be nD-ready to be accepted", we'll lose contributions, or lots of dev time on each contribution.
Often, people just need 2D operations.
(...) ND is actually very computationally expensive. In many cases, people might have an information-dense "2D image" and slight variations on other dimensions. This means that using a full-blown 2D algorithm is just inefficient on higher dimensions (3D, time, or color).
IMHO, since we are a scientific package, this is not a sufficient argument for dropping nD support from the algorithms. I'd guess there are people hoping that we will implement specific nD things someday. And if nD feels hairy, keep in mind that several of these algorithms won't need to go past 3D (plus color and/or time).
I think what is important is to keep the API forward-looking for ND. We can do things like accept tuples for parameters, instead of "x_center", "y_center", or "r_center", "c_center". We can also just throw errors when people pass in higher-dimensional images, saying that it isn't implemented and contributions are welcome.
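To make that concrete, a forward-looking signature could look something like this; the function and parameter names are made up for illustration, not an existing skimage API:

    import numpy as np

    def find_disk(image, center=(0, 0), radius=5):
        # Hypothetical function: `center` is a single (row, col, ...) tuple
        # instead of separate r_center/c_center arguments, so the signature
        # already accommodates nD input.
        image = np.asarray(image)
        if image.ndim != 2:
            # nD support can be added later without changing the signature.
            raise NotImplementedError(
                "Only 2D images are supported for now; "
                "nD contributions are welcome."
            )
        if len(center) != image.ndim:
            raise ValueError("`center` needs one coordinate per image axis.")
        rr, cc = np.indices(image.shape)
        mask = (rr - center[0]) ** 2 + (cc - center[1]) ** 2 <= radius ** 2
        return mask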
I think this is critical and really like this compromise. Don't do more work than is actually needed while not making future work harder. Equally critical to me is that whether a function supports ND or only 2D is clearly documented. Maybe after a parameter's type declaration in the docstring?
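As an illustration (just one possible style, not an established skimage convention), the note could sit right after the parameter's type in the numpydoc block:

    def local_threshold(image, block_size=35):
        """Toy example showing where a dimensionality note could live.

        Parameters
        ----------
        image : ndarray
            Input image. 2D only; nD support is not implemented yet.
        block_size : int
            Width of the local neighbourhood, in pixels.
        """
        ...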
I guess these are all good solutions for us right now. I'm sure that we'll break, fix, and clean lots of stuff/code/API until we reach 1.0 (and that is okay™). These problems will be tackled naturally along the way.
Optimizing for the different use cases is just tricky, and won't be done correctly if contributors are asked to include something they have no immediate need for.
We can help in some of these cases. If we (the core group) can't, we can: 1. discuss whether the algorithm is ready to be included as-is and improved from there; 2. ask the wider community for help.
In that vein, wasn't there a discussion about writing a guide on ND-algorithms (e.g. the raveled approach used in watershed, local_maxima, flood_fill)? Was there any progress on this? I think this could be really useful to the community and a good resource for new contributors regardless of how skimage's policy on this turns out.
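For readers who haven't seen it, the core trick of the raveled approach is to precompute the neighbor offsets in the flat (1D) view of the array, so the same loop works for any number of dimensions. A rough sketch of the idea (not the actual skimage code):

    import numpy as np
    from scipy import ndimage as ndi

    def raveled_neighbor_offsets(shape, connectivity=1):
        # Flat-index offsets of a pixel's neighbors in a C-contiguous
        # array of the given shape, valid for any number of dimensions.
        ndim = len(shape)
        footprint = ndi.generate_binary_structure(ndim, connectivity)
        center = np.array(footprint.shape) // 2
        # Relative nD offsets of the neighbors, e.g. (-1, 0), (0, -1), ...
        deltas = np.argwhere(footprint) - center
        deltas = deltas[np.any(deltas != 0, axis=1)]  # drop the center pixel
        # Contribution of each axis to the raveled index (C order).
        multipliers = np.append(np.cumprod(shape[:0:-1])[::-1], 1)
        return deltas @ multipliers

    # For a 2D image of shape (512, 512) the 4-connected neighbors sit at
    # flat offsets [-512, -1, 1, 512]; the same code works unchanged in 3D.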
+1. Or even contributors aiming only at enhancing our 3D/nD capabilities :)

Kind regards, and have y'all a nice week,
Alex

On Mon, 2019-08-05 at 09:49 +0200, Lars Grueter wrote:
On 04/08/2019 18:38, Mark Harfouche wrote:
There has been some discussion about having requirements on new contributions being compatible with ND images. This has been discussed in a few PRs on several occasions and I think it warrants a discussion on the mailing list (though I'm the first to admit, I get too many emails and seldom check mailing lists). Feel free to copy paste this as an issue on GitHub.
Personally, I think that the ND requirement is too big to impose on contributions and on reviewers.
Often, people just need 2D operations.
I think what is important is to keep the API forward-looking for ND. We can do things like accept tuples for parameters, instead of "x_center", "y_center", or "r_center", "c_center". We can also just throw errors when people pass in higher-dimensional images, saying that it isn't implemented and contributions are welcome.
I think this is critical and really like this compromise. Don't do more work than is actually needed while not making future work harder. Equally critical to me is that whether a function supports ND or only 2D is clearly documented. Maybe after a parameter's type declaration in the docstring?
The second aspect is that ND is actually very computationally expensive. In many cases, people might have an information-dense "2D image" and slight variations on other dimensions. This means that using a full-blown 2D algorithm is just inefficient on higher dimensions (3D, time, or color).
Optimizing for the different use cases is just tricky, and won't be done correctly if contributors are asked to include something they have no immediate need for.
I'm not sure I can follow you here. An ND-algorithm isn't inherently slower for a 2D-problem than a 2D-algorithm. Agreed, it can be tricky to integrate optimizations and shortcuts for the 2D-case into an ND-algorithm, but it's certainly doable. I guess that's still a point in favor of allowing 2D-only. :)
Another approach is to keep two sub-functions: a general one for the ND-case and an optimized one for the 2D-case, although having two implementations for one problem may rarely be worth the maintenance cost.
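(A hypothetical sketch of that layout, with made-up names and scipy's uniform_filter standing in for both code paths: one public entry point that dispatches on ndim.)

    import numpy as np
    from scipy import ndimage as ndi

    def smooth(image, size=3):
        # Hypothetical public function: one entry point, two private kernels.
        image = np.asarray(image)
        if image.ndim == 2:
            return _smooth_2d(image, size)  # 2D-specific shortcut
        return _smooth_nd(image, size)      # general nD implementation

    def _smooth_2d(image, size):
        # Stand-in for an optimized 2D-only code path; here it simply calls
        # the same filter as the nD version.
        return ndi.uniform_filter(image, size)

    def _smooth_nd(image, size):
        # Stand-in for the general nD code path.
        return ndi.uniform_filter(image, size)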
In my limited experience the algorithmic difference between 1D and 2D is more significant than between 2D and ND.
In that vein, wasn't there a discussion about writing a guide on ND-algorithms (e.g. the raveled approach used in watershed, local_maxima, flood_fill)? Was there any progress on this? I think this could be really useful to the community and a good resource for new contributors regardless of how skimage's policy on this turns out.
Finally, ND is just weird. Have a listen to this crazy video:
:D My favorite explanation is this one: https://youtu.be/zwAD6dRSVyI
Lars
--
Dr. Alexandre de Siqueira
Berkeley Institute for Data Science - BIDS
190 Doe Library
University of California, Berkeley
Berkeley, CA 94720
United States

Lattes CV: 3936721630855880
ORCID: 0000-0003-1320-4347
Github: alexandrejaguar
Skype: alexandrejaguar