Hey everyone, here are my thoughts on some of Mark's and Lars's points.
Personally, I think that the ND requirement is too big to impose on contributions and on reviewers.
I agree with you here, Mark. If we always think "this code will have to be nD-ready to be accepted", we'll lose contributions, or lots of dev time on each contribution.
Often, people just need 2D operations.
(...) ND is actually very computationally expensive. In many cases, people might have an information-dense "2D image" and slight variations on other dimensions. This means that using a full-blown 2D algorithm is just inefficient on higher dimensions (3D, time, or color).
IMHO, since we are a scientific package, this is not a sufficient argument for dropping nD support from the algorithms. I'd guess there are people hoping that we will implement specific nD things someday. And if nD feels hairy, keep in mind that several of these algorithms won't need to go past 3D (plus color and/or time).
I think what is important is to keep the API forward-looking for ND. We can do things like accept tuples for parameters, instead of "x_center", "y_center", or "r_center", "c_center". We can also just throw errors when people pass in higher-dimensional images, saying that it isn't implemented and contributions are welcome.
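To make that concrete, a forward-looking signature could look something like this; the function and parameter names are made up for illustration, not an existing skimage API:

    import numpy as np

    def find_disk(image, center=(0, 0), radius=5):
        # Hypothetical function: `center` is a single (row, col, ...) tuple
        # instead of separate r_center/c_center arguments, so the signature
        # already accommodates nD input.
        image = np.asarray(image)
        if image.ndim != 2:
            # nD support can be added later without changing the signature.
            raise NotImplementedError(
                "Only 2D images are supported for now; "
                "nD contributions are welcome."
            )
        if len(center) != image.ndim:
            raise ValueError("`center` needs one coordinate per image axis.")
        rr, cc = np.indices(image.shape)
        mask = (rr - center[0]) ** 2 + (cc - center[1]) ** 2 <= radius ** 2
        return mask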
I think this is critical and really like this compromise. Don't do more work than is actually needed while not making future work harder. Equally critical to me is that whether a function supports ND or only 2D is clearly documented. Maybe after a parameter's type declaration in the docstring?
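As an illustration (just one possible style, not an established skimage convention), the note could sit right after the parameter's type in the numpydoc block:

    def local_threshold(image, block_size=35):
        """Toy example showing where a dimensionality note could live.

        Parameters
        ----------
        image : ndarray
            Input image. 2D only; nD support is not implemented yet.
        block_size : int
            Width of the local neighbourhood, in pixels.
        """
        ...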
I guess these are all good solutions for us right now. I'm sure that we'll break, fix, and clean lots of stuff/code/API until we reach 1.0 (and that is okay™). These problems will be tackled naturally along the way.
Optimizing for the different use cases is just tricky, and won't be done correctly if contributors are asked to include something they have no immediate need for.
We can help in some of these cases. If we (the core group) can't, we can: 1. discuss whether the algorithm is ready to be included as-is and improved from there; 2. ask the wider community for help.
In that vein, wasn't there a discussion about writing a guide on ND-algorithms (e.g. the raveled approach used in watershed, local_maxima, flood_fill)? Was there any progress on this? I think this could be really useful to the community and a good resource for new contributors regardless of how skimage's policy on this turns out.
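For readers who haven't seen it, the core trick of the raveled approach is to precompute the neighbor offsets in the flat (1D) view of the array, so the same loop works for any number of dimensions. A rough sketch of the idea (not the actual skimage code):

    import numpy as np
    from scipy import ndimage as ndi

    def raveled_neighbor_offsets(shape, connectivity=1):
        # Flat-index offsets of a pixel's neighbors in a C-contiguous
        # array of the given shape, valid for any number of dimensions.
        ndim = len(shape)
        footprint = ndi.generate_binary_structure(ndim, connectivity)
        center = np.array(footprint.shape) // 2
        # Relative nD offsets of the neighbors, e.g. (-1, 0), (0, -1), ...
        deltas = np.argwhere(footprint) - center
        deltas = deltas[np.any(deltas != 0, axis=1)]  # drop the center pixel
        # Contribution of each axis to the raveled index (C order).
        multipliers = np.append(np.cumprod(shape[:0:-1])[::-1], 1)
        return deltas @ multipliers

    # For a 2D image of shape (512, 512) the 4-connected neighbors sit at
    # flat offsets [-512, -1, 1, 512]; the same code works unchanged in 3D.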
+1. Or even contributors aiming only at enhancing our 3D/nD capabilities :)

Kind regards, and have y'all a nice week,
Alex

On Mon, 2019-08-05 at 09:49 +0200, Lars Grueter wrote:
On 04/08/2019 18:38, Mark Harfouche wrote:
There has been some discussion about having requirements on new contributions being compatible with ND images. This has been discussed in a few PRs on several occasions and I think it warrants a discussion on the mailing list (though I'm the first to admit, I get too many emails and seldom check mailing lists). Feel free to copy paste this as an issue on GitHub.
Personally, I think that the ND requirement is too big to impose on contributions and on reviewers.
Often, people just need 2D operations.
I think what is important is to keep the API forward-looking for ND. We can do things like accept tuples for parameters, instead of "x_center", "y_center", or "r_center", "c_center". We can also just throw errors when people pass in higher-dimensional images, saying that it isn't implemented and contributions are welcome.
I think this is critical and really like this compromise. Don't do more work than is actually needed while not making future work harder. Equally critical to me is that whether a function supports ND or only 2D is clearly documented. Maybe after a parameter's type declaration in the docstring?
The second aspect is that ND is actually very computationally expensive. In many cases, people might have an information-dense "2D image" and slight variations on other dimensions. This means that using a full-blown 2D algorithm is just inefficient on higher dimensions (3D, time, or color).
Optimizing for the different use cases is just tricky, and won't be done correctly if contributors are asked to include something they have no immediate need for.
I'm not sure I can follow you here. An ND-algorithm isn't inherently slower for a 2D-problem than a 2D-algorithm. Agreed, it can be tricky to integrate optimizations and shortcuts for the 2D-case into an ND-algorithm, but it's certainly doable. I guess that's still a point in favor of allowing 2D-only. :)
Another approach is to keep two sub-functions: a general one for the ND-case and an optimized one for the 2D-case, although having two implementations for one problem may rarely be worth the maintenance cost.
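(A hypothetical sketch of that layout, with made-up names and scipy's uniform_filter standing in for both code paths: one public entry point that dispatches on ndim.)

    import numpy as np
    from scipy import ndimage as ndi

    def smooth(image, size=3):
        # Hypothetical public function: one entry point, two private kernels.
        image = np.asarray(image)
        if image.ndim == 2:
            return _smooth_2d(image, size)  # 2D-specific shortcut
        return _smooth_nd(image, size)      # general nD implementation

    def _smooth_2d(image, size):
        # Stand-in for an optimized 2D-only code path; here it simply calls
        # the same filter as the nD version.
        return ndi.uniform_filter(image, size)

    def _smooth_nd(image, size):
        # Stand-in for the general nD code path.
        return ndi.uniform_filter(image, size)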
In my limited experience the algorithmic difference between 1D and 2D is more significant than between 2D and ND.
In that vein, wasn't there a discussion about writing a guide on ND-algorithms (e.g. the raveled approach used in watershed, local_maxima, flood_fill)? Was there any progress on this? I think this could be really useful to the community and a good resource for new contributors regardless of how skimage's policy on this turns out.
Finally, ND is just weird. Have a listen to this crazy video:
:D My favorite explanation is this one: https://youtu.be/zwAD6dRSVyI
Lars
--
Dr. Alexandre de Siqueira
Berkeley Institute for Data Science - BIDS
190 Doe Library
University of California, Berkeley
Berkeley, CA 94720
United States

Lattes CV: 3936721630855880
ORCID: 0000-0003-1320-4347
Github: alexandrejaguar
Skype: alexandrejaguar