Image class

Josh Warner silvertrumpet999 at gmail.com
Tue Sep 2 22:16:21 EDT 2014


I see this a little differently, looking at it from the viewpoint of a 
scientific user/new contributor.

One of the biggest reasons scikit-image has been able to grow so well, so 
rapidly, is that the barrier to entry is low for new contributors. We 
operate as close to the data as possible. People using Python understand 
NumPy arrays. Arrays are inherently familiar to those with backgrounds like 
Matlab, which is likely one of, if not the, largest population we serve. We 
want these people to be able to contribute without massive barriers to entry.

If we used a custom wrapper class, even a light one, I for one would likely 
never have contributed to this package in the first place. In order to 
understand what was even being fed in, I would have had to dig up that 
wrapper, figure out what it actually contains and where my data is in it, 
think about how to react given different inputs, and so on. This is a 
tremendously daunting mental task.
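
To make the contrast concrete, here is a minimal sketch in Python. The 
names WrappedImage, mean_intensity and mean_intensity_wrapped are 
hypothetical, invented purely for illustration; none of this is actual or 
proposed scikit-image API. It only shows the extra unpacking step a 
function picks up once it may be handed something other than a plain array.

```python
import numpy as np


def mean_intensity(image):
    """Operate directly on a plain NumPy array; nothing to unpack."""
    return float(np.mean(image))


class WrappedImage:
    """Hypothetical light wrapper: pixel data plus free-form metadata."""

    def __init__(self, data, metadata=None):
        self.data = np.asarray(data)
        self.metadata = dict(metadata or {})


def mean_intensity_wrapped(image):
    """Same computation, but it must first work out what it was handed."""
    if isinstance(image, WrappedImage):
        data = image.data          # the reader has to know where the pixels live
    else:
        data = np.asarray(image)   # still has to accept plain arrays
    return float(np.mean(data))


pixels = np.array([[0.0, 1.0], [2.0, 3.0]])
plain_result = mean_intensity(pixels)
wrapped_result = mean_intensity_wrapped(
    WrappedImage(pixels, {"spacing": (1.0, 1.0)})  # "spacing" key is an assumption
)
```

Both calls compute the same number; the second version simply has one more 
thing for a new reader to look up before the one line that matters.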

In short, moving to a new scheme would be, in my humble opinion, a 
disaster. Imagine having to explain the metadata structure of the 
hypothetical scikit-image construct to everyone at every SciPy Sprint, 
every time, before they could work with the codebase. Imagine users who 
want to know what the code is doing inspecting our source, seeing methods 
called which aren't standard NumPy methods, wondering what is going on, and 
having to dig through several other unclearly linked files in our repo to 
figure that out - if they even get that far. Many have tried to solve the 
image metadata/tagging problem with few real successes; either it ends up 
too extensible (no 1:1 mapping of tag to meaning, too much info) or too 
restrictive (whoops, no 3D or new modality). A perfect system gets devised 
at first, then edge cases become apparent, followed by workarounds, then 
someone comes up with a new system that includes everything to date, the 
API breaks to support it, repeat with new cases... if nothing else, this is 
a massive support headache.

From the big-picture view, *we need to decide what we are, and what we want 
to become*. To date, scikit-image is a powerful library of tools. We don't 
purport to hold users' hands through the process of implementation, and we 
never have. Sometimes the tools are a bit rough around the edges, and they 
aren't always as easy to use as they could be. But, in return, they end up 
exposing the maximum amount of functionality to end users. At the risk of a 
very stretched metaphor, the current state is a bit like Linux.

Towing around loads of metadata just so we can have complex, custom, 
branching paths behind every algorithm, depending on what happens to be 
fed in: that's hand-holding. That's the Apple/OS X method. Yes, our 
package may gain a broader userbase, and certain features in examples, etc. 
might be more elegant. But real-world uses which don't follow the examples 
will be impeded, and more importantly we *will* alienate future potential 
contributors. 

I, for one, do not fancy implementing or maintaining the latter model. I 
fear for the future of the package even if we did implement it cleanly. 
There are aspects of the package where *limited use* of a custom wrapper 
would likely be reasonable, such as certain more public-facing areas (e.g. 
the Viewer), but I see these as the exception rather than the rule.

My opinions on this issue are strong, but I recognize I'm only one voice. 
The most important thing we need to reach consensus on is the path forward, 
as noted above in bold. Up until now it's been rather clear, but wrapping 
NumPy ndarrays would start us down a very different road, one on which I 
fear we could easily lose our way.

Josh

On Tuesday, September 2, 2014 3:52:34 PM UTC+1, Juan Nunez-Iglesias wrote:
>
> Cool, looks like there's lots of discussion to be had about this, but all 
> positive — I had expected more pushback, for some reason! =)
>
> @stefanv:
> > Do we have some use cases for this?
>
> Spacing and masks, as you mentioned.
>
> Spatial resolutions, so we can deal with images acquired at different 
> resolutions, e.g. registering light and electron microscopy images of the 
> same sample. Also, as in Tim's talk, to allow drawing of scale bars on 
> displayed images.
>
> Image orientation and absolute coordinates, which help with registration 
> and montaging.
>
> Image acquisition date, which could be important e.g. to generate time 
> series.
>
> That's just off the top of my head, I'm sure there's more!
>
> @stefanv @tacaswell @tonysyu
> > Please find out how, then we can fix up the io.Image class!
>
> Tom's link <https://github.com/soft-matter/pims/blob/master/pims/frame.py> 
> to the solution in PIMS reminds me that it was indeed Dan who pointed out 
> that they'd solved the scalar issue. 
>
> @glyg
> > I would suggest relying on OME.XML
>
> I think that might be a little too heavy for our uses, but I might be 
> wrong. Either way, it's probably possible to have a compatibility layer 
> between XML and whatever use cases we want to follow. (e.g. Automatically 
> reading out "PhysicalSizeX/Y/Z" tags and converting those to a "spacing" 
> tag.)
>
> Juan.
>