Image class

Johannes Schönberger jsch at demuc.de
Tue Sep 2 22:56:03 EDT 2014


I have not chimed in yet, but I can recall a discussion about the same topic a while ago. As back then, I agree with Josh, who has given some very good reasons. In addition, I would like to question the usefulness of the tightly coupled metadata in general.

> Alternatively, we might want to have a unified “metadata” dictionary format that would unwrap nicely into our preferred kwarg format, e.g. if we did `slic(image, **metadata)` it would just work.

Honestly, I do not understand. This is already possible now?

Otherwise, I agree that it is more convenient to write `slic(image)`, with `image.metadata` automatically mapped to kwargs, than `slic(image, spacing_x=metadata["spacing_x"],…)`. However, this requires the metadata to be standardized and homogeneous. And, at least in my experience, metadata is the one thing that is guaranteed to cause trouble, because it is not consistent. So, what happens when `spacing_x` is not called `spacing_x` but `spacingx`? Either the user notices, or the “magic” that maps the metadata to the `slic` parameters has to warn the user, silently ignore the key, fail, etc. All this sounds complicated and opaque to me.
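To make the `spacing_x` vs. `spacingx` problem concrete, here is a minimal sketch. `slic_standin` is a hypothetical placeholder that only borrows the parameter names used in this thread; it is not the real skimage function or its signature. With plain keyword unpacking, Python itself rejects the misspelled key, so the mismatch is at least loud rather than silent:

```python
def slic_standin(image, spacing_x=1.0, spacing_y=1.0):
    """Hypothetical stand-in for a function taking spacing kwargs.

    It only echoes its arguments; it performs no segmentation.
    """
    return {"n_pixels": len(image), "spacing": (spacing_x, spacing_y)}


image = [0, 1, 2, 3]  # placeholder data, a real image would be an ndarray

# Keys match the parameter names: unpacking just works.
consistent = {"spacing_x": 0.5, "spacing_y": 0.5}
result = slic_standin(image, **consistent)

# One key is spelled differently: the call fails with a TypeError.
inconsistent = {"spacingx": 0.5, "spacing_y": 0.5}
try:
    slic_standin(image, **inconsistent)
    error = None
except TypeError as exc:
    error = str(exc)
```

Any automatic mapping layered on top of this would have to decide for itself whether to warn, ignore, or fail in the second case.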

I cannot think of other use cases where attaching the metadata directly to the image, instead of keeping a separate dict, brings any benefit. On the contrary, skimage is typically used in conjunction with other libraries, which do not know about our wrapper class. So, whenever you call a function from another library, you lose the metadata.

To make it short, I am a strong proponent of making the specification of function parameters explicit, i.e.

	`slic(image, spacing_x=metadata["spacing_x"], spacing_y=metadata["spacingy"])`

or

	`slic(image, **metadata)`

which is already possible with all skimage functions.
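As a rough sketch of the two explicit forms, assuming a hypothetical `slic_standin` function and a made-up metadata dict with one inconsistently named key (`spacingy`) and one unrelated entry (`acquired`). None of these names come from skimage:

```python
def slic_standin(image, spacing_x=1.0, spacing_y=1.0):
    """Hypothetical stand-in; it only echoes its arguments."""
    return (len(image), spacing_x, spacing_y)


image = [0, 1, 2, 3]  # placeholder data
metadata = {"spacing_x": 0.5, "spacingy": 2.0, "acquired": "2014-09-02"}

# Form 1: spell out each parameter, so odd key names are handled in plain sight.
explicit = slic_standin(
    image,
    spacing_x=metadata["spacing_x"],
    spacing_y=metadata["spacingy"],
)

# Form 2: build a kwargs dict with exactly the expected names, then unpack it.
# KEY_MAP (an illustrative name) goes from metadata key to parameter name.
KEY_MAP = {"spacing_x": "spacing_x", "spacingy": "spacing_y"}
kwargs = {param: metadata[key] for key, param in KEY_MAP.items() if key in metadata}
unpacked = slic_standin(image, **kwargs)
```

In both forms the mapping from metadata keys to parameters is visible at the call site, and the image itself stays a plain array.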

Johannes Schönberger

On Sep 2, 2014, at 10:16 PM, Josh Warner <silvertrumpet999 at gmail.com> wrote:

> I see this a little differently, looking at it from the viewpoint of a scientific user/new contributor.
> 
> One of the biggest reasons scikit-image has been able to grow so well, so rapidly, is because the bar for entry is low for new contributors. We operate as close to the data as possible. People using Python understand NumPy arrays. This is inherently familiar to those with backgrounds like Matlab, which is likely one of, if not the, largest population we serve. We want these people to be able to contribute without massive bars to entry.
> 
> If we used a custom wrapper class, even a light one, I for one would likely never have contributed to this package in the first place. Because, in order to understand what was even being fed in, I would have to dig up that wrapper, figure out what it actually contains, where my data is in it, think about how to react given different inputs, and so on. This is a tremendously daunting mental task. 
> 
> In short, moving to a new scheme would be, in my humble opinion, a disaster. Imagine having to explain the metadata structure of the hypothetical scikit-image construct to everyone at every SciPy Sprint, every time, before they could work with the codebase. Imagine users who want to know what the code is doing inspecting our source, seeing methods called which aren't standard NumPy methods, wondering what is going on, and having to dig through several other unclearly linked files in our repo to figure that out - if they even get that far. Many have tried to solve the image metadata/tagging problem with few real successes; either it ends up too extensible (no 1:1 mapping of tag to meaning, too much info) or too restrictive (whoops, no 3D or new modality). A perfect system gets devised at first, then edge cases become apparent, followed by workarounds, then someone comes up with a new system that includes everything to date, the API breaks to support it, repeat with new cases... if nothing else, this is a massive support headache.
> 
> From the big-picture view, we need to decide what we are, and what we want to become. To date, scikit-image is a powerful library of tools. We don't purport to hold users' hands through the process of implementation, and we never have. Sometimes the tools are a bit rough around the edges, and they aren't always as easy to use as they could be. But, in return, they end up exposing the maximum amount of functionality to end users. At the risk of a very stretched metaphor, the current state is a bit like Linux.
> 
> Moving to towing around loads of metadata just so we can have complex, custom, branching paths behind every algorithm depending on what happens to be fed in - that's hand holding time. That's the Apple/OSX method. Yes, our package may gain a broader userbase and certain features in examples, etc. might be more elegant. But real world uses which don't follow the examples will be impeded, and more importantly we will alienate future potential contributors. 
> 
> I, for one, do not fancy implementing or maintaining the latter model. I fear for the future of the package even if we did implement it cleanly. There are aspects of the package where limited use of a custom wrapper would likely be reasonable, such as certain more public facing areas e.g. the Viewer, but I see these as the exception rather than the rule.
> 
> My opinions on this issue are strong, but I recognize I'm only one voice. The most important thing we need to reach consensus on is the path forward, as noted above in bold. Up until now it's been rather clear, but wrapping NumPy ndarrays would start us down a very different road, one on which I fear we could easily lose our way.
> 
> Josh
> 
> On Tuesday, September 2, 2014 3:52:34 PM UTC+1, Juan Nunez-Iglesias wrote:
> Cool, looks like there's lots of discussion to be had about this, but all positive — I had expected more pushback, for some reason! =)
> 
> @stefanv:
> > Do we have some use cases for this?
> 
> Spacing and masks, as you mentioned.
> 
> Spatial resolutions, so we can deal with images acquired at different resolutions, e.g. registering light and electron microscopy images of the same sample. Also, as in Tim's talk, to allow drawing of scale bars on displayed images.
> 
> Image orientation and absolute coordinates, which help with registration and montaging.
> 
> Image acquisition date, which could be important e.g. to generate time series.
> 
> That's just off the top of my head, I'm sure there's more!
> 
> @stefanv @tacaswell @tonysyu
> > Please find out how, then we can fix up the io.Image class!
> 
> Tom's link to the solution in PIMS reminds me that it was indeed Dan who pointed out that they'd solved the scalar issue. 
> 
> @glyg
> > I would suggest relying on OME.XML
> 
> I think that might be a little too heavy for our uses, but I might be wrong. Either way, it's probably possible to have a compatibility layer between XML and whatever use cases we want to follow. (e.g. Automatically reading out "PhysicalSizeX/Y/Z" tags and converting those to a "spacing" tag.)
> 
> Juan.
> 



