Image data type ranges

Wed Oct 26 06:06:16 EDT 2011

Am 26.10.2011 10:51, schrieb Neil Yager:
> I was having a conversation about data types with Stéfan in the line
> comments of a PR, and I thought I should move it here so others can
> benefit from his explanations as well.
>
> Being new to the project, I didn't appreciate the intricacies of data
> typing.

After working with images for some time, this still annoys me.
And I don't know of a library having a good solution.
OpenCV for example is a mess. I think this is a very important
topic since it influences usability a lot.

>   For example, I was surprised to see that this raises a
> ValueError:
>
>>>> skimage.img_as_float(np.arange(9).reshape((3, 3)))
> The problem is that the default dtype of np.arange is int32, which
> isn't supported by skimage, so img_as_float doesn't know how to scale
> it to [0, 1]. Perhaps it is correct to fail, as it will force the user
> to consider the data type issue. However, it does seem like a
> reasonable/common thing to want to do.
>
We briefly discussed this issue on the list and Stefan thought
it would be good to make the user think about what they
want to achieve. I find this not completely satisfying but
I could not come up with a better solution.

By the way, I do not think that using np.arange(n) as an image is a
reasonable/common thing to do. What is the expected behavior?
By definition, the output can be in any range. If you fix any range,
either you'll get out of it for large n or you'll see nothing for small n.

Maybe the most reasonable thing would be to expect that
img_as_float(np.arange(n)) always returns something with
minimum 0 and maximum 1.
The only way to achieve that would be to determine the range
of an int image by taking the max each time you use it.
This of course would lead to unexpected behavior in other
places. So I'm not sure if it actually makes things better.

We'd have to be careful.
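
To make the concern concrete, here is a minimal sketch of per-image
min/max rescaling. The helper name minmax_to_float is hypothetical and
not part of skimage; it only illustrates the idea discussed above.

```python
import numpy as np


def minmax_to_float(image):
    """Hypothetical helper (not a skimage function): rescale an image so
    that its minimum maps to 0.0 and its maximum maps to 1.0."""
    image = np.asarray(image, dtype=np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # A constant image has no range to stretch; return all zeros.
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)


# Two integer images with very different intensities ...
dark = np.array([0, 1, 2])
bright = np.array([0, 100, 200])

# ... end up identical once each one is rescaled by its own range.
a = minmax_to_float(dark)
b = minmax_to_float(bright)
```

Both a and b come out as 0, 0.5, 1, so the information that one image
was much darker than the other is lost. That is the kind of unexpected
behavior meant above.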

> A related, but different, issue is the following:
>
>>>> x = np.arange(9, dtype=np.uint8).reshape((3, 3))
>>>> x
> array([[0, 1, 2],
>         [3, 4, 5],
>         [6, 7, 8]], dtype=uint8)
>>>> y =  skimage.img_as_ubyte(x.astype(np.float32))
> WARNING:dtype_converter:Possible precision loss, converting from
> float32 to uint8
>>>> y
> array([[  0, 255, 254],
>         [253, 252, 251],
>         [250, 249, 248]], dtype=uint8)
>
I think this is perfectly fine. You used "astype". That's evil!
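
The values in y are consistent with multiplying by 255 and keeping only
the low 8 bits. The sketch below reproduces that arithmetic with plain
integers. That the converter effectively does this is an assumption
made for illustration, not something confirmed from its code.

```python
import numpy as np

# The values that ended up in the float32 image after astype: 0.0 ... 8.0.
values = np.arange(9, dtype=np.int64)

# Assumed conversion: scale as if the input were in [0, 1], i.e. times 255.
scaled = values * 255

# Keeping only the low 8 bits is the same as taking the result modulo 256.
# Integer arithmetic is used here on purpose, since casting out-of-range
# floats directly to uint8 is not well defined across platforms.
wrapped = (scaled % 256).reshape((3, 3))
```

Since n * 255 = n * 256 - n, each value n wraps around to 256 - n (and 0
stays 0), which gives 0, 255, 254, ..., 248 as in the output quoted
above.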

> The problem here is that the input to img_as_ubyte violates skimage's
> assumption that floating point images have the range [0, 1], leading
> to an unexpected result (at least for a beginner). There is a warning,
> but that's for a different problem. Should img_as_ubyte, img_as_float,
> etc. check and enforce ranges? Or raise warnings? Any thoughts?
Maybe we can check whether the upper bound is satisfied. That
probably wouldn't hurt much if we convert anyway.
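
A rough sketch of what such an upper-bound check could look like. The
function check_float_range and its warning text are hypothetical; this
is not existing skimage behavior, just one way the idea might be tried.

```python
import warnings

import numpy as np


def check_float_range(image):
    """Hypothetical check (not part of skimage): warn when a floating
    point image has values above 1.0. Returns True if the bound holds."""
    image = np.asarray(image)
    is_float = np.issubdtype(image.dtype, np.floating)
    if is_float and image.size and image.max() > 1.0:
        warnings.warn("float image has values above 1.0; expected [0, 1]")
        return False
    return True


# A float image that respects the assumed range passes silently.
ok = check_float_range(np.linspace(0.0, 1.0, 9))

# The astype example from above (values 0.0 ... 8.0) triggers the warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    bad = check_float_range(np.arange(9, dtype=np.float32))
```

The check costs one pass over the data for the max, which is small
compared to the conversion itself.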

Also, we should stress in the (at the moment not really existing)
user guide, that users should NEVER EVER use "astype" on an image,
since that violates all our assumptions.

Cheers,
Andy