img_as_float

Josh Warner silvertrumpet999 at gmail.com
Thu Feb 18 19:10:07 EST 2016


Sometimes the input dtype needs to change, at least along the way. As just 
one example:

   - uint8 or uint16 inputs with a chain of calculations, including 
   transformations or exposure tweaks. In this instance, all intermediate 
   calculations should be carried out with full floating-point precision. If 
   forced back into their originating dtype at each step, the result would 
   have terrible compounded error. 

Returning to the original dtype at the end would be reasonable, but you 
only want to do this once. Because of our functional approach (vs. VTK's 
pipelining or similar), there is no way for us to know which step is the 
final one. So, if desired, the user needs to handle the conversion back 
themselves, because such functions will always return the higher-precision 
result.
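The convert-once pattern can be sketched with plain numpy (scikit-image's 
img_as_float / img_as_ubyte wrap roughly this kind of scaling; the 
intermediate steps below are made up for illustration):

```python
import numpy as np

# Hypothetical pipeline: convert once, compute in float, convert back once.
img = np.array([[0, 128, 255]], dtype=np.uint8)

work = img / 255.0                   # uint8 -> float64 on [0, 1], done once
work = work ** 0.5                   # intermediate step 1, full precision
work = np.clip(work * 0.9, 0, 1)     # intermediate step 2, full precision

out = (work * 255).round().astype(np.uint8)  # back to uint8, done once
```

Converting back after every step instead would quantize each intermediate 
result to 256 levels, which is where the compounded error comes from.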

We always return a new object, unless the function explicitly operates on 
the input. When this is possible it is enabled by a standard `out=None` 
kwarg like in numpy/scipy.
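A minimal sketch of that convention, using a hypothetical `scale` function 
(the name and operation are made up; only the `out=None` pattern is the 
point):

```python
import numpy as np

def scale(image, factor, out=None):
    """Illustrates the numpy/scipy-style ``out=`` convention."""
    if out is None:
        out = np.empty_like(image)   # default: allocate and return a new array
    np.multiply(image, factor, out=out)
    return out

img = np.ones((2, 2), dtype=np.float64)
res = scale(img, 0.5)          # returns a new object
scale(img, 0.5, out=img)       # explicitly operates on the input in place
```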

One of the biggest things the "float images are on range [0, 1]" convention 
saves us from is worrying about overflow. At all. We just do calculations; 
it doesn't matter if the input image gets squared a few times along the way. 
Try a few simple numpy operations on a uint8 array and see how quickly the 
results aren't what you expect. Now, we can relax this and still be mostly 
OK because float64 has a large range, but concerns like this are a huge 
potential maintenance headache. I think what Stefan means by "full 
potential range" is that you have to plan calculations in advance, 
examining every intermediate step for its maximum potential range against 
your dtype.
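Here is what those uint8 surprises look like in practice:

```python
import numpy as np

a = np.array([200], dtype=np.uint8)

squared = a * a                # wraps modulo 256: 200 * 200 = 40000 -> 40000 % 256
print(squared[0])              # 64, not 40000

f = a / 255.0                  # on [0, 1], squaring can only shrink the value
print((f * f)[0])              # ~0.615, no wraparound possible
```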

Certain exposure calculations are explicitly defined with normalized images 
on the range [0, 1], because they heavily use exponential functions. An 
input with a greater range must be handled carefully by any such function. 
This is the greatest danger in simply removing the normalization step from 
the package, IMO. A lot of things will break, and depending on the 
algorithm the fix may vary.
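Gamma adjustment is a simple example of an exposure operation defined on 
normalized input (out = in ** gamma). The arrays below are made up, but 
they show why the [0, 1] assumption matters:

```python
import numpy as np

gamma = 2.2
img = np.linspace(0, 1, 5)

adjusted = img ** gamma        # [0, 1] maps onto itself for any positive gamma

# Feed raw uint8-range values in instead and the output explodes:
raw = img * 255.0
bad = raw ** gamma             # max value 255 ** 2.2, roughly 2e5
```

Remove the normalization and every such function needs its own fix, which 
is the maintenance problem described above.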

Perhaps that helps pull back the curtain a little...

Josh

On Thursday, February 18, 2016 at 4:00:04 PM UTC-7, Michael Aye wrote:
>
>> I am not opposed to supporting range preservation, but we do have to 
>> think a bit about the implications: 
>>
>> - what do you do when the data-type of values change? 
>
> What are the situations where they *have* to change? 
>
>> - what do you do when your operation due to, e.g., rounding issues 
>> push values outside the input range? 
>
> Return a new object instead of changing the original, maybe?
>
>> - what do you do when you need to know the full potential range of the 
>> data? 
>
> I don't understand, do you mean the full potential range per data-type? 
> Isn't that defined by the data-type the input image has?
>
>> The ``preserve_range`` flag has allowed us to do whatever we do 
>> normally, unless the user gave explicit permission to change data 
>> types, ranges, etc.  It also serves as a nice tag for "I, the 
>> developer, thought about this issue". 
>
> And that's quite cool that that's offered, but the question is, I guess, 
> which default is best and why? 
> Which default setting would confuse the least new (and old) users?
>
> Michael
>
