[SciPy-User] Fwd: one-sided gauss fit -- or: how to estimate backgound noise ?

Tue Jul 20 15:47:27 EDT 2010

On Tue, Jul 20, 2010 at 9:33 PM, Zachary Pincus <zachary.pincus at yale.edu> wrote:
>> Zach,
>> thanks for your reply. The idea is to calculate the mean/std of the
>> *background noise* of the underlying (2d or 3d) image data based on
>> the image's 1d image intensity histogram.
>> Regarding the "tail", the problem is that in general the signal
>> intensities are not well separated from the background. Thus, the
>> right half of the background's (Gaussian) noise distribution may
>> already be significantly miss-shaped - whereas the left side, i.e. all
>> values below the mean background level, should be nicely Bell-shape
>> distributed. (This idea comes also from looking at many hundreds of
>> image histogram [my wx/OpenGL based image viewer always displays the
>> image intensity histogram right below the image])
>
> Right... if you're in the regime that I'm in where there are orders of
> magnitude more background pixels than foreground ones, then a robust
> estimator of mean/std (e.g. one that's immune to "outliers" aka
> foreground pixel intensities) has, in my experience, worked very well
> for this precise task: estimating the mean/std of the majority of the
> pixels (background) in the presence of possibly many outlier pixels
> (foreground). I seem to recall that the MCD estimator can work in the
> presence of ~30% outliers...
>
> Zach

I'm hoping that it even works for biological image data with much less
background - as long as the signal is always above the mean background
....
I'll do some tests and report back.

-Sebastian

>
>
>>> From your answer I just got the idea of changing the error function
>>> in
>> such a way to only sum up the datapoint-model-errors (IOW, build the
>> RMS) over intensities smaller than the models mean value.  So, the
>> resulting question becomes if the error function has the Gaussian-mean
>> value of the "current" fitting step available to it !? I can probably
>> find that out.
>>
>>
>> Thanks again,
>> Sebastian
>>
>>
>>
>>
>>> Though, now that I think on it, I seem to recall that the EM
>>> algorithm
>>> was originally deployed to estimate parameters of gaussians with
>>> censored tails. (I think the problem was: how tall was the average
>>> Frenchman, given distribution of heights of French soldiers and the
>>> knowledge that there was a height minimum for army service?) I think
>>> you just estimate the mean/std from the censored data, then fill in
>>> the censored tails with samples from the fit distribution, and then
>>> re-
>>> estimate the mean/std from the new data, etc. I forget exactly how
>>> one
>>> does this (does it work on the histogram, or the underlying data,
>>> e.g.) but that's the general idea.
>>>
>>> Zach
>>>
>>>
>>>> Thanks,
>>>> Sebastian.
>>>>
>>>>
>>>> On Fri, Jul 16, 2010 at 1:04 AM, Christoph Deil
>>>> <Deil.Christoph at googlemail.com> wrote:
>>>>> Hi Sebastian,
>>>>>
>>>>> in astronomy a method called kappa-sigma-clipping is sometimes used
>>>>> to estimate the background level by clipping away most of the
>>>>> signal:
>>>>> http://idlastro.gsfc.nasa.gov/ftp/pro/math/meanclip.pro
>>>>> I am not aware of a python implementation, but it's just a few
>>>>> lines of code.
>>>>>
>>>>> If you can identify the background level approximately by eye,
>>>>> e.g. by plotting a histogram of your data, you should be able to
>>>>> just fit the tail of the Gaussian that only contains background.
>>>>>
>>>>> Here is my attempt at doing such a fit using
>>>>> scipy.stats.rv_continous.fit(),
>>>>> similar to but not exactly what you want:
>>>>>
>>>>> from scipy.stats import norm, halfnorm, uniform
>>>>> signal = - uniform.rvs(0, 3, size=10000)
>>>>> background = norm.rvs(size=10000)
>>>>> data = hstack((signal, background))
>>>>> hist(data, bins=30)
>>>>> selection = data[data>0]
>>>>> halfnorm.fit(selection)
>>>>> x = linspace(-3, 3, 100)
>>>>> y = selection.sum() * halfnorm.pdf(x)/3
>>>>> plot(x,y)
>>>>>
>>>>> Good luck!
>>>>> Christoph
>>>>>
>>>>> On Jul 15, 2010, at 10:39 PM, Sebastian Haase wrote:
>>>>>
>>>>>> Hi,
>>>>>> In image analysis one is often faced with (often unknown)
>>>>>> background
>>>>>> levels (offset) + (Gaussian) background noise.
>>>>>> The overall intensity histogram of the image is in fact often
>>>>>> Gaussian
>>>>>> (Bell shaped), but depending on how many (foreground) objects are
>>>>>> present the histogram shows a positive tail of some sort.
>>>>>>
>>>>>> So, I just got the idea if there was a function (i.e. mathematical
>>>>>> algorithm) that would allow to fit only the left half of a
>>>>>> Gaussian
>>>>>> bell curve to data points !?
>>>>>> This would have to be done in a way that the center, the variance
>>>>>> (or
>>>>>> sigma) and the peak height are free fitting parameters.
>>>>>>
>>>>>> Any help or ideas are appreciated,
>>>>>> thanks
>>>>>> Sebastian Haase
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>