[SciPy-User] one-sided gauss fit -- or: how to estimate backgound noise ?
Zachary Pincus
zachary.pincus at yale.edu
Thu Jul 15 18:13:28 EDT 2010
Hi Sebastian,
> In image analysis one is often faced with (often unknown) background
> levels (offset) + (Gaussian) background noise.
> The overall intensity histogram of the image is in fact often Gaussian
> (Bell shaped), but depending on how many (foreground) objects are
> present the histogram shows a positive tail of some sort.
>
> So, I just got the idea if there was a function (i.e. mathematical
> algorithm) that would allow to fit only the left half of a Gaussian
> bell curve to data points !?
> This would have to be done in a way that the center, the variance (or
> sigma) and the peak height are free fitting parameters.
For this task, I usually use some form of robust estimator for the
mean and std, which is designed to ignore noise in the tails.
Below I've pasted code that I use for an "minimum covariance
determinant" estimate, which is translated from some matlab code I
found online.
For large images, it's slow and you'll probably want to randomly
sample pixels to feed to the MCD estimator instead of using the entire
image. And there are probably many simpler, faster robust estimators
(like cutting off the tails, etc.) that are out there.
Zach
import numpy
import scipy.stats as stats
def unimcd(y,h):
"""unimcd(y, h) -> subset_mask
unimcd computes the MCD estimator of a univariate data set. This
estimator is
given by the subset of h observations with smallest variance. The
MCD location
estimate is then the mean of those h points, and the MCD scale
estimate is
their standard deviation.
A boolean mask is returned indicating which elements of the input
array are
in the MCD subset.
The MCD method was introduced in:
Rousseeuw, P.J. (1984), "Least Median of Squares Regression,"
Journal of the American Statistical Association, Vol. 79, pp.
871-881.
The algorithm to compute the univariate MCD is described in
Rousseeuw, P.J., Leroy, A., (1988), "Robust Regression and Outlier
Detection," John Wiley, New York.
This function based on UNIMCD from LIBRA: the Matlab Library for
Robust
Analysis, available at: http://wis.kuleuven.be/stat/robust.html
"""
y = numpy.asarray(y, dtype=float)
ncas = len(y)
length = ncas-h+1
if length <= 1:
return numpy.ones(len(y), dtype=bool)
indices = y.argsort()
y = y[indices]
ind = numpy.arange(length-1)
ay = numpy.empty(length)
ay[0] = y[0:h].sum()
ay[1:] = y[ind+h] - y[ind]
ay = numpy.add.accumulate(ay)
ay2=ay**2/h
sq = numpy.empty(length)
sq[0] = (y[0:h]**2).sum() - ay2[0]
sq[1:] = y[ind+h]**2 - y[ind]**2 + ay2[ind] - ay2[ind+1]
sq = numpy.add.accumulate(sq)
sqmin=sq.min()
ii = numpy.where(sq==sqmin)[0]
Hopt = indices[ii[0]:ii[0]+h]
ndup = len(ii)
slutn = ay[ii]
initmean=slutn[numpy.floor((ndup+1)/2 - 1)]/h
initcov=sqmin/(h-1)
# calculating consistency factor
res=(y-initmean)**2/initcov
sortres=numpy.sort(res)
factor=sortres[h-1]/stats.chi2.ppf(float(h)/ncas,1)
initcov=factor*initcov
res=(y-initmean)**2/initcov #raw_robdist^2
quantile=stats.chi2.ppf(0.975,1)
weights=res<quantile
weights=weights[indices.argsort()] #rew_weights
return weights
More information about the SciPy-User
mailing list