Antonio,

The statsmodels package has a robust linear model module that I have used before. You will have to transform your data to be linear first by taking the log of the y-axis.

http://statsmodels.sourceforge.net/stable/examples/notebooks/generated/robus...

Kevin Gullikson

On Wed, Mar 11, 2015 at 12:04 PM, Antonino Ingargiola <tritemio@gmail.com> wrote:
Hi to the list,
I'm seeking the advice of the scientific Python community to solve the following fitting problem. Suggestions on both the methodology and particular software packages are appreciated.
I often need to fit a sample containing a (dominant) exponentially-distributed sub-population. Mostly, the non-exponential samples (from an unknown distribution) lie close to the origin of the exponential distribution, so a simple approach I have used so far is to select all the samples above a threshold and fit the exponential "tail" with MLE.
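A minimal sketch of the threshold-plus-MLE approach described above, on synthetic data. The function name `fit_exp_tail`, the threshold value, and the contamination model (a uniform cluster near the origin) are all illustrative assumptions, not taken from the original message. It relies on the memoryless property of the exponential distribution: samples above a threshold, shifted back by that threshold, are again exponential with the same mean, so the MLE of the mean is just the average of the shifted tail.

```python
import random


def fit_exp_tail(samples, threshold):
    """MLE of the exponential mean (tau) from the samples above `threshold`.

    Hypothetical helper: shifts the tail by the threshold and averages it.
    """
    tail = [x - threshold for x in samples if x > threshold]
    if not tail:
        raise ValueError("no samples above threshold")
    return sum(tail) / len(tail), len(tail)


# Synthetic data (assumed for illustration): exponential with mean 2,
# plus a contaminating sub-population close to the origin.
random.seed(0)
tau_true = 2.0
sample = [random.expovariate(1.0 / tau_true) for _ in range(20000)]
sample += [random.uniform(0.0, 0.3) for _ in range(2000)]

naive_tau = sum(sample) / len(sample)               # biased by contamination
tau_hat, n_tail = fit_exp_tail(sample, threshold=0.5)
print("naive:", naive_tau, "tail MLE:", tau_hat, "tail size:", n_tail)
```

In this toy case the threshold sits safely above the contamination, which is exactly the choice that is hard to make on real data.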
The problem is that the choice of the threshold is somewhat arbitrary; moreover, there can be a small set of outliers on the extreme right side of the distribution that would bias the MLE fit.
To improve the accuracy, I'm thinking of using (and, if necessary, implementing) some kind of robust fitting procedure. For example, a scheme in which the outliers are identified by putting a threshold on the residuals, and this threshold is then optimized using some "goodness of fit" cost function. Is this approach reasonable?
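One possible reading of this idea, sketched under stated assumptions: instead of thresholding residuals, the sketch below scans a grid of lower cuts on the sample values, fits the tail by MLE at each cut, and keeps the cut that minimizes the Kolmogorov-Smirnov distance between the empirical tail and the fitted exponential. The choice of KS distance as the cost function, the threshold grid, the `min_tail` guard, and the names `ks_distance_exp` and `scan_thresholds` are all assumptions made for illustration; outliers on the far right are not handled here.

```python
import math
import random


def ks_distance_exp(shifted_sorted, tau):
    """KS distance between sorted, shifted tail samples and Exp(mean=tau)."""
    n = len(shifted_sorted)
    d = 0.0
    for i, x in enumerate(shifted_sorted):
        cdf = 1.0 - math.exp(-x / tau)
        # Compare the model CDF with the empirical CDF just after and before x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def scan_thresholds(samples, thresholds, min_tail=100):
    """Return (ks, threshold, tau) for the candidate with the smallest KS."""
    data = sorted(samples)
    best = None
    for t in thresholds:
        tail = [x - t for x in data if x > t]   # remains sorted
        if len(tail) < min_tail:
            continue                            # too few points to judge
        tau = sum(tail) / len(tail)             # exponential MLE on the tail
        ks = ks_distance_exp(tail, tau)
        if best is None or ks < best[0]:
            best = (ks, t, tau)
    return best


# Synthetic data (assumed): exponential with mean 2 plus a cluster near 0.
random.seed(1)
sample = [random.expovariate(0.5) for _ in range(20000)]
sample += [random.uniform(0.0, 0.3) for _ in range(2000)]

thresholds = [0.1 * k for k in range(16)]       # candidate cuts 0.0 .. 1.5
ks, best_t, tau_hat = scan_thresholds(sample, thresholds)
print("threshold:", best_t, "tau:", tau_hat, "KS:", ks)
```

The raw KS distance grows both when contamination is left in the tail and when the tail gets too small, so minimizing it trades bias against noise; other cost functions would behave differently.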
I am surely not the first to tackle this problem, so I would appreciate some suggestions and specific pointers to help me get started.
Thank you, Antonio