What about the EM algorithm? You could fit a mixture of an exponential and a [whatever] distribution, since you already seem to believe that is what the data are.

Isn't that almost what you have in mind already, just with soft thresholding?
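
A minimal sketch of the idea, under stated assumptions: the "[whatever]" component is taken here to be uniform on [0, c] with c known, purely for illustration (any density with a tractable M-step could be swapped in). The function name, starting values, and synthetic data are all made up for the example. The responsibilities computed in the E-step play the role of the soft threshold.

```python
import math
import random


def em_exp_plus_uniform(data, c, tau=1.0, frac_exp=0.5, n_iter=200):
    """EM for a mixture of Exp(mean=tau) and Uniform(0, c).

    `c` is assumed known (an illustrative assumption).
    Returns the estimated (tau, frac_exp).
    """
    n = len(data)
    for _ in range(n_iter):
        # E-step: probability that each sample belongs to the exponential.
        resp = []
        for x in data:
            p_exp = frac_exp * math.exp(-x / tau) / tau
            p_uni = (1.0 - frac_exp) / c if x <= c else 0.0
            resp.append(p_exp / (p_exp + p_uni))
        # M-step: responsibility-weighted MLE of tau and the mixing fraction.
        total = sum(resp)
        tau = sum(r * x for r, x in zip(resp, data)) / total
        frac_exp = total / n
    return tau, frac_exp


# Synthetic data: 80% exponential (mean 2.0), 20% uniform near the origin.
random.seed(0)
data = [random.expovariate(1.0 / 2.0) for _ in range(8000)]
data += [random.uniform(0.0, 0.5) for _ in range(2000)]

tau_hat, frac_hat = em_exp_plus_uniform(data, c=0.5)
print("tau = %.3f, exponential fraction = %.3f" % (tau_hat, frac_hat))
```

Samples beyond c get responsibility 1, so the tail still pins down tau, while samples near the origin contribute only in proportion to how exponential they look.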






Mark Daoust

On Wed, Mar 11, 2015 at 7:36 PM, Antonino Ingargiola <tritemio@gmail.com> wrote:
Hi Kevin,

If I apply the log transform to the sample to linearize the model, what is the correct way to weight the residuals? Without weighting, the residuals close to the tail will be amplified and will bias the fit.
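
One common choice, sketched below under the assumption that the fit is made to histogram counts n with Poisson noise: to first order Var(log n) is about 1/n, so each squared residual of log n is weighted by n and empty bins are dropped. The bin width, bin count, and helper name are arbitrary choices for the illustration, not a recommendation.

```python
import math
import random


def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


# Synthetic exponential sample with mean 2.0.
random.seed(1)
data = [random.expovariate(1.0 / 2.0) for _ in range(50000)]

# Histogram with fixed-width bins (values past the last bin are ignored).
bin_width, n_bins = 0.25, 60
counts = [0] * n_bins
for v in data:
    k = int(v / bin_width)
    if k < n_bins:
        counts[k] += 1

# Log-transform the counts; weight each bin by its count (~ 1 / Var(log n)).
centers, log_counts, weights = [], [], []
for k, n in enumerate(counts):
    if n > 0:
        centers.append((k + 0.5) * bin_width)
        log_counts.append(math.log(n))
        weights.append(float(n))

intercept, slope = weighted_line_fit(centers, log_counts, weights)
tau_hat = -1.0 / slope
print("tau from weighted log-linear fit: %.3f" % tau_hat)
```

With these weights the sparsely populated tail bins, whose log-counts are the noisiest, contribute very little to the fitted slope.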

Antonio

On Wed, Mar 11, 2015 at 11:08 AM, Kevin Gullikson <kevin.gullikson@gmail.com> wrote:
Antonio,

The statsmodels package has a robust linear model module that I have used before. You will have to transform your data to be linear first by taking the log of the y-axis.
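
To illustrate what a robust linear fit on log-transformed data does, here is a hand-rolled sketch using Huber weights and iteratively reweighted least squares in pure Python. It is a stand-in for the idea, not the statsmodels implementation; the tuning constant 1.345, the MAD-based scale, and the synthetic data are assumptions made for the example.

```python
import random
import statistics


def line_fit(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def huber_line_fit(x, y, c=1.345, n_iter=50):
    """Robust line fit: IRLS with Huber weights and a MAD scale estimate."""
    w = [1.0] * len(x)
    a, b = line_fit(x, y, w)
    for _ in range(n_iter):
        res = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        scale = statistics.median([abs(r) for r in res]) / 0.6745
        if scale == 0.0:
            break
        # Full weight for small residuals, down-weight the large ones.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in res]
        a, b = line_fit(x, y, w)
    return a, b


# Synthetic log-counts: straight line with slope -0.5 plus small noise,
# with a few inflated points on the right side acting as outliers.
random.seed(2)
x = [0.25 * i for i in range(40)]
y = [8.0 - 0.5 * xi + random.gauss(0.0, 0.05) for xi in x]
for i in (30, 33, 36, 39):
    y[i] += 2.0

_, slope_ols = line_fit(x, y, [1.0] * len(x))
_, slope_rob = huber_line_fit(x, y)
print("OLS slope: %.3f, robust slope: %.3f" % (slope_ols, slope_rob))
```

The ordinary least-squares slope is pulled toward the inflated points, while the reweighted fit stays close to the underlying slope; the decay constant follows as -1/slope.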



Kevin Gullikson

On Wed, Mar 11, 2015 at 12:04 PM, Antonino Ingargiola <tritemio@gmail.com> wrote:
Hi to the list,

I'm seeking the advice of the scientific Python community to solve the following fitting problem. Suggestions on both the methodology and particular software packages are appreciated.

I often need to fit a sample containing a (dominant) exponentially-distributed sub-population. The non-exponential samples (from an unknown distribution) mostly lie close to the origin of the exponential distribution, so the simple approach I have used so far is to select all the samples above a threshold and fit the exponential "tail" with MLE.
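
A minimal sketch of that threshold approach, assuming the samples above the threshold are purely exponential: by the memoryless property the excess over the threshold is again exponential with the same mean, so the MLE is just the average excess. The function name and the synthetic contamination are made up for the illustration.

```python
import random


def fit_exp_tail(samples, threshold):
    """MLE of the exponential mean using only samples above `threshold`.

    Returns (tau_hat, number_of_tail_samples).
    """
    excess = [x - threshold for x in samples if x > threshold]
    if not excess:
        raise ValueError("no samples above the threshold")
    return sum(excess) / len(excess), len(excess)


# Synthetic data: exponential with mean 2.0 plus contamination near the origin.
random.seed(0)
data = [random.expovariate(1.0 / 2.0) for _ in range(20000)]
data += [random.uniform(0.0, 0.5) for _ in range(2000)]

tau_hat, n_tail = fit_exp_tail(data, threshold=1.0)
print("tau = %.3f from %d tail samples" % (tau_hat, n_tail))
```

Raising the threshold removes more contamination but discards more samples, which is the trade-off behind the arbitrariness described next.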

The problem is that the choice of the threshold is somewhat arbitrary and, moreover, there can be a small set of outliers on the extreme right side of the distribution that would bias the MLE fit.

To improve the accuracy, I'm thinking of using (and, if necessary, implementing) some kind of robust fitting procedure. For example, a scheme in which the outliers are identified by putting a threshold on the residuals, and this threshold is then optimized using some "goodness of fit" cost function. Is this approach reasonable?

I am surely not the first to tackle this problem, so I would appreciate suggestions and specific pointers to help me get started.

Thank you,
Antonio

_______________________________________________
SciPy-User mailing list
SciPy-User@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user


