Hi to the list,

I'm seeking the advice of the scientific Python community on the following fitting problem. Suggestions on both the methodology and specific software packages are appreciated.

I often need to fit a sample containing a (dominant) exponentially-distributed sub-population. The non-exponential samples (from an unknown distribution) mostly lie close to the origin of the exponential distribution. The simple approach I have used so far is therefore to select all the samples above a threshold and fit the exponential "tail" with MLE.
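
For concreteness, a thresholded tail fit of this kind can be sketched roughly as follows. The function name `fit_exp_tail`, the synthetic data, and the threshold value are purely illustrative assumptions. The sketch relies on the memoryless property: for exponential samples above a threshold t, the excess x - t is again exponential with the same scale, so the MLE of the scale is the mean excess.

```python
import numpy as np

def fit_exp_tail(samples, threshold):
    """MLE of the exponential scale (tau = 1/rate) from samples above `threshold`."""
    samples = np.asarray(samples, dtype=float)
    tail = samples[samples > threshold]
    if tail.size == 0:
        raise ValueError("no samples above threshold")
    # Memoryless property: (x - threshold) is exponential with the same scale,
    # and the MLE of the scale is simply the sample mean of the excesses.
    return np.mean(tail - threshold)

# Synthetic example (assumed, for illustration only): an exponential
# population with scale 2.0 plus contamination close to the origin.
rng = np.random.default_rng(0)
data = np.concatenate([
    rng.exponential(scale=2.0, size=5000),
    rng.uniform(0.0, 0.3, size=500),
])
tau_hat = fit_exp_tail(data, threshold=0.5)
print("estimated scale:", tau_hat)
```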

The problem is that the choice of the threshold is somewhat arbitrary. Moreover, there can be a small set of outliers on the extreme right side of the distribution that would bias the MLE fit.

To improve the accuracy, I'm thinking of using (and, if necessary, implementing) some kind of robust fitting procedure. For example, a scheme in which the outliers are identified by putting a threshold on the residuals, and this threshold is then optimized using some "goodness of fit" cost function. Is this approach reasonable?
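
One possible reading of this idea, as a rough sketch only: scan a grid of candidate thresholds, fit the tail above each one, and keep the threshold that minimizes a goodness-of-fit cost. Here the cost is assumed to be the Kolmogorov-Smirnov distance computed with `scipy.stats.kstest`; the function name `scan_threshold`, the `min_tail` cutoff, and the synthetic data are hypothetical. The sketch only handles contamination near the origin, not right-side outliers, and because the scale is estimated from the same data, only the KS statistic is used (its p-value would not be valid).

```python
import numpy as np
from scipy import stats

def scan_threshold(samples, candidates, min_tail=50):
    """Return (threshold, scale, ks_distance) minimizing the KS distance.

    For each candidate threshold the excesses above it are fitted with the
    exponential MLE and compared to the fitted exponential CDF.
    """
    samples = np.asarray(samples, dtype=float)
    best = None
    for t in candidates:
        tail = samples[samples > t] - t
        if tail.size < min_tail:
            continue  # too few samples for a meaningful fit
        tau = tail.mean()  # exponential MLE of the scale
        ks = stats.kstest(tail, "expon", args=(0, tau)).statistic
        if best is None or ks < best[2]:
            best = (t, tau, ks)
    return best

# Synthetic example (assumed): exponential with scale 2.0 plus
# non-exponential samples concentrated near the origin.
rng = np.random.default_rng(1)
data = np.concatenate([
    rng.exponential(2.0, 5000),
    rng.uniform(0.0, 0.3, 1000),
])
t_best, tau_best, ks_best = scan_threshold(data, np.linspace(0.0, 2.0, 21))
print("threshold:", t_best, "scale:", tau_best, "KS distance:", ks_best)
```

Note that the KS distance of a clean exponential tail shrinks roughly as one over the square root of the tail size, so the raw minimum is noisy across clean thresholds; a practical variant might instead pick the smallest threshold whose distance falls below some tolerance.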

I am surely not the first to tackle this problem, so I would appreciate any suggestions and specific pointers to help me get started.

Thank you,
Antonio