[SciPy-User] Python significance / error interval / confidence interval module?
Christoph Deil
Deil.Christoph at googlemail.com
Mon Jun 20 19:16:02 EDT 2011
On Jun 17, 2011, at 8:12 PM, josef.pktd at gmail.com wrote:
> On Fri, Jun 17, 2011 at 1:08 PM, Bruce Southey <bsouthey at gmail.com> wrote:
>> On 06/17/2011 11:21 AM, josef.pktd at gmail.com wrote:
>>> On Fri, Jun 17, 2011 at 11:12 AM, Gael Varoquaux
>>> <gael.varoquaux at normalesup.org> wrote:
>>>> On Fri, Jun 17, 2011 at 05:08:16PM +0200, Christoph Deil wrote:
>>>>> I am looking for a python module for significance / error interval /
>>>>> confidence interval computation.
>>>> How about http://pypi.python.org/pypi/uncertainties/
>>>>
>>>>> Specifically I am looking for Poisson rate estimates in the presence of
>>>>> uncertain background and / or efficiency, e.g. for an "on/off
>>>>> measurement".
>>>> Wow, that seems a bit more involved than Gaussian error statistics. I am
>>>> not sure that the above package will solve your problem.
>>>>
>>>>> The standard method of Rolke I am mainly interested in is available in
>>>>> ROOT and RooStats, a C++ high energy physics data analysis package:
>>>> If you really need proper Poisson-rate errors, then you might indeed need
>>>> to translate the Rolke method to Python. How about contributing it to
>>>> uncertainties?
Gael, the uncertainties package ( http://packages.python.org/uncertainties/ ) is only for error propagation,
not error computation, so I don't think methods for Poisson-rate error computation would fit there.
By the way: everyone doing data analysis needs to propagate errors sometimes.
In my opinion uncertainties is so useful that its functionality should be included in scipy.
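To illustrate the kind of first-order propagation the uncertainties package automates, here is a minimal hand-rolled sketch (not the package's API; the function name is my own) for the product of two independent Gaussian quantities:

```python
import math

def propagate_product(x, sx, y, sy):
    # First-order (linear) Gaussian error propagation for f = x * y:
    #   sigma_f**2 = (df/dx * sx)**2 + (df/dy * sy)**2
    f = x * y
    sf = math.sqrt((y * sx) ** 2 + (x * sy) ** 2)
    return f, sf

f, sf = propagate_product(10.0, 0.5, 2.0, 0.1)
print(f, sf)  # 20.0, sqrt(2) ~ 1.414
```

The uncertainties package does the same thing automatically for arbitrary expressions by tracking the partial derivatives for you.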
>>> It's a very specific model, and I doubt it's covered by any general
>>> packages, but implementing
>>> http://lanl.arxiv.org/abs/physics/0403059
>>> assuming this is the background for it, doesn't sound too difficult.
>>>
>>> Most of the work, it looks like, is keeping track of all the different
>>> models and parameterizations.
>>> scipy.stats.distributions and scipy.optimize (fmin, fsolve) will cover
>>> much of the calculations.
>>>
>>> (But then of course there is testing and taking care of corner cases
>>> which takes at least several times as long as the initial
>>> implementation, in my experience.)
>>>
>>> Josef
>>>>
>> Actually I am more interested in how this differs from a generalized
>> linear model, where modeling a Poisson or negative binomial
>> distribution is feasible.
>> Bruce
>
> That was my first guess, but the model in the paper is quite different:
> the assumption there is that two variables x, y are observed, each with
> its own independent distribution, but sharing some parameters
>
> X ∼ Pois(μ + b), Y ∼ Pois(b)
>
> or variations on this like
> X ∼ Pois(eμ + b), Y ∼ N(b, sigma_b), Z ∼ N(e, sigma_e)
>
> From a quick skim of the paper, the rest is mostly profile likelihood:
> confidence intervals on mu are obtained by getting rid of the nuisance
> parameter.
>
> Josef
Josef, thanks a lot for your helpful comments!
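For reference, the profile-likelihood construction Josef describes can be sketched in pure Python for the simplest on/off model, X ∼ Pois(μ + b), Y ∼ Pois(b). This is only an illustration under my own naming and tolerance choices; the full Rolke method covers efficiency terms, Gaussian constraints, and many corner cases that this sketch does not:

```python
import math

def loglike(mu, b, x, y):
    # Poisson log-likelihood for X ~ Pois(mu + b), Y ~ Pois(b),
    # with the constant (factorial) terms dropped.
    return x * math.log(mu + b) - (mu + b) + y * math.log(b) - b

def profile_b(mu, x, y):
    # For fixed mu, the likelihood is maximized in b by the positive
    # root of 2*b**2 + (2*mu - x - y)*b - y*mu = 0.
    c = 2.0 * mu - x - y
    return (-c + math.sqrt(c * c + 8.0 * y * mu)) / 4.0

def profile_ratio(mu, x, y):
    # Likelihood-ratio statistic lambda(mu); by Wilks' theorem it is
    # approximately chi2(1)-distributed, so lambda = 1 marks ~68% limits.
    mu_hat, b_hat = max(x - y, 1e-9), max(y, 1e-9)
    return 2.0 * (loglike(mu_hat, b_hat, x, y)
                  - loglike(mu, profile_b(mu, x, y), x, y))

def _bisect(f, a, b, n=200):
    # Plain bisection; assumes f changes sign on [a, b].
    fa = f(a)
    for _ in range(n):
        m = 0.5 * (a + b)
        fm = f(m)
        if fa * fm <= 0.0:
            b = m
        else:
            a, fa = m, fm
    return 0.5 * (a + b)

def confidence_interval(x, y, crit=1.0):
    # Endpoints where the profile likelihood ratio crosses `crit`
    # (crit = 1.0 gives an approximate 68% interval).
    mu_hat = max(x - y, 1e-9)
    f = lambda mu: profile_ratio(mu, x, y) - crit
    hi_guess = mu_hat + 10.0 * math.sqrt(x + y) + 10.0
    return _bisect(f, 1e-9, mu_hat), _bisect(f, mu_hat, hi_guess)

lo, hi = confidence_interval(x=100, y=50)  # mu_hat = 50, sigma ~ sqrt(150)
```

In scipy the bisection would be scipy.optimize.brentq and the closed-form profile could be replaced by a numerical fmin, which is what makes the general case tractable even when no closed form exists.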