
Replying to Skipper's message to get the Statsmodels folks...
On Wed, Aug 8, 2012 at 2:38 PM, Constantine Evans <cevans@evanslabs.org> wrote:
Hello everyone,
On Wed, Aug 8, 2012 at 11:57 AM, Skipper Seabold <jsseabold@gmail.com> wrote:
Hi,
A few years ago I implemented a scikit for bootstrap confidence limits (https://github.com/cgevans/scikits-bootstrap). I didn’t think much about it after that until recently, when I realized that some people are actually using it, and that there’s apparently been some talk about implementing this functionality in either scipy.stats or statsmodels (I should thank Randal Olson for discussing this and bringing it to my attention).
As such I’ve rewritten most of the code, and written up some docstrings. The current code can do confidence intervals with basic percentile interval, bias-corrected accelerated, and approximate bootstrap confidence methods, and can also provide bootstrap and jackknife indexes. Most of it is implemented from the descriptions in Efron and Tibshirani’s An Introduction to the Bootstrap, but the ABC code at the moment is a port from the modified-BSD-licensed bootstrap package for R (not the boot package), as I’m not entirely confident in my understanding of the method.
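To make the terms above concrete, here is a minimal sketch of bootstrap indexes, jackknife indexes, and the basic percentile interval. It is not the scikits-bootstrap code: the function names, the default arguments, and the simple quantile convention are all assumptions chosen for illustration, and it uses only the standard library so that it stands alone.

```python
import random
from statistics import mean


def bootstrap_indexes(n, n_samples, rng):
    """Yield n_samples index lists, each drawn with replacement from range(n)."""
    for _ in range(n_samples):
        yield [rng.randrange(n) for _ in range(n)]


def jackknife_indexes(n):
    """Yield n index lists, each leaving out exactly one observation."""
    for i in range(n):
        yield [j for j in range(n) if j != i]


def percentile_ci(data, statfunction=mean, alpha=0.05, n_samples=2000, seed=0):
    """Basic percentile interval: empirical quantiles of the bootstrap replicates.

    The seed default and the quantile convention below are illustrative
    assumptions, not taken from any particular package.
    """
    rng = random.Random(seed)
    n = len(data)
    # Evaluate the statistic on each resampled data set, then sort the results.
    replicates = sorted(
        statfunction([data[i] for i in idx])
        for idx in bootstrap_indexes(n, n_samples, rng)
    )
    lower = replicates[int((alpha / 2) * n_samples)]
    upper = replicates[int((1 - alpha / 2) * n_samples) - 1]
    return lower, upper
```

Called as `percentile_ci(sample)`, this returns a `(lower, upper)` pair for the mean of `sample`; any other statistic that accepts a list can be passed as `statfunction`.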
I can't comment on the ABC method, but your BCA method appears to be consistent with my own implementation.
And so, I have a few questions for everyone:
* Is there any interest in including this sort of code in either scipy.stats or statsmodels? If so, where do people think would be the better place? The code is relatively small; at the moment it is less than 200 lines, with docstrings probably making up 100 of those lines.
I think it would be great to have this in statsmodels. I filed an enhancement ticket about it this morning (also brought to my attention by Randy's blog post).
As a user, I would also love to see this in statsmodels.
* Also, if so, what would need to be changed, added, and improved beyond what is mentioned in the Contributing to Scipy part of the reference guide? I’m never a fan of my own code, and imagine quite a bit would need to be fixed; I know tests will need to be added too.
I can only speak to the BCA method, but I propose the following when you compute the acceleration: https://gist.github.com/3307341
Everyone's data is different, and probably 99.99% of the time SCD won't turn out to be 0 and raise a ZeroDivisionError, but it happened to me and that's how I fixed it. Just a thought.
Cheers,
-paul
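For reference, the kind of guard described in this message can be sketched as follows. The gist itself is not reproduced here, so everything in this block is an assumption: the function name is hypothetical, the acceleration is computed from jackknife values with the standard textbook formula, and returning 0.0 when the denominator vanishes is one possible choice of fallback.

```python
from statistics import mean


def jackknife_acceleration(data, statfunction=mean):
    """Estimate the BCa acceleration from leave-one-out (jackknife) values.

    Hypothetical helper: when every jackknife value is identical the
    denominator is zero, and plain float division would raise
    ZeroDivisionError.
    """
    data = list(data)
    n = len(data)
    # Statistic evaluated with each observation left out in turn.
    jack = [statfunction(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = mean(jack)
    diffs = [jack_mean - value for value in jack]
    numerator = sum(d ** 3 for d in diffs)
    denominator = 6.0 * sum(d ** 2 for d in diffs) ** 1.5
    if denominator == 0.0:
        # Assumption: treat the degenerate case as zero acceleration.
        return 0.0
    return numerator / denominator
```

With zero acceleration the BCa interval reduces to the bias-corrected interval, which is why 0.0 is a plausible fallback; raising a more descriptive error would be an equally defensible design.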