[scikit-learn] Confidence and Prediction Intervals of Support Vector Regression

Wed Mar 1 17:39:52 EST 2017

Thanks a lot, Sebastian! Very nicely written.

I have a few follow-up questions:
1. Just to make sure I understand correctly: with the .632+ bootstrap
method, are ACC_lower and ACC_upper the lower and upper percentiles of
the ACC_h,i distribution?
2. For regression algorithms, is there a recommended equation for the
no-information rate gamma?
3. I need to plot the confidence interval and the prediction interval for my
Support Vector Regression predictions (to clarify what I mean by these
intervals, please see the linear-model analogy on slide 14:
http://www2.stat.duke.edu/~tjl13/s101/slides/unit6lec3H.pdf). Can I derive
these intervals from the .632+ bootstrap method, or is there a different way
of getting them?
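
For reference, here is a minimal sketch of the percentile-bootstrap steps a-f
quoted further down in this message, applied to SVR. It only covers the
confidence interval of the prediction, not the prediction interval. The
function name bootstrap_confidence_band, the toy sine data, and the
hyperparameter values (C=10.0, gamma=0.5) are assumptions made up for
illustration; in practice the hyperparameters would come from the grid search.

```python
import numpy as np
from sklearn.svm import SVR


def bootstrap_confidence_band(X, y, X_new, n_boot=100, alpha=0.2, seed=0,
                              **svr_params):
    """Percentile bootstrap band for SVR predictions at X_new.

    Hypothetical helper, not part of scikit-learn. alpha=0.2 gives the
    10th and 90th percentiles, i.e. a 0.8 confidence interval.
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    preds = np.empty((n_boot, len(X_new)))
    for b in range(n_boot):                      # step d: repeat n_boot times
        idx = rng.integers(0, n, size=n)         # step a: sample n rows with replacement
        model = SVR(**svr_params).fit(X[idx], y[idx])  # step b: refit with fixed hyperparameters
        preds[b] = model.predict(X_new)          # step c: predict y* for every X* at once
    # steps e-f: percentiles of the bootstrap predictions, one pair per X*
    lower = np.percentile(preds, 100 * alpha / 2, axis=0)
    upper = np.percentile(preds, 100 * (1 - alpha / 2), axis=0)
    return lower, upper


# Toy data (assumed for illustration only): a noisy sine curve.
data_rng = np.random.default_rng(42)
X = np.sort(data_rng.uniform(0, 5, size=(80, 1)), axis=0)
y = np.sin(X).ravel() + data_rng.normal(scale=0.2, size=80)
X_new = np.linspace(0, 5, 25).reshape(-1, 1)

lower, upper = bootstrap_confidence_band(X, y, X_new, C=10.0, gamma=0.5)
```

Predicting on the whole grid X_new inside each bootstrap round replaces the
outer loop of step f, so each model is fitted only once per round. The two
returned arrays can be passed directly to a plotting call such as
matplotlib's fill_between to draw the band around the fitted curve.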

Thank you!
Raga

On Wed, Mar 1, 2017 at 3:13 PM, Sebastian Raschka <se.raschka at gmail.com>
wrote:

> Hi, Raga,
> I have a short section on this here, if it helps:
> https://sebastianraschka.com/blog/2016/model-evaluation-selection-part2.html#the-bootstrap-method-and-empirical-confidence-intervals
>
> Best,
> Sebastian
>
> > On Mar 1, 2017, at 3:07 PM, Raga Markely <raga.markely at gmail.com> wrote:
> >
> > Hi everyone,
> >
> > I wonder if you could provide me with some suggestions on how to
> determine the confidence and prediction intervals of SVR? If you have
> suggestions for any machine learning algorithms in general, that would be
> fine too (doesn't have to be specific for SVR).
> >
> > So far, I have found:
> > 1. Bootstrap: http://stats.stackexchange.com/questions/183230/bootstrapping-confidence-interval-from-a-regression-prediction
> > 2. http://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0048723&type=printable
> > 3. ftp://ftp.esat.kuleuven.ac.be/sista/suykens/reports/10_156_v0.pdf
> >
> > But I don't understand the details in #2 and #3 well enough to write
> step-by-step code. If I use the bootstrap method, can I get the
> confidence interval as follows?
> > a. Draw bootstrap sample of size n
> > b. Fit the SVR model (with hyperparameters chosen during model selection
> with grid search cv) to this bootstrap sample
> > c. Use this model to predict the output variable y* from input variable
> X*
> > d. Repeat steps a-c, for instance, 100 times
> > e. Order the 100 values of y* and take, for instance, the 10th and
> 90th percentiles (if we are looking for a 0.8 confidence interval)
> > f. Repeat a-e for different values of X* to plot the prediction with
> its confidence interval
> >
> > However, I don't know how to get the prediction interval from here.
> >
> > Thank you very much,
> > Raga
> > _______________________________________________
> > scikit-learn mailing list
> > scikit-learn at python.org
> > https://mail.python.org/mailman/listinfo/scikit-learn