<div dir="ltr"><div dir="ltr">hi Matt,<div><br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div dir="ltr"><div>I'd like to implement a forward stepwise regression algorithm using the efficient procedure described in the first problem <a href="http://stat.rutgers.edu/home/hxiao/stat588_2011/hw1.pdf" target="_blank">here</a>. It does not seem that such a model exists anywhere in Python. Would it be useful for me to write this model up for sklearn?</div></div></div></blockquote><div><br></div><div>to be considered I would first ask you to evaluate and discuss what you think it will bring</div><div>over existing estimators. Typically do you foresee a clear benefit compared to Lars or LassoLars ?</div><div><br></div><div>For more see <a href="https://scikit-learn.org/stable/faq.html#what-are-the-inclusion-criteria-for-new-algorithms">https://scikit-learn.org/stable/faq.html#what-are-the-inclusion-criteria-for-new-algorithms</a></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div dir="ltr"><div>If you're interested, here's a high-level view of how I think it would work:</div><div><br></div><div>- The model would have sklearn.linear_model.LinearRegression as its base class. </div><div>- The additional model parameters would include </div><div><ul><li style="margin-left:15px">An array of the indices (or column names) of the features in X1</li><li style="margin-left:15px">The Q and R matrices</li></ul></div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div dir="ltr"><div><div>- The additional methods would include</div><div><ul><li style="margin-left:15px">An add_features() method that adds a specified number of features to the model. Updates all model parameters</li><li style="margin-left:15px">A fit() method that requires a specification of the number of parameters to fit and optional sample weight. It calls the add_features method once on a model with no features.</li></ul></div></div></div></div></blockquote><div><br></div><div>the API of scikit-learn estimator is quite strict. See<br></div><div><br></div><div><a href="https://scikit-learn.org/stable/developers/develop.html?highlight=check_estimator">https://scikit-learn.org/stable/developers/develop.html?highlight=check_estimator</a><br></div><div><br></div><div>I invite you to read <a href="https://scikit-learn.org/stable/developers/contributing.html?highlight=contribut">https://scikit-learn.org/stable/developers/contributing.html?highlight=contribut</a></div><div>if you are willing to help the team.</div><div><br></div><div>Alex</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div dir="ltr"><div>I would do this for OLS first, but supposedly it could be adapted for regularized models as well. </div><div><br></div><div>How does this sound?</div><div><br></div><div>Thanks,</div><div><br></div><div>Matt S.</div></div>
</div>
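On the Lars comparison: a minimal sketch, assuming a reasonably recent scikit-learn, of how Lars already gives a forward-selection-style fit by capping the number of active coefficients.

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Lars

    # Toy problem with 5 truly informative features out of 30.
    X, y = make_regression(n_samples=200, n_features=30, n_informative=5,
                           noise=1.0, random_state=0)

    # Lars adds one variable at a time, in order of correlation with the
    # current residual, and stops here after 5 nonzero coefficients,
    # which is behaviour quite close to forward stepwise selection.
    model = Lars(n_nonzero_coefs=5).fit(X, y)

    print("selected feature indices:", model.active_)
    print("their coefficients:", model.coef_[model.active_])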
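And a rough, hypothetical sketch of the estimator described above. The class name ForwardStepwise, the add_features() method, and the n_features argument to fit() are illustrative only, not an existing scikit-learn API. For brevity it refactorizes the active columns with a fresh QR at every step; the efficient procedure from the linked problem would instead store Q and R as attributes and update them column by column.

    import numpy as np
    from sklearn.linear_model import LinearRegression


    class ForwardStepwise(LinearRegression):
        """Greedy forward selection for OLS (illustrative sketch only)."""

        def fit(self, X, y, n_features=1):
            X = np.asarray(X, dtype=float)
            y = np.asarray(y, dtype=float)
            # Center the data so the intercept can be recovered at the end,
            # similar to what LinearRegression does internally.
            self._X_mean = X.mean(axis=0)
            self._y_mean = y.mean()
            self._Xc = X - self._X_mean
            self._yc = y - self._y_mean
            self.active_ = []                  # indices of the selected columns (X1)
            self.coef_ = np.zeros(X.shape[1])
            self.intercept_ = self._y_mean
            self.add_features(n_features)      # fit() just delegates, as proposed
            return self

        def add_features(self, k=1):
            Xc, yc = self._Xc, self._yc
            for _ in range(k):
                # Pick the inactive column most correlated with the residual.
                residual = yc - Xc @ self.coef_
                scores = np.abs(Xc.T @ residual)
                scores[self.active_] = -np.inf
                self.active_.append(int(np.argmax(scores)))
                # Refit OLS on the active columns via a QR factorization.
                # The efficient variant would keep Q and R as attributes and
                # update them incrementally instead of refactorizing.
                Q, R = np.linalg.qr(Xc[:, self.active_])
                beta = np.linalg.solve(R, Q.T @ yc)
                self.coef_ = np.zeros(Xc.shape[1])
                self.coef_[self.active_] = beta
            self.intercept_ = self._y_mean - self._X_mean @ self.coef_
            return self

Calling fit(X, y, n_features=5) and later add_features(2) would then extend the same model by two more columns without starting over, which seems to be the main point of the incremental design.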
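Regarding the strictness of the API, the quickest way to see what the contract requires is to run the common checks against the class. Expect a sketch like the one above to fail several of them; that is where most of the adaptation work would be.

    from sklearn.utils.estimator_checks import check_estimator

    # With recent scikit-learn versions, check_estimator takes an instance.
    check_estimator(ForwardStepwise())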