[scikit-learn] About the Boston housing prices dataset

Olivier Grisel olivier.grisel at ensta.org
Wed Oct 14 04:10:33 EDT 2020


On Tue, Oct 13, 2020 at 16:19, Adrin <adrin.jalali at gmail.com> wrote:
>
> Isn't the Boston dataset available through openml? Maybe here: https://www.openml.org/d/531
>
> I'm happy to have the dataset out there on openml, and for any material that addresses some of the issues with it.
> But for educational purposes, we don't need to have the dataset in the package as long as users can still download it
> with a one-liner using fetch_openml.

That would be an argument in favor of a deprecation warning, with a
message stating the motivation for the deprecation and pointing to
fetch_openml.
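
To make that concrete, here is a rough sketch of what such a warning
could look like. This is not scikit-learn code: the function name
load_boston_sketch and the message wording are made up for
illustration, and data_id=531 assumes the OpenML entry linked above is
the right one. FutureWarning is used because Python shows it to end
users by default, whereas DeprecationWarning is hidden in most
contexts.

```python
import warnings


def load_boston_sketch():
    """Hypothetical stand-in for a deprecated loader (not scikit-learn code)."""
    warnings.warn(
        # Illustrative wording only; the real message would also state
        # the motivation for the deprecation.
        "load_boston is deprecated. The dataset remains available from "
        "OpenML: fetch_openml(data_id=531)",
        FutureWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    # A real loader would still return the data during the deprecation period.
    return None


# Record the warning instead of printing it, to inspect what a user would see.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    load_boston_sketch()

print(caught[0].category.__name__)
print(caught[0].message)
```

Existing code would keep running during the deprecation period, and
the message would tell users where to get the data afterwards.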

However, it is going to break examples in slow-to-update tutorials
and books once the deprecation period is over. One could argue that
this is also the case for any other deprecation in scikit-learn. It's
just that sklearn.datasets.load_boston is used A LOT:
https://github.com/search?q=load_boston&type=code

-- 
Olivier

