titanic dataset, use for book
Hello Scikit-learn team,
I am currently writing a book for Packt on feature engineering, where I plan to show how to use the newest sklearn transformers.
Could I confirm with you whether I can use the titanic dataset located here: titanic_url = ('https://raw.githubusercontent.com/amueller/' 'scipy-2017-sklearn/091d371/notebooks/datasets/titanic3.csv')
in the book?
The code where I found it seems to have a BSD license, but I am not sure whether the license extends to the use of the dataset as well. https://scikit-learn.org/stable/auto_examples/compose/plot_column_transforme...
Many thanks and I look forward to hearing from you
Kind regards
Sole
Hi Sole.
I would suggest not using this version of the titanic dataset. It's a personal repository of mine and might not exist forever.
Ideally you (and we) would use fetch_openml. However, the current version doesn't support returning dataframes. That's addressed in https://github.com/scikit-learn/scikit-learn/pull/13902, which is not merged yet.
By the time your book comes out, it's likely to be merged, but might not be released, depending on your timeline. It might be easier for you to upload the CSV file to a repository you control yourself.
Best,
Andy
On 6/24/19 4:01 AM, Sole Galli wrote:
Hello Scikit-learn team,
I am currently writing a book for Packt on feature engineering, where I plan to show how to use the newest sklearn transformers.
Could I confirm with you whether I can use the titanic dataset located here: titanic_url = ('https://raw.githubusercontent.com/amueller/' 'scipy-2017-sklearn/091d371/notebooks/datasets/titanic3.csv')
in the book?
The code where I found it seems to have a BSD license, but I am not sure whether the license extends to the use of the dataset as well. https://scikit-learn.org/stable/auto_examples/compose/plot_column_transforme...
Many thanks and I look forward to hearing from you
Kind regards
Sole
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn
Meanwhile, loading the CSV from OpenML (https://www.openml.org/d/40945) would also work,
pd.read_csv('https://www.openml.org/data/get_csv/16826755/phpMYEkMl')
-- Roman
On 25/06/2019 17:04, Andreas Mueller wrote:
By the time your book comes out, it's likely to be merged, but might not be released, depending on your timeline. It might be easier for you to upload the CSV file to a repository you control yourself.
Thank you! That's very helpful :)
On Thu, 27 Jun 2019 at 12:27, Roman Yurchak via scikit-learn <scikit-learn@python.org> wrote:
Meanwhile, loading the CSV from OpenML (https://www.openml.org/d/40945) would also work,
pd.read_csv('https://www.openml.org/data/get_csv/16826755/phpMYEkMl')
-- Roman
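Two suggestions from the thread, keeping a copy of the CSV that you control and loading the data from the OpenML CSV link, can be combined in a small caching helper. The following is only a minimal sketch using the Python standard library. The names `cached_csv`, `fetch_bytes`, `read_rows` and the `fetch` parameter are hypothetical and were not proposed by anyone in the thread, and the OpenML URL is the one quoted above, which may not stay available.

```python
import csv
import urllib.request
from pathlib import Path

# URL quoted in the thread; there is no guarantee it remains available.
OPENML_CSV_URL = "https://www.openml.org/data/get_csv/16826755/phpMYEkMl"


def fetch_bytes(url):
    """Download the raw bytes behind a URL (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def cached_csv(url, local_path, fetch=fetch_bytes):
    """Return the path of a local copy, downloading only if it is missing.

    Hypothetical helper for this sketch; `fetch` can be swapped out,
    for example to test without network access.
    """
    path = Path(local_path)
    if not path.exists():
        # Create the target directory on first use, then store the download.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch(url))
    return path


def read_rows(path):
    """Read the cached CSV into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

A call such as `cached_csv(OPENML_CSV_URL, "data/titanic.csv")` (the file name is an arbitrary choice) would download the file once and reuse the local copy afterwards. The returned path can also be passed to `pd.read_csv`, and the cached file can be committed to a repository you control.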
participants (3): Andreas Mueller, Roman Yurchak, Sole Galli