[scikit-learn] titanic dataset, use for book

Andreas Mueller t3kcit at gmail.com
Tue Jun 25 11:04:11 EDT 2019


Hi Sole.
I would suggest not using this version of the titanic dataset.
It's a personal repository of mine and might not exist forever.
Ideally you (and we) would use fetch_openml.
However, the current version of fetch_openml doesn't support returning dataframes.
That's addressed in https://github.com/scikit-learn/scikit-learn/pull/13902
which is not merged yet.

By the time your book comes out, it's likely to be merged, but might not 
be released, depending on your timeline.
It might be easier for you to upload the CSV file to a repository you
control yourself.
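
As a rough sketch of that approach (not a definitive recipe): download the
CSV from your own repository and, optionally, check it against a SHA-256
digest you recorded when you uploaded it, so readers get an error if the file
ever changes. The helper names fetch_pinned_csv and parse_csv_bytes and the
URL in the usage note are made up for illustration. Only the standard library
is used here.

```python
import csv
import hashlib
import io
import urllib.request


def parse_csv_bytes(raw, expected_sha256=None):
    """Optionally verify raw CSV bytes against a digest, then parse them.

    Returns a list of dicts, one per row, keyed by the header line.
    All values are left as strings.
    """
    if expected_sha256 is not None:
        actual = hashlib.sha256(raw).hexdigest()
        if actual != expected_sha256:
            raise ValueError(f"checksum mismatch: got {actual}")
    # Assumption: the file is UTF-8 encoded.
    text = raw.decode("utf-8")
    # newline="" lets the csv module handle line endings itself.
    return list(csv.DictReader(io.StringIO(text, newline="")))


def fetch_pinned_csv(url, expected_sha256=None):
    """Download a CSV from a URL you control and parse it.

    Hypothetical helper: the URL should point at a fixed commit or tag
    so that the content does not change underneath the book's readers.
    """
    with urllib.request.urlopen(url) as response:
        raw = response.read()
    return parse_csv_bytes(raw, expected_sha256)
```

Usage would look something like
fetch_pinned_csv('https://raw.githubusercontent.com/<user>/<repo>/<commit>/titanic3.csv'),
where the angle-bracket parts are placeholders for your own repository and a
fixed commit. If you only need a dataframe and no checksum,
pandas.read_csv accepts the same URL directly.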

Best,
Andy

On 6/24/19 4:01 AM, Sole Galli wrote:
> Hello Scikit-learn team,
>
> I am currently writing a book for Packt on feature engineering, where 
> I plan to show how to use the newest sklearn transformers.
>
> Could I confirm with you whether I can use the titanic dataset located 
> here:
> titanic_url  =  ('https://raw.githubusercontent.com/amueller/'
>                 'scipy-2017-sklearn/091d371/notebooks/datasets/titanic3.csv')
>
> in the book?
>
> The code where I found it seems to have a BSD license, but I am not
> sure whether the license extends to the use of the dataset as well.
> https://scikit-learn.org/stable/auto_examples/compose/plot_column_transformer_mixed_types.html#sphx-glr-auto-examples-compose-plot-column-transformer-mixed-types-py 
>
>
> Many thanks, and I look forward to hearing from you.
>
> Kind regards
>
> Sole
