Hey all.

In case you're interested, here is a summary view of the scikit-learn survey I posted recently: https://www.surveymonkey.com/results/SM-RHGZVZ73/

tl;dr: Preprocessing takes the most time; people want out-of-core learning, better integration with pandas, and easier visualization of models and data. People would use automatic machine learning if it were available, but it's not the highest-priority item.

There is also a lot of interesting info in the comments, but because I have not been able to go through all of them yet, I don't want to publish them in case they include sensitive information (and if anyone knows whether there are legal implications, given that there wasn't a disclaimer, please let me know).

Cheers,
Andy
Thanks Andy,

That's really interesting and gives some hints for future direction. As an initial suggestion, I wonder if incremental decision tree learning would be welcomed by the project? In my experience, building trees was very often frustrated by memory constraints; an alternative that learns from batches would allow the technique to scale to much larger datasets that don't fit in memory.

Regards,
Brian

On 5 March 2017 at 17:47, Andreas Mueller <t3kcit@gmail.com> wrote:
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
Hi Brian.

How about Mondrian forests? ;) And I think Gilles has thought about parallelizing trees a bit. It's definitely something that people are interested in.

Andy

On 03/06/2017 06:46 AM, Brian Holt wrote:
participants (3)
- Andreas Mueller
- Brian Holt
- Tim Head