[IPython-dev] RandomForestClassifier w/ IPython.parallel

Alessandro Gagliardi alessandro.gagliardi at glassdoor.com
Fri Feb 7 15:36:24 EST 2014

Not sure if I’m addressing the best list for this question, so if there’s a more appropriate list, please direct me to it.

I want to train a large sklearn.ensemble.RandomForestClassifier (with dozens or maybe hundreds of trees and 100,000 samples). My desktop can’t handle this, so I want to try using StarCluster. RandomForestClassifier seems easy to parallelize, but I don’t know how I would split it across many IPython.parallel engines (if that’s even possible). (Or maybe I should be forgoing IPython.parallel and using MPI?)
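
To make the question concrete, here is a rough sketch of one way such a split might look. It assumes that each engine can hold the full training set, that every sub-forest sees the same set of class labels, and that concatenating the fitted `estimators_` lists of several forests yields a usable combined forest. The helper names (`split_trees`, `fit_subforest`, `merge_forests`, `train_on_cluster`) are hypothetical and not part of sklearn or IPython.parallel; only `Client`, `load_balanced_view` and `map_sync` come from IPython.parallel.

```python
import copy


def split_trees(n_trees, n_engines):
    """Divide n_trees as evenly as possible among n_engines."""
    base, extra = divmod(n_trees, n_engines)
    return [base + (1 if i < extra else 0) for i in range(n_engines)]


def fit_subforest(X, y, n_trees, seed):
    """Fit a small forest; meant to be executed on a remote engine."""
    # Imported here because the function body runs in the engine's namespace.
    from sklearn.ensemble import RandomForestClassifier
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    return forest.fit(X, y)


def merge_forests(forests):
    """Combine fitted sub-forests by concatenating their trees.

    Assumption: all sub-forests were trained on the same features and classes.
    """
    merged = copy.copy(forests[0])  # shallow copy; inputs are left unchanged
    merged.estimators_ = [tree for f in forests for tree in f.estimators_]
    merged.n_estimators = len(merged.estimators_)
    return merged


def train_on_cluster(X, y, n_trees):
    """Hypothetical driver: one sub-forest per engine, merged on the client."""
    from IPython.parallel import Client
    rc = Client()  # assumes a running cluster with a default profile
    # Drop empty shares in case there are more engines than trees.
    counts = [c for c in split_trees(n_trees, len(rc.ids)) if c > 0]
    seeds = list(range(len(counts)))  # a distinct seed per sub-forest
    view = rc.load_balanced_view()
    # Note: this ships X and y once per task, which is wasteful for big data.
    forests = view.map_sync(fit_subforest,
                            [X] * len(counts), [y] * len(counts),
                            counts, seeds)
    return merge_forests(forests)
```

Giving each sub-forest a different seed matters here: with identical seeds every engine would grow the same trees. On a single multi-core machine, the `n_jobs` parameter of RandomForestClassifier may already be enough, without any cluster.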

Any help would be greatly appreciated.


Alessandro Gagliardi | Glassdoor | alessandro at glassdoor.com