Hi Isaac,

You may want to look at MiniBatchKMeans and MiniBatchDictionaryLearning, which both provide this API. At the moment, each call to partial_fit should fit a single mini-batch to the estimator and update its internal attributes accordingly. During the first call to partial_fit, you should also take care of the memory allocations the estimator needs.
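
To make the pattern concrete, here is a minimal sketch of how fit and partial_fit can relate. RunningMean is a hypothetical toy estimator written only for illustration (it is not part of scikit-learn and not how MiniBatchKMeans is implemented); the attribute names mean_ and n_samples_seen_ are assumptions that simply follow the usual trailing-underscore convention. The first partial_fit call allocates the state, later calls update it in place, and fit discards any previous state before processing the data.

```python
import numpy as np


class RunningMean:
    """Hypothetical toy estimator that tracks a per-feature mean."""

    def partial_fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        if not hasattr(self, "mean_"):
            # First call: allocate the state the estimator needs.
            self.n_samples_seen_ = 0
            self.mean_ = np.zeros(X.shape[1])
        # Later calls: update the existing state with this mini-batch only.
        n_new = X.shape[0]
        total = self.n_samples_seen_ + n_new
        self.mean_ += (X.sum(axis=0) - n_new * self.mean_) / total
        self.n_samples_seen_ = total
        return self

    def fit(self, X, y=None):
        # fit starts from scratch: drop any state left by earlier calls.
        for attr in ("mean_", "n_samples_seen_"):
            if hasattr(self, attr):
                delattr(self, attr)
        return self.partial_fit(X)


rng = np.random.RandomState(0)
X = rng.rand(1000, 5)

est = RunningMean()
for batch in np.array_split(X, 10):
    est.partial_fit(batch)  # one mini-batch per call
```

After the loop, est.mean_ should match what a single fit(X) call produces, which is a handy sanity check when adding partial_fit to an existing estimator.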

Please feel free to create a pull request whenever you think your code is ready for review.

Good luck!

On 26 May 2016 at 13:14, <donkey-hotei@cryptolab.net> wrote:
hello scikit-learn devs,

After following the work on IsolationForest so far and testing it on a real-world problem here, we've found this model to be very promising for anomaly detection. However, at present IsolationForest only fits data in batch, even though it may be well suited to incremental online learning: one could subsample recent history and progressively drop older estimators.

I'd like to contribute this feature, but being new to ML and scikit-learn, I'm curious how I should start on a quick & dirty version to see how this might work. Are there good examples in other models where one could see the difference between .fit and .partial_fit?

thanks
isaak y.
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn