[scikit-learn] batch_size for small training sets

Sebastian Raschka se.raschka at gmail.com
Sun Sep 24 16:47:05 EDT 2017


Small batch sizes are typically used to speed up training (more parameter updates per epoch) and to work around the fact that large training sets often don't fit into memory. The additional noise from the stochastic updates may also help to escape local minima and/or improve generalization performance (e.g., as discussed in the recent paper where the authors compared SGD to other optimizers). In any case, since batch size is effectively a hyperparameter, I would just experiment with a few values and compare. Also, since you have a small dataset, I would try plain batch gradient descent as well (i.e., batch size = number of training samples).
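
To make the "experiment with a few values and compare" suggestion concrete, here is a minimal sketch, not a definitive recipe. The toy data, the network size, the candidate batch sizes, and the 5-fold setup are all assumptions chosen for illustration. It relies on batch_size="auto" resolving to min(200, n_samples), as described in the question, so on a dataset this small "auto" amounts to full-batch training.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPRegressor

# Hypothetical toy data standing in for a small real dataset (40 observations).
rng = np.random.RandomState(0)
X = rng.uniform(-1.0, 1.0, size=(40, 3))
y = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=40)

# Same folds for every candidate so the comparison is like-for-like.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# "auto" means min(200, n_samples), i.e. the full training fold here.
candidates = [4, 8, 16, "auto"]

results = {}
with warnings.catch_warnings():
    # Small nets on tiny data may stop at max_iter; silence that warning here.
    warnings.simplefilter("ignore", category=ConvergenceWarning)
    for batch_size in candidates:
        model = MLPRegressor(
            hidden_layer_sizes=(10,),  # assumed architecture, for illustration only
            solver="adam",
            batch_size=batch_size,
            max_iter=500,
            random_state=0,
        )
        scores = cross_val_score(
            model, X, y, cv=cv, scoring="neg_mean_squared_error"
        )
        results[batch_size] = -scores.mean()  # mean cross-validated MSE

for batch_size, mse in results.items():
    print(f"batch_size={batch_size!s:>4}: CV MSE = {mse:.4f}")
```

With only 10-50 observations the cross-validated scores will be noisy, so it is probably worth repeating this with several random seeds (or repeated K-fold) before reading much into the differences.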

Best,
Sebastian 

> On Sep 24, 2017, at 4:35 PM, Thomas Evangelidis <tevang3 at gmail.com> wrote:
> 
> Greetings,
> 
> I train MLPRegressors on small datasets, usually with 10-50 observations. The default for the adam optimizer is batch_size=min(200, n_samples), and because my n_samples is always < 200, it effectively becomes batch_size=n_samples. According to the theory, stochastic gradient-based optimizers like adam perform better in the small-batch regime. Considering the above, what would be a good batch_size value in my case (e.g. 4)? Is there any rule of thumb for selecting the batch_size when n_samples is small, or must the choice be based on trial and error?
> 
> 
> -- 
> ======================================================================
> Dr Thomas Evangelidis
> Post-doctoral Researcher
> CEITEC - Central European Institute of Technology
> Masaryk University
> Kamenice 5/A35/2S049, 
> 62500 Brno, Czech Republic 
> 
> email: tevang at pharm.uoa.gr
>          	tevang3 at gmail.com
> 
> website: https://sites.google.com/site/thomasevangelidishomepage/
> 
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn