Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
524 views
in Technique[技术] by (71.8m points)

machine learning - What is batch size in Caffe or convnets

I thought that batch size is only for performance. The bigger the batch, more images are computed at the same time to train my net. But I realized, if I change my batch size, my net accuracy gets better. So I did not understand what batch size is. Can someone explain me what is batch size?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Caffe is trained using Stochastic-Gradient-Descend (SGD): that is, at each iteration it computes the (stochastic) gradient of the parameters w.r.t the training data and makes a move (=change the parameters) in the direction of the gradient.
Now, if you write the equations of the gradient w.r.t. training data you'll notice that in order to compute the gradient exactly you need to evaluate all your training data at each iteration: this is prohibitively time consuming, especially when the training data gets bigger and bigger.
In order to overcome this, SGD approximates the exact gradient, in a stochastic manner, by sampling only a small portion of the training data at each iteration. This small portion is the batch.
Thus, the larger the batch size the more accurate the gradient estimate at each iteration.

TL;DR: batch size affect the accuracy of the estimated gradient at each iteration, changing the batch size therefore affect the "path" the optimization takes and may change the results of the training process.


Update:
In ICLR 2018 conference an interesting work was presented:
Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, Quoc V. Le Don't Decay the Learning Rate, Increase the Batch Size.
This work basically relates the effect of changing batch size and learning rate.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...