You can set `iter_size` in the solver parameters. Caffe accumulates gradients over `iter_size` × `batch_size` instances in each stochastic gradient descent step, so increasing `iter_size` yields a more stable gradient estimate when limited memory prevents you from using a large `batch_size`.
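As a sketch, a solver configuration with gradient accumulation might look like this (all values are illustrative, not a recommendation):

```
# solver.prototxt (illustrative values)
net: "train_val.prototxt"
base_lr: 0.01
momentum: 0.9
iter_size: 4        # accumulate gradients over 4 mini-batches per SGD step
max_iter: 10000
snapshot_prefix: "snapshot"
solver_mode: GPU
```

If the data layer in `train_val.prototxt` uses `batch_size: 16`, the effective batch size per weight update is 4 × 16 = 64, while peak memory usage stays at that of a single 16-instance mini-batch.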