I am a little confused about how should I use/insert "BatchNorm"
layer in my models.
I see several different approaches, for instance:
ResNets: "BatchNorm"
+"Scale"
(no parameter sharing)
"BatchNorm"
layer is followed immediately with "Scale"
layer:
layer {
bottom: "res2a_branch1"
top: "res2a_branch1"
name: "bn2a_branch1"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
}
layer {
bottom: "res2a_branch1"
top: "res2a_branch1"
name: "scale2a_branch1"
type: "Scale"
scale_param {
bias_term: true
}
}
In the cifar10 example provided with caffe, "BatchNorm"
is used without any "Scale"
following it:
layer {
name: "bn1"
type: "BatchNorm"
bottom: "pool1"
top: "bn1"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
}
cifar10 Different batch_norm_param
for TRAIN
and TEST
batch_norm_param: use_global_scale
is changed between TRAIN
and TEST
phase:
layer {
name: "bn1"
type: "BatchNorm"
bottom: "pool1"
top: "bn1"
batch_norm_param {
use_global_stats: false
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
include {
phase: TRAIN
}
}
layer {
name: "bn1"
type: "BatchNorm"
bottom: "pool1"
top: "bn1"
batch_norm_param {
use_global_stats: true
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
param {
lr_mult: 0
}
include {
phase: TEST
}
}
So what should it be?
How should one use"BatchNorm"
layer in caffe?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…