So, I think it'd clear most of this up if we were to step back and discuss the role the bias unit is meant to play in a NN.
A bias unit is meant to allow units in your net to learn an appropriate threshold (i.e. after reaching a certain total input, start sending positive activation), since normally a positive total input means a positive activation.
For example, if your bias unit has a weight of -2 with some neuron x, then neuron x will provide a positive activation only if all other input adds up to more than 2, since the total input (other input plus -2) must be above zero.
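To make the arithmetic concrete, here is a minimal sketch. It assumes a simple unit that is "on" exactly when its total input is above zero and a bias unit whose value is fixed at 1; the names `total_input` and `is_active` and the sample inputs are made up for illustration.

```python
def total_input(other_input, bias_weight, bias_value=1.0):
    """Add the bias unit's contribution to the rest of the unit's input."""
    return other_input + bias_weight * bias_value


def is_active(other_input, bias_weight):
    """Assumed rule: activation is positive only when total input is above zero."""
    return total_input(other_input, bias_weight) > 0


BIAS_WEIGHT = -2.0  # the weight between the bias unit and neuron x

# Try a few values for "all other input" and see where the unit switches on.
results = {x: is_active(x, BIAS_WEIGHT) for x in (1.0, 2.0, 2.5, 3.0)}
print(results)
```

With a bias weight of -2, the unit stays off for other input of 1 or 2 and switches on at 2.5 and 3, so the bias weight has moved the threshold from 0 to 2.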
So, with that as background, your answers:
- No, one bias input is always sufficient, since it can affect different neurons differently depending on its weight with each unit.
- Generally speaking, having bias weights going to every non-input unit is a good idea, since otherwise those units without bias weights would have thresholds that will always be zero.
- Because the threshold, once learned, should be consistent across trials. Remember, the bias represents how each unit interacts with the input; it isn't an input itself.
- You certainly can, and many do. Any squashing function generally works as an activation function.
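If it helps, the points above can be sketched roughly like this: one bias unit, held at a constant 1, with a separate learned weight to each unit. This is only an illustration, not any particular library's API; `layer_forward` and the weights are hypothetical, and `math.tanh` stands in as one possible squashing function.

```python
import math


def layer_forward(inputs, weights, bias_weights, squash=math.tanh):
    """Forward pass for one layer.

    weights[j][i] connects input i to unit j.
    bias_weights[j] connects the single shared bias unit to unit j.
    """
    bias_value = 1.0  # the bias unit is constant; only its weights differ
    outputs = []
    for unit_weights, bias_weight in zip(weights, bias_weights):
        total = sum(w * x for w, x in zip(unit_weights, inputs))
        total += bias_weight * bias_value
        outputs.append(squash(total))
    return outputs


inputs = [1.0, 1.0]
weights = [[1.0, 1.0], [1.0, 1.0]]  # both units see identical weighted input
bias_weights = [-1.0, -3.0]         # but each has its own bias weight

outputs = layer_forward(inputs, weights, bias_weights)
print(outputs)
```

Both units receive the same weighted input of 2, yet the first ends up with a positive activation and the second with a negative one, purely because the same bias unit connects to them with different weights.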