This is a fairly standard NaN-in-training problem; I suggest you read this answer about NaN issues with the Adam solver for the common causes and fixes. Basically, I made the following two changes and the code ran without NaN in the gradients:
- Reduce the learning rate of the optimizer at `model.compile` to `optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3)`.
- Replace `C = [loss(label, pred) for label, pred in zip(yBatchTrain, dumbModel(dataNoised, training=False))]` with `C = loss(yBatchTrain, dumbModel(dataNoised, training=False))`.
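For reference, here is a minimal sketch of those two changes. The tiny model, loss, and random tensors are just placeholders standing in for `dumbModel`, `loss`, `dataNoised`, and `yBatchTrain` from the question:

```python
import tensorflow as tf

# Placeholders for the objects in the question.
dumbModel = tf.keras.Sequential([tf.keras.layers.Dense(1)])
loss = tf.keras.losses.MeanSquaredError()
dataNoised = tf.random.normal((32, 8))
yBatchTrain = tf.random.normal((32, 1))

# 1. Lower the Adam learning rate at compile time.
dumbModel.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                  loss=loss)

# 2. Compute the loss on the whole batch in one call instead of a Python
#    list comprehension; Keras losses already reduce over the batch.
C = loss(yBatchTrain, dumbModel(dataNoised, training=False))
```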
If you still get this kind of error, the next few things you could try are (both sketched below):
- Clip the loss or the gradients
- Switch all tensors from `tf.float32` to `tf.float64`
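A short sketch of those two fallbacks, assuming the same placeholder model and optimizer as above:

```python
import tensorflow as tf

# Gradient clipping: clipnorm / clipvalue are keyword arguments
# accepted by Keras optimizers.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)

# Or clip manually inside a custom training step:
# grads = tape.gradient(C, dumbModel.trainable_variables)
# grads = [tf.clip_by_value(g, -1.0, 1.0) for g in grads]

# Switch to float64: set the global dtype policy before building the
# model so all weights and activations are created as float64.
tf.keras.backend.set_floatx('float64')
```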
Next time you face this kind of error, you can use `tf.debugging.check_numerics` to find the root cause of the NaN.
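A sketch of how that check can be used, reusing the placeholder names from above:

```python
import tensorflow as tf

# Option 1: check individual tensors; raises an InvalidArgumentError with
# your message as soon as the tensor contains a NaN or Inf.
pred = dumbModel(dataNoised, training=False)
pred = tf.debugging.check_numerics(pred, message="NaN/Inf in predictions")
C = loss(yBatchTrain, pred)
C = tf.debugging.check_numerics(C, message="NaN/Inf in the loss")

# Option 2: instrument every op (slower, but catches the very first op
# that produces a NaN/Inf).
# tf.debugging.enable_check_numerics()
```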