Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


0 votes
234 views
in Technique by (71.8m points)

python - Tensorflow simple fit line produces NaN loss

I'm new to using TensorFlow. I'm simply trying to get the relationship between x and y values. Their relationship is described by the equation y = 50000 + 50000*x (i.e., y = 50 + 50*x in thousands). See the code below.

import tensorflow as tf
import numpy as np
from tensorflow import keras

# GRADED FUNCTION: house_model
def house_model(y_new):

    xs = np.array([0, 1,2,3,4,5], dtype = float)
    ys = np.array([50000, 100000,150000,200000,250000,300000], dtype = float)

    #ys = ys/100000
    
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(128))
    model.add(tf.keras.layers.Dense(128))
    
    model.compile(optimizer = "sgd", loss = "mean_squared_error", metrics = ["accuracy"])
    
    model.fit(xs, ys, epochs = 500)

    return model.predict([y_new])

prediction = house_model([7.0])
print(prediction)

My question is: why does the loss become NaN and the accuracy 0.0000e+00 if I don't rescale ys? (I commented out the rescaling in the code above.) Based on what I found on the internet, you should normalize the inputs to get accurate results, but in this case it is the output I'm normalizing to get good results. Can someone please explain to me why this is happening?



1 Answer

0 votes
by (71.8m points)

It fails because your model doesn't have an output layer. As written, the model outputs an array of size 128 (the width of its last layer), but when calling fit you try to fit x to y, which are both 1-D arrays. Your model should look like this:

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(128))
model.add(tf.keras.layers.Dense(128))
model.add(tf.keras.layers.Dense(1)) # output layer

model.compile(optimizer = "sgd", loss = "mean_squared_error", metrics = ["accuracy"])
   

But even with the output layer, this may still produce undesired results such as a NaN loss. As @Lescurel pointed out in the comments, your gradients can explode or vanish, in which case you can see such undesired results.

In this case, your gradients simply explode (become very large), so your model quickly diverges (perhaps even after one epoch) and stops learning. Rescaling ys shrinks the initial errors, and with them the gradients, which is why your commented-out normalization makes training stable.
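The target-scale effect can be sketched without TensorFlow. Below is a minimal NumPy toy of my own (a stand-in, not your Keras model): a two-layer linear net with scalar weights, trained by plain gradient descent at the Keras SGD default learning rate of 0.01. With raw targets around 1e5, the initial errors, and hence the gradients, are huge; each update overshoots further and the loss overflows to inf/NaN within a handful of steps. Dividing ys by 100,000 makes the very same loop converge.

```python
import numpy as np

# Same data as in the question: y = 50000 + 50000 * x
xs = np.array([0., 1., 2., 3., 4., 5.])
ys = np.array([50000., 100000., 150000., 200000., 250000., 300000.])

def final_loss(y, lr=0.01, steps=200):
    """Gradient descent on a tiny two-layer linear net
    f(x) = w2 * (w1 * x + b1) + b2 (a toy stand-in for stacked Dense layers).
    Returns the MSE after `steps` updates."""
    w1, b1, w2, b2 = 0.5, 0.0, 0.5, 0.0
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(steps):
            h = w1 * xs + b1                  # "hidden layer" output
            err = (w2 * h + b2) - y           # prediction error
            gw2 = 2 * np.mean(err * h)        # MSE gradients via the chain rule
            gb2 = 2 * np.mean(err)
            gw1 = 2 * np.mean(err * w2 * xs)
            gb1 = 2 * np.mean(err * w2)
            w1 -= lr * gw1; b1 -= lr * gb1    # plain SGD updates
            w2 -= lr * gw2; b2 -= lr * gb2
        return np.mean(((w2 * (w1 * xs + b1) + b2) - y) ** 2)

print(final_loss(ys))            # raw targets (~1e5): updates overshoot, loss -> inf/NaN
print(final_loss(ys / 100000))   # rescaled targets (~1): training stays stable
```

This is the same mechanism in miniature: normalizing the targets works not because outputs are special, but because the error term inside every gradient scales with the targets, and with multiple layers an oversized update feeds back into an even larger error on the next step.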


