I am training a Keras model on a numerical dataset to predict a single output. Predictions on the training data come back fine, but when I import data from another source to test the model, predict returns NaN. Can someone help me out here?
Please find my code below:
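# Imports assumed here (tf.keras); they were not shown in the original snippet
import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers, callbacks

early_stopping = callbacks.EarlyStopping(
    min_delta=0.001,  # minimum amount of change to count as an improvement
    patience=50,  # how many epochs to wait before stopping
    restore_best_weights=True,
    monitor='mae',
)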
model = keras.Sequential([
    layers.BatchNormalization(input_shape=input_shape),
    layers.Dense(512, activation='relu'),
    layers.BatchNormalization(),
    layers.Dense(512, activation='relu'),
    layers.BatchNormalization(),
    layers.Dense(512, activation='relu'),
    layers.BatchNormalization(),
    layers.Dense(512, activation='relu'),
    layers.BatchNormalization(),
    layers.Dense(1),
])
model.compile(
    optimizer='adam',
    loss='mae',
    metrics=['mae'],
)
X=np.asarray(X).astype(np.float32)
y=np.asarray(y).astype(np.float32)
fitModel = model.fit(
    X, y,
    epochs=100,
    callbacks=[early_stopping],
    verbose=0,
)
model.save('testKeras.h5')
loadModel = keras.models.load_model('testKeras.h5')
loadModel.predict(X[:2])
Output:
array([[52.616314],
[51.21798 ]], dtype=float32)
Next, I loaded a new dataset to check the prediction results:
testResult = pd.DataFrame(PostGresProduction(queryCheck), columns=[
    'gender', 'age', 'mobile_number_count', 'mobile_registered_with_bureau',
    'state_id', 'city_id', 'loan_period', 'repayment_period',
    'moratorium_availed', 'is_married', 'is_spouse_working',
    'no_of_children',
    'is_joint_family', 'is_migrant', 'other_asset', 'is_political',
    'is_police',
    'is_lawyer', 'has_gst', 'industry_type', 'business_type',
    'billing_mode',
    'daily_sales', 'nature_of_invoicing', 'business_experience',
    'online_banking', 'profit_margin', 'business_nature', 'taxpayer_type',
    'filing_frequency', 'credit_bureau_score', 'loans_defaulted',
    'loans_taken',
    'loans_writtenoff', 'emi_left', 'min_repayment_amount',
])
testResult = testResult.iloc[0]
testResult = pd.DataFrame([testResult])
testResult = np.asarray(testResult)
testResult.shape
Output: (1, 36)
loadModel.predict(testResult)
Output: array([[nan]], dtype=float32)
testResult
Output:
array([[ 0.00e+00, 4.40e+01, 2.00e+00, 0.00e+00, 2.00e+00, 7.00e+00,
3.00e+01, 7.00e+00, 0.00e+00, 1.00e+00, 1.00e+00, 4.00e+00,
1.00e+00, 0.00e+00, nan, 0.00e+00, 0.00e+00, 0.00e+00,
1.00e+00, 4.40e+01, nan, 1.23e+02, 5.70e+01, 2.88e+02,
nan, 1.00e+00, 2.00e+01, 3.50e+01, 1.00e+00, 0.00e+00,
-1.00e+00, 0.00e+00, 0.00e+00, 0.00e+00, 0.00e+00, 2.50e+04]],
dtype=float32)
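The row above already contains NaN entries, and any NaN in the input will propagate through the Dense layers to the prediction. To see which features are affected, something like the following could be used. This is only a minimal sketch: the helper name nan_columns is my own, the row is retyped from the output above, and it assumes the array columns are in the same order as the DataFrame column list.

```python
import numpy as np

# Column order copied from the DataFrame construction above.
columns = [
    'gender', 'age', 'mobile_number_count', 'mobile_registered_with_bureau',
    'state_id', 'city_id', 'loan_period', 'repayment_period',
    'moratorium_availed', 'is_married', 'is_spouse_working', 'no_of_children',
    'is_joint_family', 'is_migrant', 'other_asset', 'is_political',
    'is_police', 'is_lawyer', 'has_gst', 'industry_type', 'business_type',
    'billing_mode', 'daily_sales', 'nature_of_invoicing', 'business_experience',
    'online_banking', 'profit_margin', 'business_nature', 'taxpayer_type',
    'filing_frequency', 'credit_bureau_score', 'loans_defaulted',
    'loans_taken', 'loans_writtenoff', 'emi_left', 'min_repayment_amount',
]

# The single test row, retyped from the printed array above.
row = np.array([[0.0, 44.0, 2.0, 0.0, 2.0, 7.0,
                 30.0, 7.0, 0.0, 1.0, 1.0, 4.0,
                 1.0, 0.0, np.nan, 0.0, 0.0, 0.0,
                 1.0, 44.0, np.nan, 123.0, 57.0, 288.0,
                 np.nan, 1.0, 20.0, 35.0, 1.0, 0.0,
                 -1.0, 0.0, 0.0, 0.0, 0.0, 25000.0]], dtype=np.float32)


def nan_columns(arr, names):
    """Return the names of the columns that contain at least one NaN."""
    mask = np.isnan(arr).any(axis=0)  # one boolean per column
    return [name for name, has_nan in zip(names, mask) if has_nan]


missing = nan_columns(row, columns)
print(missing)
```

Under that ordering assumption, the NaN entries sit in other_asset, business_type and business_experience.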
EDIT:
I tried removing the null values from my dataset, and predict now returns a value, but the value is not right. It returns a four-digit number, whereas on the training dataset it correctly returns a two-digit number.
Can someone help me here?
Output right now: array([[2462.3406]], dtype=float32)
Expected output: ~array([[30]], dtype=float32)
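One thing that may be worth checking here is whether the new row looks like the training data at all: same column order, same encoding, and values within the ranges seen during training. The sketch below flags features of a row that fall outside the training range. The helper out_of_range_columns is hypothetical, and the small training matrix is made up purely for illustration (only the -1 and 25000 values are taken from the row shown earlier); in practice X_train would be the real X used in model.fit.

```python
import numpy as np


def out_of_range_columns(X_train, row, names):
    """List (name, value, train_min, train_max) for every feature of `row`
    that lies outside the range observed in `X_train`."""
    lo = np.nanmin(X_train, axis=0)  # per-column minimum, ignoring NaN
    hi = np.nanmax(X_train, axis=0)  # per-column maximum, ignoring NaN
    values = np.asarray(row, dtype=np.float32).ravel()
    flagged = []
    for name, v, a, b in zip(names, values, lo, hi):
        # NaN compares False on both sides, so missing values are not flagged here.
        if v < a or v > b:
            flagged.append((name, float(v), float(a), float(b)))
    return flagged


# Made-up stand-in for the training data (three features only).
names = ['age', 'credit_bureau_score', 'min_repayment_amount']
X_train = np.array([[25.0, 600.0, 1000.0],
                    [60.0, 850.0, 5000.0]], dtype=np.float32)
new_row = np.array([[44.0, -1.0, 25000.0]], dtype=np.float32)

report = out_of_range_columns(X_train, new_row, names)
for name, v, a, b in report:
    print(f"{name}: {v} outside [{a}, {b}]")
```

If many features are flagged on the real data, that would suggest the new data is encoded or ordered differently from the training set, which could explain a prediction far from the expected range. It does not prove that this is the cause.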