I am following the example from the TensorFlow book, but the version of the network that concatenates two different inputs outputs NaN. A second, simpler version, in which a single input is split into two paths and concatenated back, works. What is wrong with the two-input code below that makes it output NaN?
Here is the code whose output layer produces NaN. (As a debugging aid, I have put the second, much simpler code that works further below.)
In brief, the layers in the code below are connected as follows:
input_A → (to concat layer)
input_B → hidden1 → hidden2 → (to concat layer)
concat → output
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow import keras
housing = fetch_california_housing()
X_train_full, X_test, y_train_full, y_test = train_test_split(housing.data, housing.target)
X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full)
print("X_train/y_train/full/test shapes: ", X_train_full.shape, X_test.shape, y_train_full.shape, y_test.shape)
print("X_/y_/train/valid: ", X_train.shape, X_valid.shape, y_train.shape, y_valid.shape)
X_train_A, X_train_B = X_train[:, :5], X_train[:, 2:]
X_valid_A, X_valid_B = X_valid[:, :5], X_valid[:, 2:]
X_test_A, X_test_B = X_test[:, :5], X_test[:, 2:]
X_new_A, X_new_B = X_test_A[:3], X_test[:3, 2:]
print("X_train_A/B/valid_A/B: ", X_train_A.shape, X_train_B.shape, X_valid_A.shape, X_valid_B.shape)
print("X_test_A/B/new_A/B: ", X_test_A.shape, X_test_B.shape, X_new_A.shape, X_new_B.shape)
scaler=StandardScaler()
X_train = scaler.fit_transform(X_train)
X_valid = scaler.transform(X_valid)
X_test = scaler.transform(X_test)
input_A = keras.layers.Input(shape=[5], name="wide_input")
input_B = keras.layers.Input(shape=[6], name="deep_input")
hidden1 = keras.layers.Dense(30, activation="relu")(input_B)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.Concatenate()([input_A, hidden2])
output = keras.layers.Dense(1, name="output")(concat)
model = keras.Model(inputs=[input_A, input_B], outputs=[output])
model.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1e-3))
history = model.fit(
(X_train_A, X_train_B), y_train,
epochs=20,
validation_data=((X_valid_A, X_valid_B),y_valid)
)
print("training result (shape): ", history)
mse_test = model.evaluate((X_test_A, X_test_B), y_test)
y_pred = model.predict((
X_new_A,
X_new_B))
model.save("p310.h5")
output:
11610/11610 [==============================] - 1s 58us/sample - loss: nan - val_loss: nan
Epoch 2/20
11610/11610 [==============================] - 0s 35us/sample - loss: nan - val_loss: nan
Epoch 3/20
11610/11610 [==============================] - 0s 37us/sample - loss: nan - val_loss: nan
Epoch 4/20
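One thing that may be worth checking: in the two-input listing above, X_train_A, X_train_B and the other A/B slices are taken before the StandardScaler is applied, and the scaler calls rebind X_train, X_valid and X_test to new arrays. The slices passed to fit() therefore still hold the unscaled values, which could plausibly make SGD diverge to NaN. The sketch below illustrates only that ordering effect; it assumes synthetic data and a hand-written standardization in place of the housing set and StandardScaler, and the names ending in _scaled are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the 8 housing features, deliberately on a large scale.
X_train = rng.normal(loc=1000.0, scale=300.0, size=(100, 8))

# Slice first, in the same order as the two-input listing.
X_train_A, X_train_B = X_train[:, :5], X_train[:, 2:]

# Standardizing builds a new array and rebinds the name X_train;
# the slices taken earlier still point at the original, unscaled data.
mean, std = X_train.mean(axis=0), X_train.std(axis=0)
X_train = (X_train - mean) / std

print("mean of slice taken before scaling:", X_train_A.mean())

# Slicing again after scaling yields standardized inputs.
X_train_A_scaled, X_train_B_scaled = X_train[:, :5], X_train[:, 2:]
print("mean of slice taken after scaling: ", X_train_A_scaled.mean())
```

If this is indeed the cause, moving the three scaler lines above the slicing lines would mean the model receives standardized inputs, as it does in the single-input version.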
Working code:
NN layers:
input_ → hidden1 → hidden2 → (to concat layer)
input_ → (to concat layer)
concat → output
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow import keras
housing = fetch_california_housing()
X_train_full, X_test, y_train_full, y_test = train_test_split(housing.data, housing.target)
X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full)
print("training data shapes:")
print("X_train_full/test/y_train_full/test: ", X_train_full.shape, X_test.shape, y_train_full.shape, y_test.shape)
print("X_train/X_valid/y_train/y_valid: ", X_train.shape, X_valid.shape, y_train.shape, y_valid.shape)
scaler=StandardScaler()
X_train = scaler.fit_transform(X_train)
X_valid = scaler.transform(X_valid)
X_test = scaler.transform(X_test)
input_ = keras.layers.Input(shape=X_train.shape[1:])
hidden1 = keras.layers.Dense(30, activation="relu")(input_)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.Concatenate()([input_, hidden2])
output = keras.layers.Dense(1)(concat)
model = keras.Model(inputs=[input_], outputs=[output])
model.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1e-3))
history = model.fit(X_train, y_train, epochs=20, validation_data=(X_valid, y_valid))
print("training result (shape): ", history)
mse_test = model.evaluate(X_test, y_test)
X_new = X_test[:3]  # pretend these are new instances
y_pred = model.predict(X_new)
output:
11610/11610 [==============================] - 1s 54us/sample - loss: 1.8919 - val_loss: 0.8798
Epoch 2/20
11610/11610 [==============================] - 0s 32us/sample - loss: 0.8452 - val_loss: 0.7558
Epoch 3/20
11610/11610 [==============================] - 0s 34us/sample - loss: 0.7188 - val_loss: 0.6991
Epoch 4/20
11610/11610 [==============================] - 0s 32us/sample - loss: 0.6705 - val_loss: 0.6597
Epoch 5/20