ValueError: Dimensions 100 and 0 are not compatible

Hello, I was following the tutorial "Deep Learning for Computer Vision with Python and TensorFlow – Complete Course" on freeCodeCamp. While working on the Car Price Prediction project, I ran into an error that I tried to solve but couldn’t.

Here is the code

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import pandas as pd
import seaborn as sns
from keras.layers import Normalization, Dense, InputLayer
from keras.losses import MeanSquaredError, Huber, MeanAbsoluteError
from keras.optimizers import Adam
from keras.metrics import RootMeanSquaredError

data = pd.read_csv("/Users/atatekeli/PycharmProjects/CompVision/comp-vision-projects/tensorflow-comp-vision/Car Price Prediction/carprice.csv")
print(data.head())
print(data.shape)
print(data.describe())

sns.pairplot(data[['years', 'km', 'rating', 'condition', 'economy', 'top speed', 'hp', 'torque', 'current price']], diag_kind='kde')

tensor_data = tf.constant(data)
tensor_data = tf.cast(tensor_data, tf.float32)
print(tensor_data)

tensor_data = tf.random.shuffle(tensor_data)
print(tensor_data[:5])

X = tensor_data[:,3:-1]
print(X[:5])

y = tensor_data[:,-1]
print(y[:5].shape)
y = tf.expand_dims(y, axis = -1)
print(y[:5])

normalizer = Normalization(axis = -1, mean = 5, variance = 4)
x_normalized = tf.constant([[3,4,5,6,7],
                            [4,5,6,7,8]])
normalizer(x_normalized)

normalizer = Normalization()
x_normalized = tf.constant([[3,4,5,6,7],
                            [4,10,6,7,8],
                            [32,1,56,3,5]])
normalizer.adapt(x_normalized)
normalizer(x_normalized)

print(X.shape)

TRAIN_RATIO = 0.8
VAL_RATIO = 0.1
TEST_RATIO = 0.1
DATASET_SIZE = len(X)

X_train = X[:int(DATASET_SIZE*TRAIN_RATIO)]
y_train = y[:int(DATASET_SIZE*TRAIN_RATIO)]
print(X_train.shape)
print(y_train.shape)

train_dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))
train_dataset = train_dataset.shuffle(buffer_size = 8, reshuffle_each_iteration = True).batch(32).prefetch(tf.data.AUTOTUNE)

for x,y in train_dataset:
  print(x,y)
  break

X_val = X[int(DATASET_SIZE*TRAIN_RATIO):int(DATASET_SIZE*(TRAIN_RATIO+VAL_RATIO))]
y_val = y[int(DATASET_SIZE*TRAIN_RATIO):int(DATASET_SIZE*(TRAIN_RATIO+VAL_RATIO))]
print(X_val.shape)
print(y_val.shape)

val_dataset = tf.data.Dataset.from_tensor_slices((X_val, y_val))
val_dataset = val_dataset.shuffle(buffer_size = 8, reshuffle_each_iteration = True).batch(32).prefetch(tf.data.AUTOTUNE)

X_test = X[int(DATASET_SIZE*(TRAIN_RATIO+VAL_RATIO)):]
y_test = y[int(DATASET_SIZE*(TRAIN_RATIO+VAL_RATIO)):]
print(X_test.shape)
print(y_test.shape)

test_dataset = tf.data.Dataset.from_tensor_slices((X_test, y_test))
test_dataset = test_dataset.shuffle(buffer_size = 8, reshuffle_each_iteration = True).batch(32).prefetch(tf.data.AUTOTUNE)

normalizer = Normalization()
normalizer.adapt(X_train)
print(normalizer(X)[:5])

print(X[:5])

"""## **Model Creation and Training**"""

model = tf.keras.Sequential([
                             InputLayer(input_shape = (8,)),
                             normalizer,
                             Dense(128, activation = "relu"),
                             Dense(128, activation = "relu"),
                             Dense(128, activation = "relu"),
                             Dense(1),
])
print(model.summary())

tf.keras.utils.plot_model(model, to_file = "model.png", show_shapes=True)

model.compile(optimizer = Adam(learning_rate = 0.1),
              loss = MeanAbsoluteError(),
              metrics = RootMeanSquaredError())

history = model.fit(train_dataset, validation_data=val_dataset, epochs = 100, verbose = 1)

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'val_loss'])
plt.show()

plt.plot(history.history['root_mean_squared_error'])
plt.plot(history.history['val_root_mean_squared_error'])
plt.title('model performance')
plt.ylabel('rmse')
plt.xlabel('epoch')
plt.legend(['train', 'val'])
plt.show()

"""## **Model Evaluation and Testing**"""

model.evaluate(X_test,y_test)
print(X_test.shape)

model.predict(tf.expand_dims(X_test[0], axis = 0 ))

print(y_test[0])
y_true = list(y_test[:,0].numpy())

y_pred = list(model.predict(X_test)[:,0])
print(y_pred)

ind = np.arange(100)
plt.figure(figsize=(40,20))

width = 0.1

plt.bar(ind, y_pred, width, label='Predicted Car Price')
plt.bar(ind + width, y_true, width, label='Actual Car Price')

plt.xlabel('Actual vs Predicted Prices')
plt.ylabel('Car Price Prices')

plt.show()

Hi @Ata_Tekeli, could you please let us know the shape of your data? Thank you.

Which set matters here: the training, validation, test, or prediction set?

Hi @Ata_Tekeli, the shape of the data used for training. If possible, could you please share the dataset you are using via a shared drive? Thank you.

Training data shapes: (100, 8), (0, 1)

For dataset: SECOND HAND CARS DATA SET | REGRESSION | Kaggle

This is the first time I’ve run into this error. I looked around but couldn’t find anything meaningful.

Hi @Ata_Tekeli, the error occurs because X_val has shape (100, 8) while y_val has shape (0, 1). Calling val_dataset = tf.data.Dataset.from_tensor_slices((X_val, y_val)) fails on this shape mismatch, since both tensors must have the same size in their first dimension. Please refer to this gist for a working code example. Thank you.
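One plausible reason y_val ends up empty, judging only from the code in the original post: the inspection loop `for x,y in train_dataset:` rebinds the name `y` to a single batch of 32 labels, so the later slice `y[800:900]` falls past the end of that batch. The sketch below reproduces the effect with NumPy stand-ins. The row count of 1000 is an assumption inferred from the (100, 8) validation shape, the arrays are placeholders, and `labels_full` is a hypothetical name used only for illustration.

```python
import numpy as np

DATASET_SIZE = 1000  # assumed; inferred from the (100, 8) validation shape
TRAIN_RATIO, VAL_RATIO = 0.8, 0.1

X = np.zeros((DATASET_SIZE, 8), dtype=np.float32)  # placeholder features
y = np.zeros((DATASET_SIZE, 1), dtype=np.float32)  # placeholder labels
labels_full = y  # hypothetical name: keeps a reference to the full label array

val_start = int(DATASET_SIZE * TRAIN_RATIO)
val_end = int(DATASET_SIZE * (TRAIN_RATIO + VAL_RATIO))

# Stand-in for the inspection loop: `y` is rebound to one batch of 32 rows,
# while `X` (capital) is untouched because the loop variable is lowercase `x`.
for x, y in [(X[:32], labels_full[:32])]:
    break

X_val = X[val_start:val_end]
bad_y_val = y[val_start:val_end]             # slices the 32-row batch: empty
good_y_val = labels_full[val_start:val_end]  # slices the full labels

print(X_val.shape, bad_y_val.shape, good_y_val.shape)
```

Slicing past the end of an array does not raise an error; it silently returns zero rows, which is why the problem only surfaces later when the two shapes are combined.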

I’ll have a look and try to fix it from here. It was my first time running into this, and it has been a valuable lesson. If something goes wrong, I’ll let you know.

I got this at

X_val = X[int(DATASET_SIZE*TRAIN_RATIO):int(DATASET_SIZE*(TRAIN_RATIO+VAL_RATIO))]
y_val = y[int(DATASET_SIZE*TRAIN_RATIO):int(DATASET_SIZE*(TRAIN_RATIO+VAL_RATIO))]
print(X_val.shape)
print(y_val.shape)

Instead of (100, 8) and (100, 1), I got (100, 8) and (0, 1).

Worked really well, thank you

Hey brother, I have the same problem as you. Can you please explain how you got the shapes from (100, 8) and (0, 1) to (100, 8) and (100, 1)?
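
One likely way to get there, going by the code in the original post rather than the linked gist: give the batch-inspection loop variable names that do not clash with `X` and `y`, and build the validation dataset from the validation tensors. This is only a sketch under assumptions. The 1000-row size and the zero-filled tensors are placeholders, and `x_batch`, `y_batch`, and `train_end` are names chosen for illustration.

```python
import tensorflow as tf

DATASET_SIZE = 1000  # assumed row count
TRAIN_RATIO, VAL_RATIO = 0.8, 0.1

# Placeholder tensors standing in for the real features and labels.
X = tf.zeros((DATASET_SIZE, 8))
y = tf.zeros((DATASET_SIZE, 1))

train_end = int(DATASET_SIZE * TRAIN_RATIO)
val_end = int(DATASET_SIZE * (TRAIN_RATIO + VAL_RATIO))

X_train, y_train = X[:train_end], y[:train_end]
train_dataset = (
    tf.data.Dataset.from_tensor_slices((X_train, y_train))
    .shuffle(buffer_size=8, reshuffle_each_iteration=True)
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)

# Peek at one batch without overwriting the full label tensor `y`.
for x_batch, y_batch in train_dataset.take(1):
    print(x_batch.shape, y_batch.shape)

# `y` still holds every row, so the validation slice is not empty.
X_val, y_val = X[train_end:val_end], y[train_end:val_end]
print(X_val.shape, y_val.shape)

# Build the validation pipeline from the validation tensors themselves.
val_dataset = (
    tf.data.Dataset.from_tensor_slices((X_val, y_val))
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)
```

Printing the shapes right after each slice, as the original code already does, is a quick way to confirm that both tensors have the same number of rows before they are passed to from_tensor_slices.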