CNN net much slower than DNN

guen_gn · August 20, 2021, 1:05am

I am a newbie and have been going through the examples in the book with ISBN 9781492032649.
The example on page 297 uses 28x28 Fashion MNIST images to train a neural network classifier (Example 1, pasted below). It took about 167 seconds for 30 epochs on a Radeon GPU.

But the same images train far more slowly on the CNN (Example 2). I thought a CNN would be much faster and more efficient?

EXAMPLE1:

Using neural net to do a classification task.

import tensorflow as tf
import pandas as pd
import matplotlib.pyplot as plt

from tensorflow import keras

print(tf.__version__)
print(keras.__version__)

CONFIG_ENABLE_PLOT = 0

fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
print("X_train_full.shape: ", X_train_full.shape)
print("X_train_full.dtype: ", X_train_full.dtype)

# Hold out the first 5000 images for validation and scale pixels to [0, 1].
X_valid, X_train = X_train_full[:5000] / 255.0, X_train_full[5000:] / 255.0
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
X_test = X_test / 255.0
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28, 28]))
model.add(keras.layers.Dense(300, activation="relu"))
model.add(keras.layers.Dense(100, activation="relu"))
# One output unit per class (Fashion MNIST has 10 classes).
model.add(keras.layers.Dense(10, activation="softmax"))

# summary() prints the table itself and returns None.
model.summary()

model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])
history = model.fit(X_train, y_train, epochs=30, validation_data=(X_valid, y_valid))

pd.DataFrame(history.history).plot(figsize=(8, 5))

if CONFIG_ENABLE_PLOT:
    plt.grid(True)
    plt.gca().set_ylim(0, 1)
    plt.show()

model.evaluate(X_test, y_test)

print("model layers: ", model.layers)
weights, biases = model.layers[1].get_weights()
print("weights, biases (shapes): ", weights, biases, weights.shape, biases.shape)
model.save("p297.h5")
X_new = X_test[:3]
y_proba = model.predict(X_new)
print(y_proba.round(2))

# predict_classes() was removed from recent TensorFlow releases;
# taking the argmax of the predicted probabilities is equivalent.
y_pred = y_proba.argmax(axis=-1)
print("y_pred: ", y_pred)

EXAMPLE2:

Using CNN to do a classification task.

import tensorflow as tf
import pandas as pd
import matplotlib.pyplot as plt
import time

from tensorflow import keras

print(tf.__version__)
print(keras.__version__)

CONFIG_ENABLE_PLOT = 0

fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()

# Conv2D expects a trailing channel dimension.
X_train_full = X_train_full.reshape(-1, 28, 28, 1)
print("X_train_full.shape: ", X_train_full.shape)
print("X_train_full.dtype: ", X_train_full.dtype)

# Hold out the first 5000 images for validation and scale pixels to [0, 1].
X_valid, X_train = X_train_full[:5000] / 255.0, X_train_full[5000:] / 255.0
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
print("X_test shape: ", X_test.shape)
X_test = X_test.reshape(-1, 28, 28, 1)
X_test = X_test / 255.0
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Dense model from Example 1, disabled here:
# model = keras.models.Sequential()
# model.add(keras.layers.Flatten(input_shape=[28, 28]))
# model.add(keras.layers.Dense(300, activation="relu"))
# model.add(keras.layers.Dense(100, activation="relu"))
# model.add(keras.layers.Dense(30, activation="softmax"))

model = keras.models.Sequential([
    keras.layers.Conv2D(64, 7, activation="relu", padding="same", input_shape=[28, 28, 1]),
    keras.layers.MaxPooling2D(2),
    keras.layers.Conv2D(128, 3, activation="relu", padding="same"),
    keras.layers.Conv2D(128, 3, activation="relu", padding="same"),
    keras.layers.MaxPooling2D(2),
    keras.layers.Conv2D(256, 3, activation="relu", padding="same"),
    keras.layers.Conv2D(256, 3, activation="relu", padding="same"),
    keras.layers.MaxPooling2D(2),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation="softmax"),
])

# summary() prints the table itself and returns None.
model.summary()

model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])
history = model.fit(X_train, y_train, epochs=10, validation_data=(X_valid, y_valid))

pd.DataFrame(history.history).plot(figsize=(8, 5))

if CONFIG_ENABLE_PLOT:
    plt.grid(True)
    plt.gca().set_ylim(0, 1)
    plt.show()

model.evaluate(X_test, y_test)

print("model layers: ", model.layers)
# layers[1] is a MaxPooling2D layer with no weights; layers[0] is the first Conv2D.
weights, biases = model.layers[0].get_weights()
print("weights, biases (shapes): ", weights, biases, weights.shape, biases.shape)
model.save("p297.h5")
X_new = X_test[:3]
y_proba = model.predict(X_new)
print(y_proba.round(2))

# predict_classes() was removed from recent TensorFlow releases;
# taking the argmax of the predicted probabilities is equivalent.
y_pred = y_proba.argmax(axis=-1)
print("y_pred: ", y_pred)

rsampath · August 20, 2021, 4:36am

It’s running on the CPU, and that’s why it’s slow. Deep learning frameworks don’t support AMD Radeon GPUs; they only work on NVIDIA GPUs. This is the same for both TensorFlow and PyTorch.

Because the network is training on the CPU, deep models (with many convolutional layers, as here) will be slower than shallow networks.

I would suggest you try this in Google’s Colab environment. Choose Runtime → Change runtime type and select GPU at https://colab.research.google.com/
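Whether training is really falling back to the CPU can be checked directly rather than guessed. The snippet below is a minimal sketch: `describe_tf_devices` is a hypothetical helper name, not something from the book or from TensorFlow, and it relies only on `tf.config.list_physical_devices`.

```python
import importlib.util


def describe_tf_devices():
    """Return a one-line summary of the GPUs TensorFlow can see.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return "TensorFlow is not installed in this environment"

    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    if not gpus:
        return "No GPU visible to TensorFlow; training will run on the CPU"
    names = ", ".join(gpu.name for gpu in gpus)
    return f"{len(gpus)} GPU(s) visible to TensorFlow: {names}"


print(describe_tf_devices())
```

If no GPU is listed, both examples are running on the CPU regardless of which TensorFlow package is installed.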

Bhack · August 20, 2021, 11:00am

It is supported; see the tensorflow-rocm package on PyPI.

rsampath · August 20, 2021, 11:53am

Thanks @Bhack. I didn’t realize that Radeon GPUs are supported. @guen_gn, did you install TensorFlow through this package, tensorflow-rocm on PyPI?

Also, is your OS one of those listed as supported for AMD GPUs? https://rocmdocs.amd.com/en/latest/Current_Release_Notes/Current-Release-Notes.html#list-of-supported-operating-systems

guen_gn · August 20, 2021, 3:10pm

Yes, tensorflow-rocm is installed.

guen_gn · October 3, 2021, 9:35am

Radeon GPU support has been ruled out as the cause. Any other ideas?

Bhack · October 3, 2021, 9:49am

Have you compared the FLOPs of the two models? See:

github.com/tensorflow/tensorflow

TF 2.0 Feature: Flops calculation

Opened 25 Sep 2019 by pzobel, closed 18 Jun 2022.

The issue asks for a way to compute the number of floating point operations of a tf.keras Model in TF 2.0. In TF 1.x, tf.profiler was available (https://stackoverflow.com/questions/45085938), but the author could not find an equivalent for TF 2.0.

guen_gn · October 5, 2021, 4:00am

How do I do that?
Do I just pass the model to this: get_flops(model)?

Bhack · October 5, 2021, 1:23pm

E.g. you can use something like:

github.com/tensorflow/tensorflow

TF 2.0 Feature: Flops calculation

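As a rough cross-check that needs no profiler, the multiply-accumulate operations (MACs) per image can be counted by hand from the layer shapes in the two examples. This is only a sketch under stated assumptions: the helper names `conv2d_macs` and `dense_macs` are made up for illustration, the dense model is assumed to end in a 10-unit output layer, and biases, activations, pooling and dropout are ignored.

```python
def conv2d_macs(height, width, in_channels, filters, kernel):
    """MACs for a stride-1, 'same'-padded Conv2D layer.

    Each of the height * width * filters output values is a dot product
    over a kernel * kernel * in_channels window.
    """
    return height * width * filters * kernel * kernel * in_channels


def dense_macs(inputs, units):
    """MACs for a fully connected layer: one multiply per weight."""
    return inputs * units


# Example 1: Flatten -> Dense(300) -> Dense(100) -> Dense(10)
dnn_macs = dense_macs(28 * 28, 300) + dense_macs(300, 100) + dense_macs(100, 10)

# Example 2: each MaxPooling2D(2) halves the spatial size (28 -> 14 -> 7 -> 3).
cnn_macs = (
    conv2d_macs(28, 28, 1, 64, 7)
    + conv2d_macs(14, 14, 64, 128, 3)
    + conv2d_macs(14, 14, 128, 128, 3)
    + conv2d_macs(7, 7, 128, 256, 3)
    + conv2d_macs(7, 7, 256, 256, 3)
    + dense_macs(3 * 3 * 256, 128)
    + dense_macs(128, 64)
    + dense_macs(64, 10)
)

print(f"Dense model: {dnn_macs:,} MACs per image")
print(f"CNN:         {cnn_macs:,} MACs per image")
print(f"Ratio:       {cnn_macs / dnn_macs:.0f}x")
```

By this count the CNN needs roughly 336 times as many multiply-accumulates per image as the dense network, so a much longer epoch time is expected even when the GPU is being used. A CNN shares weights across positions, but it reuses them at every spatial location, so fewer weights per layer does not mean less compute.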

guen_gn · October 30, 2021, 5:54am

Any other ideas on this?
