Why is training a model with input (Batch, 90, 7) slower than with (Batch, 90, 8)?

I am using TensorFlow with the following model:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense

model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(90, 7)))
model.add(LSTM(100, return_sequences=True))
model.add(LSTM(70, return_sequences=True))
model.add(LSTM(70, return_sequences=False))
model.add(Dense(20, activation='relu'))
model.add(Dense(22, activation='softmax'))

When the first layer is:
model.add(Masking(mask_value=0.0, input_shape=(90, 7)))
the training time (for each epoch) is slower than when the first layer is:
model.add(Masking(mask_value=0.0, input_shape=(90, 8)))

It seems that when the input is larger, the GPU handles it faster.

  1. Why is that?
  2. Is it better to use model.add(Masking(mask_value=0.0, input_shape=(90, 8))) and pad the last dimension of the input with zeros?

Hi @SAL,

Apologies for the late reply.

GPUs are generally more efficient with larger and, in particular, better-aligned input sizes, because they perform many operations in parallel; a feature dimension of 8 tends to map onto the GPU's memory-access patterns and matrix-multiply kernels better than 7, so the padded input can run faster per step. (Larger batch sizes can also stabilise gradients and improve convergence, but that is a separate effect from the feature-dimension difference you are seeing.)
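If you want to verify the difference on your own hardware, here is a minimal timing sketch. Only the layer sizes mirror your model; the random data, sample count, and batch size are arbitrary placeholders, so treat the absolute numbers as illustrative only:

import time
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense

def build_model(n_features):
    # Same architecture as above, parameterised by the feature dimension.
    model = Sequential([
        Masking(mask_value=0.0, input_shape=(90, n_features)),
        LSTM(100, return_sequences=True),
        LSTM(70, return_sequences=True),
        LSTM(70, return_sequences=False),
        Dense(20, activation='relu'),
        Dense(22, activation='softmax'),
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
    return model

def time_one_epoch(n_features, n_samples=2048, batch_size=64):
    # Random placeholder data just for timing; labels are the 22 classes.
    x = np.random.rand(n_samples, 90, n_features).astype('float32')
    y = np.random.randint(0, 22, size=(n_samples,))
    model = build_model(n_features)
    model.fit(x, y, batch_size=batch_size, epochs=1, verbose=0)  # warm-up epoch
    start = time.time()
    model.fit(x, y, batch_size=batch_size, epochs=1, verbose=0)
    return time.time() - start

print('7 features:', time_one_epoch(7), 's per epoch')
print('8 features:', time_one_epoch(8), 's per epoch')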

Yes, you can use model.add(Masking(mask_value=0.0, input_shape=(90, 8))) and zero-pad the last feature dimension of your inputs, but make sure the change doesn't hurt training by comparing the validation loss against the original setup.
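The padding itself is just a zero column appended to each timestep. A minimal sketch with placeholder data (x_train stands in for your real training array of shape (num_samples, 90, 7)):

import numpy as np

x_train = np.random.rand(1000, 90, 7).astype('float32')  # placeholder data

# Pad only the last axis: (before, after) = (0, 1) appends one all-zero feature.
x_train_padded = np.pad(x_train, pad_width=((0, 0), (0, 0), (0, 1)),
                        mode='constant', constant_values=0.0)
print(x_train_padded.shape)  # (1000, 90, 8)

Note that the extra zero column does not change which timesteps the Masking layer skips: a timestep is masked only when all of its features equal mask_value, so timesteps that were all-zero in the 7-feature input stay all-zero (and masked) after padding, and non-zero timesteps stay unmasked.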

Thank You.