How to improve accuracy of a CNN_LSTM binary classifier in TF 2.4

granth_jain · October 1, 2021, 12:09pm

I am trying to build a CNN-LSTM classifier for 1D sequential data. Each input is a sequence of length 20 with 4 features per step.

I have trained the model and saved it. However, I am unable to get good performance on either the training or the test data.

Below is my code for the TensorFlow model.

import tensorflow as tf

# X_tf, y_tf are the training arrays and X_tf_, y_tf_ the validation arrays; inputs have shape (n, 20, 4).
model = tf.keras.Sequential()
# Three stacked Conv1D layers; padding='same' keeps the sequence length at 20.
model.add(tf.keras.layers.Conv1D(filters=128, kernel_size=8, padding='same', activation='relu', input_shape=(20, 4)))
model.add(tf.keras.layers.Conv1D(filters=128, kernel_size=5, padding='same', activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(tf.keras.layers.Conv1D(filters=128, kernel_size=3, padding='same', activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
# Pooling halves the time axis from 20 to 10 steps.
model.add(tf.keras.layers.MaxPooling1D(pool_size=2))
# The LSTM returns only its final output, which feeds the sigmoid unit.
model.add(tf.keras.layers.LSTM(units=128))
model.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# input_shape on the first layer already builds the model, so no explicit build() call is needed.
model.summary()
history = model.fit(X_tf, y_tf, epochs=60, batch_size=256, validation_data=(X_tf_, y_tf_))

Here are the logs that I am getting while training.

Epoch 5/60 19739/19739 [==============================] - 1212s 61ms/step - loss: 0.5858 - accuracy: 0.7055 - val_loss: 0.5854 - val_accuracy: 0.7062

How can I further improve the performance? What techniques can I apply to sequential data?

My training dataset has 4.8 million rows and test set has 1.2 million rows.
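
Whether an accuracy of about 0.70 is good depends on the class ratio, which the post does not state. A quick sanity check is to compare it against the accuracy of always predicting the majority class. The sketch below is a minimal illustration under that assumption; `majority_baseline_accuracy` and the 70/30 toy labels are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Toy labels with a 70/30 split (an assumption, not the real data).
y_example = [0] * 7 + [1] * 3
baseline = majority_baseline_accuracy(y_example)
print(baseline)
```

If the baseline on the real labels is close to the logged accuracy, the model may be doing little more than predicting the majority class, and accuracy alone is not a useful metric for this problem.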

Ekaterina_Dranitsyna · October 2, 2021, 8:46am

You can make the model bigger: add more LSTM layers, increase the number of units in the layers, make them bidirectional, add dense layers with activations after the last LSTM, or experiment with other architectures.
Another option is to change the number of epochs, the batch size, and the learning rate and see how that affects the results.
If nothing helps, check for class imbalance and how both classes are distributed between the train and validation sets. Apply some basic techniques for imbalanced data, such as using sample weights and generating synthetic data for the underrepresented class.
Add more features if possible, or generate new features from the existing ones.
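
As one possible way to apply the weighting suggestion, class weights can be derived from the label counts with the common "balanced" heuristic, weight = n_samples / (n_classes * class_count). This is a sketch only: `balanced_class_weights` is a hypothetical helper and the 90/10 toy labels are an assumption, not the real data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency ("balanced" heuristic)."""
    counts = Counter(int(label) for label in labels)
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Toy labels with a 90/10 imbalance (an assumption for illustration).
y_example = [0] * 90 + [1] * 10
weights = balanced_class_weights(y_example)
print(weights)
```

The resulting dictionary can be passed to Keras through the `class_weight` argument of `model.fit`, for example `model.fit(X_tf, y_tf, class_weight=weights, ...)`, which scales each sample's loss by the weight of its class.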

granth_jain · November 18, 2021, 2:17pm

I have tried balancing the dataset as well as increasing the size and depth of the model. However, I have not had any success with this problem.

I have posted the data set on kaggle here: https://www.kaggle.com/jgranth/binary-seqclassification-of-input-shape209

Can someone please help me solve this problem with good accuracy?

The problems I am facing:

Without resampling: the model predicts the majority class almost exclusively and very rarely predicts the minority class.
With resampling: I get many false positives on both the train and test data.
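
One thing that may help with the false positives after resampling is to move the decision threshold away from the default 0.5 and look at precision and recall instead of accuracy. The following is a minimal sketch under that assumption; `precision_recall_at` is a hypothetical helper, and the labels and probabilities are made-up toy values standing in for the output of `model.predict`.

```python
def precision_recall_at(y_true, y_prob, threshold):
    """Precision and recall of the positive class at a given threshold."""
    tp = fp = fn = 0
    for truth, prob in zip(y_true, y_prob):
        pred = 1 if prob >= threshold else 0
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels and predicted probabilities (assumptions, not real results).
y_true = [0, 0, 0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.4, 0.55, 0.6, 0.65, 0.8, 0.3, 0.45]

for threshold in (0.5, 0.62):
    print(threshold, precision_recall_at(y_true, y_prob, threshold))
```

On this toy data, raising the threshold removes the two false positives without losing any true positives. On real data the trade-off is usually less clean, so the threshold is best chosen on a validation set that keeps the original, unresampled class ratio.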
