How to reduce false positives and false negatives on the training set in deep learning

Hi,

I am training a deep neural network for a binary classification task on my dataset.

Below are my observations on both the train and test sets:

  1. For every true positive there are approximately 3 false positives.
  2. For approximately 4 true negatives there is 1 false negative.

Below is a sample epoch from the training log:

382/382 [==============================] - 3s 9ms/step - loss: 0.6897 - tp: 84096.0000 - fp: 244779.0000 - tn: 355888.0000 - fn: 97448.0000 - accuracy: 0.5625 - precision: 0.2557 - recall: 0.4632 - auc: 0.5407 - prc: 0.2722 - val_loss: 0.6838 - val_tp: 19065.0000 - val_fp: 56533.0000 - val_tn: 91902.0000 - val_fn: 23829.0000 - val_accuracy: 0.5800 - val_precision: 0.2522 - val_recall: 0.4445 - val_auc: 0.5468 - val_prc: 0.2722
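
As a sanity check, the logged counts reproduce the ratios above; this is plain arithmetic on the tp/fp/tn/fn values:

tp, fp, tn, fn = 84096, 244779, 355888, 97448  # from the training log above

precision = tp / (tp + fp)                  # ~0.256 -> ~3 false positives per true positive
recall = tp / (tp + fn)                     # ~0.463
tn_per_fn = tn / fn                         # ~3.65 -> ~4 true negatives per false negative
accuracy = (tp + tn) / (tp + fp + tn + fn)  # ~0.562, matching the logged accuracy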

Could an expert please help me understand what I can do to minimise misclassifications on both the train and test sets?

I am using an imbalanced dataset with class_weight, as shown in the code below:

import numpy as np
import tensorflow as tf
from tensorflow import keras

METRICS = [
    keras.metrics.TruePositives(name='tp'),
    keras.metrics.FalsePositives(name='fp'),
    keras.metrics.TrueNegatives(name='tn'),
    keras.metrics.FalseNegatives(name='fn'),
    keras.metrics.BinaryAccuracy(name='accuracy'),
    keras.metrics.Precision(name='precision'),
    keras.metrics.Recall(name='recall'),
    keras.metrics.AUC(name='auc'),
    keras.metrics.AUC(name='prc', curve='PR'),  # area under the precision-recall curve
]

# Class counts used to derive balanced class weights
pos = sum(y_train)            # number of positive samples
neg = y_train.shape[0] - pos  # number of negative samples
total = y_train.shape[0]

# Scale so that each class contributes equally to the total loss
weight_for_0 = (1 / neg) * (total / 2.0)
weight_for_1 = (1 / pos) * (total / 2.0)
class_weight = {0: weight_for_0, 1: weight_for_1}
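
# Sanity check (illustrative, not part of training): plugging in the
# per-epoch counts from the training log above, where positives = tp + fn
# and negatives = fp + tn:
log_pos = 84096 + 97448      # 181544 positive samples
log_neg = 244779 + 355888    # 600667 negative samples
log_total = log_pos + log_neg
print((1 / log_neg) * (log_total / 2.0))  # ~0.65 -> weight_for_0
print((1 / log_pos) * (log_total / 2.0))  # ~2.15 -> weight_for_1, ~3.3x weight_for_0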

def make_model(size, layers, metrics=METRICS, output_bias=None):
    if output_bias is not None:
        output_bias = tf.keras.initializers.Constant(output_bias)
    model = keras.Sequential()
    # Hidden layers; no activation is specified, so they are linear.
    # window_length and indicators are defined elsewhere in my script.
    model.add(keras.layers.Dense(size, input_shape=(window_length * indicators,)))
    model.add(keras.layers.Dropout(0.5))
    for i in range(layers - 1):
        model.add(keras.layers.Dense(size))
        model.add(keras.layers.Dropout(0.5))
    # Sigmoid output; the bias can be initialised to the log-odds of the positive class
    model.add(keras.layers.Dense(1, activation="sigmoid", bias_initializer=output_bias))
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=0.0001),
        loss=tf.keras.losses.BinaryCrossentropy(),
        metrics=metrics)
    return model


EPOCHS = 100
BATCH_SIZE = 2048

early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor='val_prc', 
    verbose=1,
    patience=10,
    mode='max',
    restore_best_weights=True)

model = make_model(size=size, layers=layers, output_bias=np.log(pos / neg))
history = model.fit(
    X_train, y_train,
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    callbacks=[early_stopping],
    validation_data=(X_test, y_test_pos),
    class_weight=class_weight,
)

Can somebody please help?

Can you share more about your dataset? What is your classification task?

Sorry, I cannot share much about the dataset features. The data is scaled with StandardScaler and then clipped to the range [-5, 5].
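
For reference, the preprocessing is roughly this (a sketch using scikit-learn's StandardScaler, fitted on the training split only):

import numpy as np
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # fit the scaling statistics on train only
X_test = scaler.transform(X_test)        # reuse the train statistics on test

# Clip extreme values, as described above
X_train = np.clip(X_train, -5, 5)
X_test = np.clip(X_test, -5, 5)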

I would like to know what the possible reasons for this could be.

Even if you can't share anything else, you could try to double-check that the sample/label pairs you pass to the network are correct, especially for the specific misclassified examples.
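
A minimal sketch of that check (assuming X_train and y_train are NumPy arrays as in your code; adapt the names to your setup):

import numpy as np

# Score the training set and collect the misclassified indices
probs = model.predict(X_train, batch_size=2048).ravel()
preds = (probs > 0.5).astype(int)
wrong = np.where(preds != np.asarray(y_train).ravel())[0]
print(f"{len(wrong)} misclassified training samples")

# Manually inspect a handful to verify the sample/label pairing
for i in wrong[:10]:
    print(i, "label:", y_train[i], "predicted prob:", probs[i])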

Then you could try to expand your model capacity until it is able to overfit your dataset.
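
For example (a rough sketch reusing your make_model; the sizes here are arbitrary), train progressively larger models and check whether the training loss can be driven close to zero:

# Grow capacity until the model can overfit the training data.
# If even the largest model cannot, the features may carry too little signal.
for size, layers in [(64, 2), (256, 3), (1024, 4)]:
    m = make_model(size=size, layers=layers)
    h = m.fit(X_train, y_train, batch_size=2048, epochs=100, verbose=0)
    print(size, layers, "final train loss:", h.history["loss"][-1])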