Need help with Traffic Sign Detection training

brown_sloth · January 30, 2022, 11:05am

Hi Folks.
I have some queries specific to the TF Object Detection API.
Basically, I am trying to fine-tune an SSD MobileNet V2 model from the TF OD API model zoo to detect traffic signs. I first fine-tuned on GTSDB (around 500 samples), and the resulting model wasn't very good.

Then I tried training on an augmented version of GTSDB (4500 samples, created using rotation, translation, shearing, etc.). This time, after about 9k iterations, the validation loss starts increasing and never comes back down.

I am assuming that implies overfitting, which I think could be due to:

Train and eval data being very different – I checked, and this isn't the case
Learning rate too high – I reduced it by a factor of 10 and overfitting still occurs
Model might be too complex – the same model trained properly on the un-augmented GTSDB with only 500 training samples, so it shouldn't be too complex for the augmented GTSDB, which has around 4500 samples
Augmented dataset might not have been created properly – I converted all the annotated images into a video and checked it manually; the dataset seems fine

I am trying to think of other reasons and would appreciate any help in that regard.
Note: I used the imgaug library for data augmentation.
For reference, I have attached my loss curves and config file.

Config file:

model {
  ssd {
    inplace_batchnorm_update: true
    freeze_batchnorm: false
    num_classes: 9
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
        use_matmul_gather: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    encode_background_as_zeros: true
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        class_prediction_bias_init: -4.6
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            random_normal_initializer {
              stddev: 0.01
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.97,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v2_keras'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.97,
          epsilon: 0.001,
        }
      }
      override_base_feature_extractor_hyperparams: true
    }
    loss {
      classification_loss {
        weighted_sigmoid_focal {
          alpha: 0.75,
          gamma: 2.0
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          delta: 1.0
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    normalize_loc_loss_by_codesize: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  fine_tune_checkpoint_version: V2
  fine_tune_checkpoint: "./sprint1_ssd_mobilenetv2_try2/pretrained_model/mobilnetv2/checkpoint/ckpt-0"
  fine_tune_checkpoint_type: "detection"
  batch_size: 24
  sync_replicas: true
  startup_delay_steps: 0
  replicas_to_aggregate: 8
  num_steps: 50000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: .0008
          total_steps: 50000
          warmup_learning_rate: 0.00013333
          warmup_steps: 1000
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
}

train_input_reader: {
  label_map_path: "./sprint1_ssd_mobilenetv2_try2/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "./sprint1_ssd_mobilenetv2_try2/gtsdb_stop_train_9.record"
  }
}

eval_config: {
  metrics_set: "coco_detection_metrics"
  use_moving_averages: false
  batch_size: 1
}

eval_input_reader: {
  label_map_path: "./sprint1_ssd_mobilenetv2_try2/label_map.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "./sprint1_ssd_mobilenetv2_try2/gtsdb_stop_val_9.record"
  }
}
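
For anyone reading the schedule in the config above, here is a minimal sketch of what `cosine_decay_learning_rate` with warmup works out to. It assumes a linear warmup followed by a half-cosine decay to zero, which is my reading of the OD API schedule and not copied from its source. The helper name `lr_at` is made up for illustration; the numbers are the ones from `train_config`.

```python
import math

# Values copied from the train_config above.
BASE_LR = 0.0008
WARMUP_LR = 0.00013333
WARMUP_STEPS = 1000
TOTAL_STEPS = 50000


def lr_at(step: int) -> float:
    """Approximate learning rate at a given step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear ramp from the warmup rate up to the base rate.
        slope = (BASE_LR - WARMUP_LR) / WARMUP_STEPS
        return WARMUP_LR + slope * step
    # Half-cosine decay from the base rate down to zero at TOTAL_STEPS.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * BASE_LR * (1.0 + math.cos(math.pi * min(progress, 1.0)))


if __name__ == "__main__":
    for step in (0, 1000, 9000, 25000, 50000):
        print(f"step {step:>6}: lr = {lr_at(step):.6f}")
```

Under that assumption the rate at step 9k, where the validation loss turns upward, is still above 90% of its peak, so the schedule itself has barely started decaying at that point.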

Bhack · January 31, 2022, 1:13pm

Have you tried adding a second validation set, taken from the same non-augmented data, when you train on the augmented version?

brown_sloth · January 31, 2022, 1:31pm

I augmented GTSDB in one go and then split it 90/10 into train/test with random shuffling. Is that what you are asking?

Bhack · January 31, 2022, 2:01pm

Yes. Can you try a different split, e.g. 70/30? Have you shuffled your data correctly?
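
On the shuffling point, one thing worth ruling out is augmented copies of the same original image landing on both sides of the split. Below is a minimal sketch of a split that shuffles by source image instead of by augmented sample. It assumes each sample carries the ID of the original image it was generated from; the `source_id` field and the `split_by_source` helper are hypothetical and not part of the OD API or imgaug.

```python
import random
from collections import defaultdict


def split_by_source(samples, eval_fraction=0.3, seed=0):
    """Split samples so all variants of one source image stay together.

    `samples` is a list of dicts with a hypothetical "source_id" key.
    """
    groups = defaultdict(list)
    for sample in samples:
        groups[sample["source_id"]].append(sample)

    source_ids = sorted(groups)
    random.Random(seed).shuffle(source_ids)  # shuffle originals, not copies

    n_eval = max(1, round(len(source_ids) * eval_fraction))
    eval_ids = set(source_ids[:n_eval])

    train = [s for sid in source_ids if sid not in eval_ids for s in groups[sid]]
    evaluation = [s for sid in source_ids if sid in eval_ids for s in groups[sid]]
    return train, evaluation


if __name__ == "__main__":
    # Toy data: 10 originals, each with 9 augmented variants.
    samples = [
        {"source_id": f"img_{i:03d}", "variant": v}
        for i in range(10)
        for v in range(9)
    ]
    train, evaluation = split_by_source(samples, eval_fraction=0.3)
    print(len(train), len(evaluation))
```

If the validation loss behaves differently with a split like this, that would point at the split and not at the model.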

brown_sloth · January 31, 2022, 2:16pm

I tried 80/20 instead of 90/10 but didn't see much difference in the loss curves.
As for shuffling across the test and train sets, I checked the number of samples per class for the 9 classes:

For test set:

{'0': 96,
 '1': 81,
 '2': 89,
 '3': 96,
 '4': 97,
 '5': 104,
 '6': 102,
 '7': 125,
 '8': 148}

For train set:

{'0': 406,
 '1': 319,
 '2': 316,
 '3': 352,
 '4': 443,
 '5': 408,
 '6': 516,
 '7': 454,
 '8': 512}

Seems appropriately shuffled to me; I am not sure if there are other things I should analyse.
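
To put numbers on that, here is a small sketch that turns the counts above into per-class shares and compares them across the two sets. The counts are the ones listed above; the helper name `class_shares` is just for illustration.

```python
# Per-class counts copied from the two listings above.
test_counts = {"0": 96, "1": 81, "2": 89, "3": 96, "4": 97,
               "5": 104, "6": 102, "7": 125, "8": 148}
train_counts = {"0": 406, "1": 319, "2": 316, "3": 352, "4": 443,
                "5": 408, "6": 516, "7": 454, "8": 512}


def class_shares(counts):
    """Return each class's fraction of the total count."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    test_shares = class_shares(test_counts)
    train_shares = class_shares(train_counts)
    for label in sorted(train_counts):
        diff = test_shares[label] - train_shares[label]
        print(f"class {label}: train {train_shares[label]:.3f}, "
              f"test {test_shares[label]:.3f}, diff {diff:+.3f}")
    n_test = sum(test_counts.values())
    n_all = n_test + sum(train_counts.values())
    print(f"test fraction of all counted samples: {n_test / n_all:.3f}")
```

This only checks label balance. It says nothing about whether the source images or the augmentation parameters are balanced across the two sets.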
P.S.: The annotation counts per class for the original un-augmented dataset:

Bhack · January 31, 2022, 2:26pm

Can you test the performance on the non-augmented training set?

brown_sloth · January 31, 2022, 4:29pm

I tested on around 50 samples from the original un-augmented data, and the model detects most of the signs properly, at least the ones clearly visible to the naked eye. Confidence is usually 90% or above, whereas my earlier models trained on the original dataset almost never exceeded 50%.

Bhack · January 31, 2022, 4:59pm

Have you tried reducing your augmentation variance?

E.g. you could start by creating a first train/val dataset with narrower augmentation hyperparameter ranges.

If it works well, you could try extending the ranges a bit, and so on.
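
A rough sketch of what I mean by narrowing the ranges and then widening them in stages. The parameter names and the full ranges below are made up for illustration and are not from your setup; in imgaug they would roughly correspond to the ranges you pass to an affine augmenter.

```python
# Hypothetical full ranges as (low, high) around a neutral value.
FULL_RANGES = {
    "rotate_deg": (-30.0, 30.0),
    "shear_deg": (-15.0, 15.0),
    "translate_frac": (-0.2, 0.2),
    "scale": (0.7, 1.3),
}
# The value at which each transform leaves the image unchanged.
NEUTRAL = {"rotate_deg": 0.0, "shear_deg": 0.0, "translate_frac": 0.0, "scale": 1.0}


def shrink_ranges(full, neutral, strength):
    """Scale each range toward its neutral value; strength is in [0, 1]."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return {
        name: (
            neutral[name] + strength * (low - neutral[name]),
            neutral[name] + strength * (high - neutral[name]),
        )
        for name, (low, high) in full.items()
    }


if __name__ == "__main__":
    # Build one dataset per stage, starting narrow and widening.
    for strength in (0.25, 0.5, 1.0):
        print(strength, shrink_ranges(FULL_RANGES, NEUTRAL, strength))
```

The idea would be to train at the narrowest stage first and only move to the next one if the validation loss stays stable.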

brown_sloth · January 31, 2022, 5:42pm

That's what I have been trying all day today (I removed certain augmentations like cropping and reduced others like the rotation angle). During the current training, the model's accuracy on the training set is increasing fast, but not as fast as when I first trained it. So I guess these changes have slowed down or delayed the overfitting, but I fear it's still going to occur.

I’ll still wait for the current training to complete before testing the model.

Thanks.

Bhack · January 31, 2022, 6:01pm

You really need to check that you have sampled uniformly between train and eval, so that both cover the same range of augmentation hyperparameters.

I also think that a 500-sample dataset could be a little too small, even with augmentation.
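
For the first point, one way to check it is to compare the range and mean of each augmentation parameter between the two sets. The sketch below assumes you logged the parameters used for each generated sample; the `rotate_deg` key and both helper names are hypothetical.

```python
from statistics import mean


def param_summary(samples, key):
    """Min, max and mean of one logged augmentation parameter."""
    values = [s[key] for s in samples]
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def coverage_gap(train, evaluation, key):
    """How far the eval range extends beyond the train range (0.0 if none)."""
    t = param_summary(train, key)
    e = param_summary(evaluation, key)
    below = max(0.0, t["min"] - e["min"])
    above = max(0.0, e["max"] - t["max"])
    return below + above


if __name__ == "__main__":
    # Toy logs: eval rotations reach further than anything seen in training.
    train = [{"rotate_deg": r} for r in (-10.0, -5.0, 0.0, 5.0, 10.0)]
    evaluation = [{"rotate_deg": r} for r in (-20.0, 0.0, 12.0)]
    print(param_summary(train, "rotate_deg"))
    print(param_summary(evaluation, "rotate_deg"))
    print(coverage_gap(train, evaluation, "rotate_deg"))
```

A non-zero gap means the eval set contains transformations stronger than anything in the train set, which could inflate the validation loss independently of overfitting.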
