Hello!
I'm trying to replicate a TensorFlow 1 experiment (TF 1.9) in a TensorFlow 2.5 ecosystem, and I decided to use the Keras implementation provided in tf.keras.applications.mobilenet_v2.MobileNetV2.
The original experiment used the Google Research implementation of MobileNetV2.
Original TF1 MobileNetV2 source code:
(google-research/resolve_ref_exp_elements_ml/deeplab/mobilenet/mobilenet.py at master · google-research/google-research · GitHub)
As far as I can see, this implementation applies L2 regularization, set up in training_scope():
```python
def training_scope(is_training=True,
                   weight_decay=0.00004,
                   stddev=0.09,
                   dropout_keep_prob=0.8,
                   bn_decay=0.997):
  """Defines Mobilenet training scope.

  Usage:
     with tf.contrib.slim.arg_scope(mobilenet.training_scope()):
       logits, endpoints = mobilenet_v2.mobilenet(input_tensor)

     # the network created will be trainble with dropout/batch norm
     # initialized appropriately.
  Args:
    is_training: if set to False this will ensure that all customizations are
      set to non-training mode. This might be helpful for code that is reused
      across both training/evaluation, but most of the time training_scope with
      value False is not needed. If this is set to None, the parameters is not
      added to the batch_norm arg_scope.
    weight_decay: The weight decay to use for regularizing the model.
    stddev: Standard deviation for initialization, if negative uses xavier.
    dropout_keep_prob: dropout keep probability (not set if equals to None).
    bn_decay: decay for the batch norm moving averages (not set if equals to
      None).

  Returns:
    An argument scope to use via arg_scope.
  """
  # Note: do not introduce parameters that would change the inference
  # model here (for example whether to use bias), modify conv_def instead.
  batch_norm_params = {'decay': bn_decay, 'is_training': is_training}
  if stddev < 0:
    weight_intitializer = slim.initializers.xavier_initializer()
  else:
    weight_intitializer = tf.truncated_normal_initializer(stddev=stddev)

  # Set weight_decay for weights in Conv and FC layers.
  with slim.arg_scope(
      [slim.conv2d, slim.fully_connected, slim.separable_conv2d],
      weights_initializer=weight_intitializer,
      normalizer_fn=slim.batch_norm), \
      slim.arg_scope([mobilenet_base, mobilenet], is_training=is_training), \
      safe_arg_scope([slim.batch_norm], **batch_norm_params), \
      safe_arg_scope([slim.dropout], is_training=is_training,
                     keep_prob=dropout_keep_prob), \
      slim.arg_scope([slim.conv2d],
                     weights_regularizer=slim.l2_regularizer(weight_decay)), \
      slim.arg_scope([slim.separable_conv2d], weights_regularizer=None) as s:
    return s
```
I'm not very familiar with the slim module (and documentation for it is hard to find), but as far as I understand, every layer function decorated with @slim.add_arg_scope picks up the default arguments supplied by the enclosing arg_scope.
If I'm right, that would mean all slim.conv2d layers inside this scope apply L2 regularization to their weights (with the default weight_decay of 4e-5), while slim.separable_conv2d layers explicitly get no regularizer.
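To check that I'm reading the scope correctly, here is a minimal sketch of what I think a regular conv inside training_scope() amounts to, translated into Keras terms. The filter count and kernel size are made up for illustration; the stddev, weight_decay and bn_decay values are just slim's defaults, not something taken from the Keras source:

```python
import tensorflow as tf

weight_decay = 0.00004  # slim default in training_scope()
stddev = 0.09           # slim default initializer stddev

# My understanding: a slim.conv2d call inside training_scope() picks up
# weights_initializer, normalizer_fn=batch_norm (so no bias) and
# weights_regularizer=l2_regularizer(weight_decay) from the arg_scope.
# In Keras terms that would look roughly like:
conv = tf.keras.layers.Conv2D(
    filters=32,            # illustrative only
    kernel_size=3,         # illustrative only
    padding="same",
    use_bias=False,        # bias handled by the following batch norm
    kernel_initializer=tf.keras.initializers.TruncatedNormal(stddev=stddev),
    kernel_regularizer=tf.keras.regularizers.l2(weight_decay),
)
bn = tf.keras.layers.BatchNormalization(momentum=0.997)  # bn_decay default
```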
Looking at the source of the TF2 Keras implementation, no kernel regularizer is applied to any of its Conv2D layers:
(tensorflow/tensorflow/python/keras/applications/mobilenet_v2.py at v2.5.0 · tensorflow/tensorflow · GitHub)
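A quick way to confirm this from the built model (nothing here is specific to my setup; it just inspects the stock tf.keras.applications model):

```python
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights=None)

# List conv layers that carry a kernel regularizer; as far as I can tell
# from the source, this should come back empty.
regularized = [
    layer.name
    for layer in model.layers
    if isinstance(layer, tf.keras.layers.Conv2D)
    and layer.kernel_regularizer is not None
]
print(regularized)
print(len(model.losses))  # 0 -> no regularization losses registered either
```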
I would like to know whether these two implementations are really equivalent, and whether anyone knows the reason behind the absence of L2 regularization in the Keras implementation.
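In case it is useful context, this is roughly how I plan to retrofit the penalty onto the Keras model via add_loss. This is only a sketch of my plan, not something from either implementation; I skip depthwise kernels because the slim scope sets weights_regularizer=None for separable_conv2d, and I halve the coefficient because, if I read the two regularizers correctly, slim.l2_regularizer uses tf.nn.l2_loss (sum of squares divided by 2) while tf.keras.regularizers.l2 is a plain sum of squares times the coefficient:

```python
import tensorflow as tf

WEIGHT_DECAY = 0.00004  # slim's default weight_decay

model = tf.keras.applications.MobileNetV2(weights=None, classes=10)

# Attach an L2 penalty to every regular conv kernel, mimicking
# slim.l2_regularizer(weight_decay) on slim.conv2d. DepthwiseConv2D is
# skipped on purpose (slim sets weights_regularizer=None for
# separable_conv2d). The 0.5 accounts for tf.nn.l2_loss's half factor,
# assuming my reading of the two regularizers is right.
l2 = tf.keras.regularizers.l2(0.5 * WEIGHT_DECAY)
for layer in model.layers:
    if isinstance(layer, tf.keras.layers.Conv2D) and not isinstance(
            layer, tf.keras.layers.DepthwiseConv2D):
        model.add_loss(lambda k=layer.kernel: l2(k))

print(len(model.losses))  # one penalty per regular conv layer
```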
Thanks in advance!