I’m wondering why all Keras layers use Glorot initialization as the default. Since ReLU is the most popular activation function, shouldn’t He be the default initialization?
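For what it’s worth, you can already override the default on a per-layer basis via the `kernel_initializer` argument; a minimal sketch (layer sizes and shapes below are just placeholders):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(784,)),
    # Override the "glorot_uniform" default with He initialization for the ReLU layer
    layers.Dense(256, activation="relu", kernel_initializer="he_normal"),
    layers.Dense(10, activation="softmax"),
])
```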
The prebuilt application models, such as ResNet50, also use Glorot initialization by default, and there is no parameter you can pass to change it.
Exactly! In my case I’m using the default ResNet50, trained from scratch, and the network is training and converging. My inputs have an arbitrary number of channels, which is why I cannot use the ImageNet weights. However, I’m wondering whether initialization with the He method would improve the results. I noticed a big difference in overfitting from run to run depending on the initial weights of each run.
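To be concrete, my setup looks roughly like this (the spatial size, channel count, and class count below are just placeholders, not my actual values):

```python
from tensorflow.keras.applications import ResNet50

model = ResNet50(
    weights=None,               # training from scratch, so no ImageNet weights
    input_shape=(224, 224, 8),  # arbitrary number of input channels (8 is a placeholder)
    classes=10,                 # placeholder number of output classes
)
```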
Interesting, I wonder how they trained the VGG19 in keras.applications.
Here it is in mid 2016:
It’s probably one of those things that got set at one point when it made sense and then got locked in by backwards compatibility guarantees.
Aside from updating keras.applications to accept initializers as arguments, another possible solution would be for Keras to implement a global “default_initializer” or something like that. Either one would take some work.
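In the meantime, a workaround along these lines should be possible with the tf.keras API (sketch only, untested; the input shape and class count are placeholders): build the model as usual, then re-assign every kernel with a He initializer after construction.

```python
import tensorflow as tf
from tensorflow.keras.applications import ResNet50

model = ResNet50(weights=None, input_shape=(224, 224, 3), classes=10)

# Re-initialize all conv/dense kernels with He initialization after the fact,
# since keras.applications does not expose an initializer argument.
he_init = tf.keras.initializers.HeNormal()
for layer in model.layers:
    # Only conv/dense layers carry a `kernel`; skip BatchNorm, pooling, etc.
    if hasattr(layer, "kernel") and layer.kernel is not None:
        layer.kernel.assign(
            he_init(shape=layer.kernel.shape, dtype=layer.kernel.dtype))
```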