Sorry, I am new to deep learning and Keras. I am trying to define a layer myself. I looked into the Keras documentation on the base Layer class:
    import tensorflow as tf
    from tensorflow.keras.layers import Layer

    class SimpleDense(Layer):

        def __init__(self, units=32):
            super(SimpleDense, self).__init__()
            self.units = units

        def build(self, input_shape):  # Create the state of the layer (weights)
            w_init = tf.random_normal_initializer()
            self.w = tf.Variable(
                initial_value=w_init(shape=(input_shape[-1], self.units),
                                     dtype='float32'),
                trainable=True)
            b_init = tf.zeros_initializer()
            self.b = tf.Variable(
                initial_value=b_init(shape=(self.units,), dtype='float32'),
                trainable=True)

        def call(self, inputs):  # Defines the computation from inputs to outputs
            return tf.matmul(inputs, self.w) + self.b

    # Instantiates the layer.
    linear_layer = SimpleDense(4)
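Here is how I currently apply the layer (my own minimal usage sketch; the input shape (2, 5) is just an arbitrary example I made up, and I repeat the class definition so the snippet runs on its own):

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer

# Minimal copy of the layer above so this snippet is self-contained.
class SimpleDense(Layer):
    def __init__(self, units=32):
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(
            initial_value=w_init(shape=(input_shape[-1], self.units),
                                 dtype='float32'),
            trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(
            initial_value=b_init(shape=(self.units,), dtype='float32'),
            trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

linear_layer = SimpleDense(4)
y = linear_layer(tf.ones((2, 5)))  # applying the layer to an input tensor
print(y.shape)  # (2, 4): last dim is units=4
```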
I understand that when I create linear_layer, the __init__ method is called, and when I pass inputs into linear_layer, the call method is called. But I don't get when the build method is called. More specifically, how is input_shape in the build method specified? What is input_shape here? Since I don't know when build is called, I don't know what is passed in as the input_shape argument.
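My current understanding, which I would like to confirm, is that build runs the first time the layer is called, with input_shape taken from the shape of the actual input tensor. Here is a quick check I sketched with a toy layer (ProbeLayer is my own name, not a Keras class):

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer

class ProbeLayer(Layer):
    """Toy layer that records when build is called and with what shape."""
    def __init__(self):
        super(ProbeLayer, self).__init__()
        self.seen_shape = None

    def build(self, input_shape):
        # Record the shape build received, to inspect it afterwards.
        self.seen_shape = tuple(input_shape)

    def call(self, inputs):
        return inputs

layer = ProbeLayer()
print(layer.seen_shape)     # None: build has not run yet
_ = layer(tf.ones((3, 7)))  # first call appears to trigger build
print(layer.seen_shape)     # (3, 7): full input shape, batch dim included
```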
Besides, I want to define a parameter with a fixed size, which is (1, 768) in my case. In that case, should I still use input_shape in the build method?
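To show what I mean: my guess is that when the weight shape is fixed at (1, 768), build can simply ignore its input_shape argument. A sketch of that idea, using add_weight (FixedParamLayer is a name I made up, and the (2, 768) input is just an example that broadcasts against the parameter):

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer

class FixedParamLayer(Layer):
    """Layer whose weight shape is fixed at (1, 768), independent of the input."""
    def build(self, input_shape):
        # input_shape is ignored here because the parameter size is fixed.
        self.p = self.add_weight(
            shape=(1, 768),
            initializer='zeros',
            trainable=True)

    def call(self, inputs):
        # Broadcasts over the batch dimension when the last dim is 768.
        return inputs + self.p

layer = FixedParamLayer()
out = layer(tf.ones((2, 768)))
print(layer.p.shape)  # (1, 768)
print(out.shape)      # (2, 768)
```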
The second question is: should I consider the batch dimension in the call method? For example, my input is a 3D array whose first dimension is the sample (batch) size; say the input has shape (1000, 10, 5), meaning there are 1000 samples. If I want to transpose each input matrix, i.e. from shape (10, 5) to (5, 10), should I use tf.transpose(x, perm=[1, 0]), which does not consider the batch dimension, or tf.transpose(x, perm=[0, 2, 1]), which takes the batch dimension into account? Besides, I want to do matrix multiplication in the call method (that is why I transpose the inputs); should I consider the batch dimension in tf.matmul?
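This is what I have tried so far, with the shapes from my example. I believe perm=[0, 2, 1] keeps the batch axis in place, and that tf.matmul treats leading dimensions as batch dimensions, but I would like confirmation that this is the right way:

```python
import tensorflow as tf

x = tf.ones((1000, 10, 5))            # batch of 1000 matrices of shape (10, 5)
xt = tf.transpose(x, perm=[0, 2, 1])  # keep batch axis 0, swap the last two
print(xt.shape)  # (1000, 5, 10)

# tf.matmul seems to batch over the leading dimension,
# multiplying each (5, 10) matrix with the matching (10, 5) matrix.
y = tf.matmul(xt, x)
print(y.shape)  # (1000, 5, 5)
```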