I’m working on my first non-trivial TensorFlow project (predicting moves-to-mate for chess endgames) and I’ve run into some odd behavior during training. In several of the models I have trained on my data, the network appears at some point to forget everything it has learned and start over from scratch. How do I avoid this?

The training data is 40,000 samples, and I’m using the Adam optimizer with a batch size of 2000.
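
For scale, those numbers work out to a small number of optimizer updates per epoch. A quick check of the arithmetic (the variable names are just for illustration):

```python
import math

num_samples = 40_000   # training set size stated above
batch_size = 2_000     # batch size stated above

# One optimizer update per batch, so this is the number of Adam steps per epoch.
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)
```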

I know my model is way over-parameterized at the moment, but the smaller models I tried weren’t accurate enough.


```python
import tensorflow as tf


def create_model_eg_bin2c(my_learning_rate):
    """Create and compile a deep neural net."""
    # This is a first try to get a simple model that works
    model = tf.keras.models.Sequential()
    # Input is an 8x8 board with 15 channels
    model.add(tf.keras.layers.Conv2D(
        filters=64, kernel_size=(3, 3), input_shape=(8, 8, 15), strides=(1, 1), padding='same'))
    model.add(tf.keras.layers.MaxPooling2D((2, 2)))
    model.add(tf.keras.layers.Conv2D(
        filters=32, kernel_size=(3, 3), strides=(1, 1), padding='same'))
    model.add(tf.keras.layers.MaxPooling2D((2, 2)))
    model.add(tf.keras.layers.Flatten())
    # 33 output logits (the loss below is configured with from_logits=True)
    model.add(tf.keras.layers.Dense(units=33))
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=my_learning_rate),
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    return model
```
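
One way to at least hold on to the state from before a collapse is to remember the weights from the best epoch seen so far. The following is only a minimal sketch under stated assumptions: `KeepBestWeights` is a hypothetical helper (not a built-in Keras callback), the data is random placeholder data with the same shapes as above rather than real chess positions, and a tiny stand-in model with small sizes is used so that it runs quickly.

```python
import numpy as np
import tensorflow as tf


class KeepBestWeights(tf.keras.callbacks.Callback):
    """Hypothetical helper: remember the weights from the lowest-loss epoch."""

    def __init__(self):
        super().__init__()
        self.best_loss = float("inf")
        self.best_weights = None

    def on_epoch_end(self, epoch, logs=None):
        loss = (logs or {}).get("loss")
        # Keep a copy of the weights whenever the training loss improves.
        if loss is not None and loss < self.best_loss:
            self.best_loss = loss
            self.best_weights = self.model.get_weights()


# Placeholder data (random, NOT real endgame positions): 8x8x15 inputs, 33 classes.
rng = np.random.default_rng(0)
x = rng.random((200, 8, 8, 15)).astype("float32")
y = rng.integers(0, 33, size=200)

# Tiny stand-in model (an assumption, much smaller than the one above).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8, 8, 15)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(33),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

keeper = KeepBestWeights()
history = model.fit(x, y, batch_size=50, epochs=3, verbose=0, callbacks=[keeper])

# Roll back to the best epoch, e.g. after the loss has jumped back up.
model.set_weights(keeper.best_weights)
```

This does not explain why the network resets, but the per-epoch losses in `history.history["loss"]` show where the jump happens, and the saved weights mean a good run is not lost when it does.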