I’m trying to build a very basic model for chess. I’m simply taking the MNIST tutorial and adapting it for this. I have a large collection of games with the end result for each game (win, loss, or draw). I take the final board position of each game and treat it as an 8x8 data element (an empty square is 0 and each piece gets a different number). There are only 3 labels: 0, 1, and 2 (for loss, win, and draw). I have used about 150000 positions for this test.

The accuracy is quite low. It gets up to 0.58 after 5 epochs. (The same model with MNIST data gets close to 100% accuracy.) Are there any issues with this approach? Is there something I’m missing? Any thoughts or advice are greatly appreciated.

Your task is not the same as classifying MNIST images. Images of the same class look similar, whereas many very different board positions can share the same label (win or loss).
If your input data is 2-dimensional, then probably it would be better to make each sample in the shape of n_squares x n_figure_types. Imagine that you are one-hot encoding chess pieces for each of the 64 squares of the board.
Besides, you should somehow distinguish black and white pieces. Either each sample should be 3D (square, piece type, color) or the number of categories in OHE should be twice the number of types of chess pieces.
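To make the encoding concrete, here is a minimal NumPy sketch of the second option (12 categories, one per piece type and color). The integer codes are an assumption for illustration only: 0 for an empty square, 1 to 6 for white pieces, 7 to 12 for black pieces. The function name `one_hot_board` is hypothetical.

```python
import numpy as np

# Assumed piece codes (not from the original post):
# 0 = empty, 1..6 = white P, N, B, R, Q, K, 7..12 = black P, N, B, R, Q, K.
N_PLANES = 12

def one_hot_board(board):
    """Turn an (8, 8) array of piece codes into an (8, 8, 12) one-hot array."""
    board = np.asarray(board, dtype=np.int64)
    assert board.shape == (8, 8)
    # Lookup table with 13 rows: row 0 (empty square) is all zeros,
    # rows 1..12 are the rows of a 12x12 identity matrix.
    lookup = np.vstack([
        np.zeros((1, N_PLANES), dtype=np.float32),
        np.eye(N_PLANES, dtype=np.float32),
    ])
    return lookup[board]

board = np.zeros((8, 8), dtype=np.int64)
board[7, 4] = 6    # white king
board[0, 4] = 12   # black king
encoded = one_hot_board(board)
print(encoded.shape)
```

Empty squares come out as all-zero vectors here, so only occupied squares carry a 1. Adding a 13th "empty" category would be an equally reasonable choice.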
Obviously, you should try to use some layers that can make sense of the relative positions of the pieces on the board instead of flattening the input data.
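One common kind of layer that looks at neighbouring squares together is a 2D convolution. The following is only a sketch of what that could look like in Keras, assuming 8x8x12 one-hot inputs and 3 output labels. The layer sizes and the function name `build_board_cnn` are arbitrary choices for illustration, and there is no guarantee this improves accuracy on this task.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_board_cnn(n_planes=12, n_classes=3):
    """Small CNN that keeps the 8x8 board layout through the conv layers."""
    inputs = keras.Input(shape=(8, 8, n_planes))
    # 3x3 kernels with "same" padding see each square and its neighbours.
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    # Pool over the board only after spatial features have been extracted.
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(64, activation="relu")(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer labels 0, 1, 2
        metrics=["accuracy"],
    )
    return model

model = build_board_cnn()
# Dummy batch of 4 empty boards, just to check the shapes.
probs = model.predict(np.zeros((4, 8, 8, 12), dtype="float32"), verbose=0)
print(probs.shape)
```

Training would then be the usual `model.fit(x_train, y_train, ...)` call with `x_train` of shape (n_samples, 8, 8, 12) and integer labels.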

Thank you for your response, Ekaterina! Sorry I didn’t get a chance to come back to this until now. My samples are exactly in the shape you described above (8x8x12). I used OHE with twice the number of piece types to account for the two colors.

Can you please give more pointers on “use some layers that can make sense of the relative positions of the pieces on the board instead of flattening the input data”? I’m sure this is my key problem. Any examples would be very useful. Are there any existing models that I could reuse at all?

The other two links/papers are not quite useful for my problem since I’m not really trying to recognize chess images.

I don’t know if this approach will work or not. My idea was that when we deal with images, the data comes in the shape (height, width, n_channels). Here we have data in the shape (8, 8, n_classes), so the piece categories play the role of channels. You could probably try some transformer-based models similar to those applied to image classification, but adjusted to your problem. You can find some examples of image transformers at keras.io.
Another possible solution is to flatten the input data as if you were dealing with a 64-word sentence. Use an ordinal encoder for the classes in this case. You will then have a sequence of 64 integers and will be able to use models similar to text classifiers.
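As a rough illustration of the sequence idea, the sketch below flattens an ordinally encoded board into 64 tokens and looks each token up in an embedding table, which is roughly what the embedding layer at the start of a text classifier does. The piece codes (0 for empty, 1 to 12 for pieces), the embedding size of 16, and the random table are assumptions for illustration. In a real model the table would be learned.

```python
import numpy as np

VOCAB_SIZE = 13   # assumed: 0 = empty square, 1..12 = piece codes
EMBED_DIM = 16    # arbitrary embedding size for this sketch

def board_to_sequence(board):
    """Flatten an (8, 8) board of piece codes into a length-64 token sequence."""
    board = np.asarray(board, dtype=np.int64)
    assert board.shape == (8, 8)
    return board.reshape(64)  # row-major: token index = row * 8 + column

board = np.zeros((8, 8), dtype=np.int64)
board[7, 4] = 6    # white king
board[0, 4] = 12   # black king
seq = board_to_sequence(board)

# Stand-in for a learned embedding layer: one vector per token id.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(VOCAB_SIZE, EMBED_DIM))
embedded = embedding_table[seq]   # one EMBED_DIM vector per square
print(seq.shape, embedded.shape)
```

Note that flattening discards the 2D layout, so a sequence model would need some form of positional information to recover which squares are adjacent on the board.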