I am looking for an efficient way to serialize data for training a neural network on a large dataset. All the posts I have found on this topic give answers for audio or image datasets. I have also looked through the TensorFlow docs, but could not find a solution there.
To be very specific -
I am currently working on a chess engine that uses neural nets. I have a dataset of around 1 million games (each game contains about 50 moves on average), and I am representing each board position (1 million × 50 ≈ 50 million positions) as an array of dimensions 14x8x8. I have two other variables for each position: a score for the board position (from the Stockfish engine), and an int value representing whose turn it is to play (1 for white, 0 for black).
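For illustration, a single training example looks roughly like this (the variable names and dtypes below are just my own placeholders, not fixed by any library):

```python
import numpy as np

# One training example: 14 planes over the 8x8 board, plus two scalars
board = np.zeros((14, 8, 8), dtype=np.int8)   # board position planes
score = np.float32(0.35)                      # Stockfish evaluation of the position
turn = np.int8(1)                             # 1 = white to move, 0 = black
```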
Since it is not possible to load 50 million 14x8x8 arrays into memory directly (a rough size estimate is sketched below), I am looking for a way to serialize and load this data efficiently.
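A minimal back-of-envelope sketch of the raw size, assuming the values are stored as float32 (the dtype is my assumption):

```python
# Rough estimate of the raw tensor size if everything were loaded at once
positions = 1_000_000 * 50            # ~50 million board positions
bytes_per_position = 14 * 8 * 8 * 4   # 896 values * 4 bytes (float32)
total_gb = positions * bytes_per_position / 1024**3
print(f"{total_gb:.0f} GB")           # roughly 167 GB, before the scores and turn flags
```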