I am trying to train an RNN for stock price prediction for my master's thesis. I have six additional input features, not just the stock prices themselves.
Using an LSTM network with the "optimal" structure found by hyperparameter tuning with Keras Tuner, I observed a significant increase in both training and validation loss after about 4000 epochs.
My dataset consists of about 12,000 data points, and I use the Adam optimizer with the mean_absolute_error loss function.
The network is quite deep, with several stacked layers (a rough sketch of the setup is below).
I reduced the learning rate of the Adam optimizer to 0.0002, which produced the V2 results in the graph; this reduced but did not solve the issue.
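Roughly, the training setup looks like this; the layer sizes, window length, and feature count are placeholders rather than the exact values the tuner picked:

```python
import tensorflow as tf

# Placeholder architecture; the real layer sizes come from the Keras Tuner search.
model = tf.keras.Sequential([
    # 60 timesteps per window, 7 features (price + 6 additional inputs) -- illustrative values
    tf.keras.layers.LSTM(128, return_sequences=True, input_shape=(60, 7)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1),
])

# V2 run: Adam with the learning rate reduced to 0.0002, MAE loss
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0002),
    loss="mean_absolute_error",
)

# history = model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=8000)
```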
Does anybody have an idea why this happens? I am currently running the same network and data with the RMSprop optimizer and will have those results tomorrow.
The standard interpretation is that your network is starting to overtrain after 4000/8000 epochs.
Do you have a separate holdout "test" set? The current recommended practice is to use three datasets: training, validation, and test. You train with the training and validation datasets and stop when the validation loss starts to increase; then you evaluate the best-on-validation model on the holdout test set. The validation and test loss (and accuracy) should roughly match.
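As a rough sketch of that workflow in Keras, assuming `X`, `y`, and a compiled `model` already exist, with illustrative split ratios and patience:

```python
import tensorflow as tf

# Chronological 70/15/15 split (ratios are illustrative); shuffling would leak future prices.
n = len(X)
X_train, y_train = X[: int(0.7 * n)], y[: int(0.7 * n)]
X_val,   y_val   = X[int(0.7 * n): int(0.85 * n)], y[int(0.7 * n): int(0.85 * n)]
X_test,  y_test  = X[int(0.85 * n):], y[int(0.85 * n):]

# Stop once validation loss stops improving, and keep the best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=50, restore_best_weights=True)

model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          epochs=8000,
          callbacks=[early_stop])

# Final check on data that influenced neither training nor model selection.
test_loss = model.evaluate(X_test, y_test)
```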
I do not have a test set, but I can easily generate one. What difference does it make compared to the validation set, though?
I would have expected overfitting to show up as the training loss staying low while only the validation loss increases.
If I am wrong, could you give me a rough explanation? I need it for my master's thesis.
The summary is that if you don't have a separate test set, your model will overfit and you won't be able to verify that: because the hyperparameters were tuned against the validation set (via Keras Tuner), the validation loss is no longer an unbiased estimate of generalization, so only a held-out test set can give you one. At the end of the thread there is also a link to this great video: https://www.youtube.com/watch?v=pGlQLMPI46g