Time series training: basic understanding

Hello,

I’d like some input to understand how I should train a time series model and make predictions.

I have set up a POC using a few sample values for training.
Training ends successfully, I mean no errors in the code.
But when I input the timestamp 1699483693760 to predict a value from, I get a prediction of -3613421056.
I understand I have trained on only very few sample values, but there is no reason it should predict a negative value (my training input contains values between 0…5000).

Here is the code I have made, if you can take a look and point out the mistakes:

Maybe I should use an LSTM as the first layer of the model?
Maybe I should format the dates in the time series differently?
Maybe I should restart from the beginning: reading some articles and watching videos on ML training?
What else would you suggest?

Thanks.

hi,

From what I understand, your model has overfitted. You trained it on just one year’s data, which is not ideal. Try predicting with a timestamp closer to the training data (e.g., 1697783693760), and you will see the prediction match your expectations (hence, overfitting). At the very least, you need more (diverse) data.

I would:
a. choose a simple linear regression or tree-based model for simple time series forecasting; OR
b. add an LSTM and tune hyperparameters such as the number of epochs, neurons per layer, batching, etc. (I see little use in this); OR
c. generate more data using techniques such as bootstrapping; OR
d. get more data.
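For option (c), a minimal sketch of bootstrap resampling in plain JavaScript (the `bootstrap` function name is illustrative, not from any library):

```javascript
// Bootstrap resampling: build a larger training set by drawing
// samples (with replacement) from the original values.
function bootstrap(series, nSamples, rng = Math.random) {
  const samples = [];
  for (let i = 0; i < nSamples; i++) {
    const idx = Math.floor(rng() * series.length);
    samples.push(series[idx]);
  }
  return samples;
}

const values = [120, 340, 90, 4100, 2500];
const resampled = bootstrap(values, 20);
console.log(resampled.length); // 20 values drawn from the original 5
```

Note that bootstrapping only resamples what you already have; it cannot add information that the original data lacks.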

edit:

You could also use an existing model and train it with your limited data (TRANSFER LEARNING). Search ‘huggingface’ / Kaggle / GitHub for these models.

Thanks for the suggestions. I’m now having trouble with the code!

I have added input data to train… and I’ve also added an LSTM layer, but it is failing with the following message:

Error when checking target: expected dense_Dense1 to have 3 dimension(s). but got array with shape 30,1

This message is clear enough… but I can’t figure out how to adjust my code in order to send the correct inputShape. :frowning:

The model looks like :

__________________________________________________________________________________________
Layer (type)                Input Shape               Output shape              Param #   
==========================================================================================
lstm_LSTM1 (LSTM)           [[null,30,1]]             [null,30,1]               12        
__________________________________________________________________________________________
dense_Dense1 (Dense)        [[null,30,1]]             [null,30,1]               2         
==========================================================================================
Total params: 14
Trainable params: 14
Non-trainable params: 0

And I can see the following shapes :

dataX.shape [ 1, 30, 1 ]
dataY.shape [ 30, 1 ]

Any suggestion ?

Hi @Math ,

for the small data set (30 samples) and as a quick solution, I can recommend Timeseries Forecasting with TFDF SimpleML.

When using LSTM layers on the full data (~300k+ samples) you’ll need to stride your time series into windows to predict the next value(s). Please find a very detailed tutorial about Data Windowing here (Python).

At first glance, your input dimensions are not matching, and the LSTM has only 1 unit.
I’ve just uploaded a little playground at Glitch (JS) (with a heavy heart, just a bunch of uncommented functions). Feel free to look at the function univariant_data (it strides a 1-d array into windows [start, end, size] and labels [target_size]).

  1. Normalize Timeseries
  2. Stride Timeseries into Windows (Data Windowing) [n, w]
  3. Expand Dimension to [n, w, 1]
  4. Add a tf.layers.inputLayer({inputShape: [w, 1]}) > [batch,w,1]
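The four steps above can be sketched as follows, with steps 1 to 3 in plain JavaScript (all function names here are illustrative) and the TF.js layer of step 4 shown as a comment:

```javascript
// 1. Min-max normalize the series to [0, 1]
function normalize(series) {
  const min = Math.min(...series), max = Math.max(...series);
  return series.map(v => (v - min) / (max - min));
}

// 2. Stride the series into windows of size w -> shape [n, w]
function windows(series, w) {
  const out = [];
  for (let i = 0; i + w <= series.length; i++) out.push(series.slice(i, i + w));
  return out;
}

// 3. Expand the last dimension -> shape [n, w, 1]
function expandDims(wins) {
  return wins.map(win => win.map(v => [v]));
}

const norm = normalize([0, 1000, 2000, 3000, 4000, 5000]);
const wins = windows(norm, 3);   // 4 windows of length 3
const x = expandDims(wins);      // shape [4, 3, 1]
// 4. In TF.js the model would then start with:
//    tf.layers.inputLayer({ inputShape: [3, 1] })  -> [batch, 3, 1]
console.log(wins.length, wins[0].length, x[0][0].length); // 4 3 1
```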

Let me know,
Dennis


Thank you @Dennis,

I did my best to follow your suggestions on normalization and data windowing.
I have a result I can share here: Glitch

But, to be honest, I’m not mastering it at all. I believe I understand the main tensor concepts, but I guess I’m failing at the implementation.

The code in my link runs training to the end, but predict then returns an error.
Can I have some more help reviewing my training code before I go further into my implementation? I’m not confident in what I did so far, and it does not make sense to move deeper if the training is messy.

Thanks a lot.

Hi @Math,

Inside the code, it looks like you’re passing the timestamps as training data into the LSTM model (timestamps are always unique, so they carry little signal without feature engineering). Please use only the values for a univariate time series and stride them into windows. Basically, you take a history window of the values and try to predict the next value(s).

T = [1,2,3,4,5,6,7,8,9]

Windowing (window size 2, forecast 1)
X_1 = [1,2] predict y_1=[3]
X_2 = [2,3] predict y_2=[4]
X_3 = [3,4] predict y_3=[5]

X = [[1,2],
[2,3],
[3,4],
[4,5],
[5,6],
[6,7]]
Y = [3,4,5,6,7,8]

X_test = [7,8]
Y_test = [9]
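The windowing above can be sketched in plain JavaScript (the `makeWindows` name is illustrative):

```javascript
// Stride a 1-d series into (history window, next value) pairs.
function makeWindows(series, windowSize, forecast = 1) {
  const X = [], Y = [];
  for (let i = 0; i + windowSize + forecast <= series.length; i++) {
    X.push(series.slice(i, i + windowSize));
    Y.push(series[i + windowSize]);
  }
  return { X, Y };
}

const T = [1, 2, 3, 4, 5, 6, 7, 8, 9];
const { X, Y } = makeWindows(T, 2);
// Hold out the last pair as a test sample, as above:
const X_train = X.slice(0, -1), Y_train = Y.slice(0, -1);
const X_test = X[X.length - 1], Y_test = Y[Y.length - 1];
console.log(X_train); // [[1,2],[2,3],[3,4],[4,5],[5,6],[6,7]]
console.log(Y_train); // [3,4,5,6,7,8]
console.log(X_test, Y_test); // [7,8] 9
```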

Increase the LSTM Layer Units a bit:
tf.layers.lstm({ units: 128, "activation":"relu", "recurrentActivation": "sigmoid", inputShape: [sequenceLength, 1] });

For predictions, you’ll need to pass a window of the same (last) input size into the model that it was trained on, e.g. (last 2 values): model.predict([[7,8]]) >> 9

Please find a little code example here (Glitch).

Keep in mind that this is univariate time series forecasting (a 1-d array of values). For multivariate time series, please have a look at this great Tutorial/Demo.

Keep me updated,
Dennis

FWIW, it seems to me you’re putting a lot of effort into getting something working that is wrong by design. I mean, given your dataset, you can have absolutely NO confidence in the “predictions”. And as you seem to be learning LSTMs, the possible error/warning messages and NULL/NaN values you may get will confuse you. Note I don’t mean to be offensive; I’m just suggesting you’re probably spending too much time on this.

Hi,
I don’t see any offense. That’s fair, and I need to learn from my mistakes.
My initial question was to point out the mistakes, so I got responses that I need to tackle in order to progress.

Thank you both, Dennis and Tagoma. It seems I have some additional learning to do before I come back with any other questions.