I am trying to build a machine learning model (an LSTM) with TensorFlow that predicts a single number from a time series of numbers.
First of all, you can imagine my dataset to look something like this:
Index | time data | x data | y data |
---|---|---|---|
0 | np.ndarray(shape (1209278,) ) | np.ndarray(shape (1209278,) ) | numpy.float32 |
1 | np.ndarray(shape (1211140,) ) | np.ndarray(shape (1211140,) ) | numpy.float32 |
2 | np.ndarray(shape (1418411,) ) | np.ndarray(shape (1418411,) ) | numpy.float32 |
… | … | … | … |
Basically, I have time data, and at each time step I have a corresponding x data point. For each time sequence I want to predict the corresponding single number found in y data.
Simply put, I just want my model to predict a number from a time sequence of numbers.
For example like this:
- array([(time_step_1_1, x_val_1_1), (time_step_1_2, x_val_1_2), …]) => y_val_1
- array([(time_step_2_1, x_val_2_1), (time_step_2_2, x_val_2_2), …]) => y_val_2
- …
In this example x_val_1_1 means the first x value of the first sequence of data in my dataset, x_val_1_2 means the second x value of the first sequence and so on.
Likewise, x_val_2_1 means the first x value of the second sequence of data, and so on; I think you get the idea.
It is important to note that my x data arrays are NOT all of the same length (as you can see in the table above).
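To make the structure more concrete, here is a rough sketch of what such a dataset could look like in code (all lengths and values here are placeholders, my real sequences are of course much longer):

```python
import numpy as np
import pandas as pd

# Placeholder lengths; my real sequences have roughly 1.2-1.4 million steps each.
lengths = [1000, 1500, 1200]

dataset = pd.DataFrame({
    "time data": [np.linspace(0.0, 1.0, n).astype(np.float32) for n in lengths],
    "x data":    [np.random.rand(n).astype(np.float32) for n in lengths],
    "y data":    np.random.rand(len(lengths)).astype(np.float32),
})
# Each row holds two variable-length arrays (time and x) and one float target (y).
```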
I also have a Google Colab notebook with a minimal example, which is probably really helpful for understanding what I want to do. It can be found below.
In my current attempt I have used ragged tensors from TensorFlow, which seem to be a good choice because "They make it easy to store and process data with non-uniform shapes like: Batches of variable-length sequential inputs".
So far I have not used the time data at all, because I thought it would be possible to get good results by using only the x data, without taking the time data into account.
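Roughly speaking, my approach looks like the sketch below. This is just a minimal stand-in with placeholder data and arbitrary layer sizes, not my exact notebook code, and it assumes a TensorFlow 2.x version where Keras still supports ragged inputs:

```python
import numpy as np
import tensorflow as tf

# Placeholder sequences of different lengths (my real ones are much longer).
x_data = [np.random.rand(n).astype(np.float32) for n in (1000, 1500, 1200)]
y_data = np.random.rand(3, 1).astype(np.float32)

# Build a ragged tensor of shape (num_sequences, None, 1):
# variable-length sequences with one feature per time step.
x_ragged = tf.RaggedTensor.from_row_lengths(
    values=np.concatenate(x_data),
    row_lengths=[len(a) for a in x_data],
)
x_ragged = tf.expand_dims(x_ragged, axis=-1)

# Minimal LSTM regression model on the ragged input (layer sizes are arbitrary).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, 1), ragged=True),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(x_ragged, y_data, epochs=2, batch_size=1)
```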
But it turns out that this does not work well. I was able to train my model on very powerful hardware, but the results looked like this:
In this graph the blue line is the data I want to predict, i.e. my y data. The orange line is what my model actually predicts. So it seems like my model tries to find the best constant value to fit the curve rather than fitting the actual curve.
On top of the bad predictions, I also ran into a lot of out-of-memory errors like this one, even though I was using very powerful hardware:

`OOM when allocating tensor with shape[22119477696] and type uint8 on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc`

Have a look at the shape[22119477696] part of the error message; that is clearly an insanely big tensor.

So clearly something is wrong with my model, but I don't know what or why.
All in all, I hope that somebody with a bit more experience than me knows how to tackle this problem. Thank you for your time, your help is greatly appreciated.
Thanks in advance!