I have a dataframe in my ML project. Initially it is in the following format:
Time   | Feature 1 | Feature 2 | Feature 3 | Feature 4 | etc.
Time 1 |           |           |           |           |
Time 2 |           |           |           |           |
Time 3 |           |           |           |           |
Etc.   |           |           |           |           |
I am trying to make this dataframe 3-D, so that each value gains a third dimension (into the screen) holding the values of the same feature at the previous 192 timesteps.
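To make the target shape concrete, here is a minimal NumPy sketch of the layout I am after. It is only an illustration under assumptions: the data is a toy array, `past` is shrunk from 192 to 4, and the names `data`, `windows` and `last_times` are made up for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

past = 4                      # stands in for 192 to keep the toy example small
n_steps, n_features = 10, 3

# Toy 2-D data: row t holds the feature values at time t.
data = np.arange(n_steps * n_features, dtype=float).reshape(n_steps, n_features)

# Slide a window of length `past` along the time axis.
# The window axis is appended last, giving (n_windows, n_features, past).
windows = sliding_window_view(data, window_shape=past, axis=0)

# Reorder to (n_windows, past, n_features).
windows = windows.transpose(0, 2, 1)

# Window k covers rows k .. k + past - 1, so it "belongs" to time k + past - 1.
last_times = np.arange(past - 1, n_steps)
```

With this layout, the first sample belongs to the row at index `past - 1` and contains that row plus the `past - 1` rows before it, which is the "previous values" arrangement described above.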
I am trying to use the built-in function keras.preprocessing.timeseries_dataset_from_array(), but it returns the opposite of what I'm trying to achieve.
I expect it to return:
Time     | Feature 1 | Feature 2 | Feature 3 | Feature 4 | etc.
Time 192 | [1-192]   | [1-192]   | [1-192]   |           |
Time 193 |           |           |           |           |
Time 194 |           |           |           |           |
Time End |           |           |           |           |
Instead, it returns:
Time         | Feature 1 | Feature 2 | Feature 3 | Feature 4 | etc.
Time 1       | [192-1]   | [192-1]   | [192-1]   |           |
Time 2       |           |           |           |           |
Time 3       |           |           |           |           |
Time End-192 |           |           |           |           |
Basically, every sample contains the next 192 values of the dataset instead of the previous 192 values. As a result, the output starts 192 samples too early and ends 192 samples before it should.
My code is the following:
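from tensorflow import keras  # assumed import; adjust to however keras is imported

# past is defined as 192
# x_val is the 2-D dataframe
# y_val is one of the columns in the dataframe
dataset_historic_train = keras.preprocessing.timeseries_dataset_from_array(
    x_val,
    y_val,
    sequence_length=past,
    batch_size=len(x_val),  # a single batch containing every window
)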
Here x_val is my entire 2-D dataframe, indexed from the first to the last sample time, and y_val is my target feature, which is Feature 1 in this case.
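To show the alignment I am talking about, here is a small NumPy-only sketch. It does not call Keras. `pair_windows` is a hypothetical helper that mimics the pairing rule as I read it in the documentation (sample i is `data[i : i + sequence_length]`, paired with `targets[i]`), and `past` is shrunk to 4. Shifting the targets by `past - 1` is only an assumption about what might give the alignment I want, not something I have confirmed.

```python
import numpy as np

past = 4                                          # stands in for 192
x = np.arange(10, dtype=float).reshape(10, 1)     # toy 2-D data, one feature
y = x[:, 0]                                       # target column (Feature 1 in my case)


def pair_windows(data, targets, sequence_length):
    """Hypothetical helper: pair data[i : i + sequence_length] with targets[i]."""
    n = min(len(data) - sequence_length + 1, len(targets))
    return [(data[i:i + sequence_length], targets[i]) for i in range(n)]


# Targets passed unchanged: each target is the value at the window's FIRST row,
# so relative to its target the window looks like "future" values.
as_is = pair_windows(x, y, past)

# Targets shifted by past - 1: each target is the value at the window's LAST row,
# so relative to its target the window holds the current and previous values.
shifted = pair_windows(x, y[past - 1:], past)
```

In both cases the windows themselves are identical; only the row each window is labelled with changes.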