Inquiry about the TimeSeries WindowGenerator Class

Hi everyone,

I am studying the code for time series forecasting using TensorFlow, and I have encountered an issue I can’t quite understand. I’d appreciate any help with this.

When setting input_width=6, label_width=1, and shift=1, everything works as expected, as shown in Figure 1.

Similarly, with input_width=6, label_width=1, and shift=3, everything is fine, as shown in Figure 2.

However, when I set input_width=6, label_width=3, and shift=1, the labels assigned to the inputs seem incorrect and cause data leakage. The labels come out as indices 4, 5, 6, so indices 4 and 5 are fed to the model as inputs and also appear among the targets, which means the model has already seen part of what it is asked to predict. What I expect instead is to give the model 6 timesteps (indices 0, 1, 2, 3, 4, 5) and have it forecast the next 3 values (indices 6, 7, 8).

In a worse scenario, if we set input_width=6, label_width=10, and shift=1, aiming to forecast the next 10 values, only indices 4, 5, 6 are given to the model as target labels. This is because the label slice starts at index -3 (input_width + shift - label_width = 6 + 1 - 10) and runs to the end of the 7-element window. The correct labeling should assign indices 6, 7, …, 15 as target values.
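To make the indexing concrete, here is a minimal sketch of how I understand the class to compute its indices. It assumes the arithmetic from the tutorial's WindowGenerator constructor (total window size = input_width + shift, labels taken as the last label_width positions of the window). The helper name window_indices is my own and is not part of the class.

```python
def window_indices(input_width, label_width, shift):
    """Return (input_indices, label_indices) for one window.

    Assumption: mirrors the index arithmetic of the tutorial's
    WindowGenerator; this helper itself is hypothetical.
    """
    total_window_size = input_width + shift
    positions = list(range(total_window_size))

    # Inputs are always the first input_width positions.
    input_indices = positions[slice(0, input_width)]

    # Labels are the last label_width positions of the window.
    # If label_width > total_window_size, label_start goes negative
    # and the slice silently keeps only the tail of the window.
    label_start = total_window_size - label_width
    label_indices = positions[slice(label_start, None)]
    return input_indices, label_indices


# The four configurations from this post, all with input_width=6.
for label_width, shift in [(1, 1), (1, 3), (3, 1), (10, 1)]:
    inputs, labels = window_indices(6, label_width, shift)
    print(f"label_width={label_width}, shift={shift}: "
          f"inputs={inputs}, labels={labels}")
```

Under that assumption, the last two configurations both yield labels [4, 5, 6], which is the overlap I am describing.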

Could anyone explain why this happens and how to correctly configure the WindowGenerator class to avoid this issue?

Thanks in advance!

The TimeSeries WindowGenerator class is a tool used in time series analysis to create sliding window datasets for training machine learning models. It helps in preparing data by splitting it into input-output pairs, allowing models to learn patterns over time.

Thanks Mark!
I know about its use cases. My point is that it doesn’t work correctly in all scenarios, especially when label_width is set to a value greater than 1, as I described above.

The WindowGenerator class returns expected output when provided with “reasonable” input arguments. It is not made to cover “all possible scenarios”.

Exactly!
However, it is completely reasonable to expect a time-series model to forecast the next 3 values, especially since the class exposes label_width as a parameter. If the class is only supposed to work correctly for a label_width of 1, why is label_width an attribute of the class at all?
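For what it's worth, one reading that seems consistent with the indices reported above is that shift is the offset from the last input to the last label, not to the first one. If that assumption holds, forecasting the next N values without overlap would need shift set equal to label_width. The sketch below checks this with plain index arithmetic; label_indices_for and overlaps_inputs are hypothetical helper names, not part of WindowGenerator.

```python
def label_indices_for(input_width, label_width, shift):
    # Assumed to follow the tutorial's arithmetic: the labels are the
    # last label_width positions of a window of input_width + shift.
    total_window_size = input_width + shift
    label_start = total_window_size - label_width
    return list(range(total_window_size))[label_start:]


def overlaps_inputs(input_width, label_width, shift):
    # True when any label position is also an input position.
    labels = label_indices_for(input_width, label_width, shift)
    return any(index < input_width for index in labels)


# shift == label_width: labels start right after the last input.
print(label_indices_for(6, 3, 3))    # 3-step forecast
print(label_indices_for(6, 10, 10))  # 10-step forecast
print(overlaps_inputs(6, 3, 3))      # no overlap expected
print(overlaps_inputs(6, 3, 1))      # the configuration from the question
```

With input_width=6, this gives labels 6, 7, 8 for label_width=3, shift=3 and labels 6 through 15 for label_width=10, shift=10, which are the targets the original question was asking for.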