Hi everyone,
I am studying the code for time series forecasting using TensorFlow, and I have encountered an issue I can’t quite understand. I’d appreciate any help with this.
When setting input_width=6, label_width=1, and shift=1, everything works as expected, as shown in Figure 1.
Similarly, with input_width=6, label_width=1, and shift=3, everything is fine, as shown in Figure 2.
However, when I set input_width=6, label_width=3, and shift=1, the labels assigned to the inputs seem incorrect and lead to data leakage. I provide the model with 6 timesteps of information (indices 0, 1, 2, 3, 4, 5) and expect it to forecast the next 3 values (indices 6, 7, 8). Instead, the labels are indices 4, 5, 6, so indices 4 and 5 are explicitly given to the model as inputs and two of its three target values have already been seen.
In a worse scenario, if I set input_width=6, label_width=10, and shift=1, aiming to forecast the next 10 values, only indices 4, 5, 6 are given to the model as target labels. As far as I can tell, this is because the label slice starts at index -3 and runs to the end of the window, which holds only 7 timesteps. The correct labeling should assign indices 6, 7, …, 15 as target values.
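To make the index arithmetic concrete, here is a minimal sketch of how I believe the indices are derived. It assumes the class computes them the way the TensorFlow time series tutorial does (total window size = input_width + shift, labels = the last label_width positions of that window). The helper name window_indices is my own and is not part of the tutorial.

```python
import numpy as np


def window_indices(input_width, label_width, shift):
    """Return (input_indices, label_indices) for a single window.

    Assumption: this mirrors the index arithmetic in the tutorial's
    WindowGenerator.__init__; it is a standalone sketch, not the class.
    """
    total_window_size = input_width + shift
    positions = np.arange(total_window_size)

    # Inputs are always the first `input_width` positions.
    input_indices = positions[slice(0, input_width)]

    # Labels are the last `label_width` positions of the window.
    # If label_width > total_window_size, label_start is negative and
    # the slice silently keeps only the last few positions.
    label_start = total_window_size - label_width
    label_indices = positions[slice(label_start, None)]

    return input_indices.tolist(), label_indices.tolist()


# The four configurations from this post, plus shift == label_width.
configs = [(1, 1), (1, 3), (3, 1), (10, 1), (3, 3), (10, 10)]
for label_width, shift in configs:
    inputs, labels = window_indices(6, label_width, shift)
    overlap = sorted(set(inputs) & set(labels))
    print(f"label_width={label_width:>2} shift={shift:>2} "
          f"labels={labels} overlap={overlap}")
```

If this sketch matches the class, then shift is the offset from the last input to the last label rather than to the first one, and setting shift equal to label_width (3 or 10 here) produces the labels I expect (6, 7, 8 and 6, …, 15). I am not sure whether that is the intended usage.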
Could anyone explain why this happens and how to correctly configure the WindowGenerator class to avoid this issue?
Thanks in advance!