How 'Windows' affect time series forecasting

I am trying to understand the code used in:

Further below, they define a ‘multi_window’, in which several input time steps are paired with labels spanning several output time steps. The WindowGenerator class is defined as:

import numpy as np

class WindowGenerator():
  def __init__(self, input_width, label_width, shift,
               train_df=train_df, val_df=val_df, test_df=test_df,
               label_columns=None):
    # Store the raw data.
    self.train_df = train_df
    self.val_df = val_df
    self.test_df = test_df

    # Work out the label column indices.
    self.label_columns = label_columns
    if label_columns is not None:
      self.label_columns_indices = {name: i for i, name in
                                    enumerate(label_columns)}
    self.column_indices = {name: i for i, name in
                           enumerate(train_df.columns)}

    # Work out the window parameters.
    self.input_width = input_width
    self.label_width = label_width
    self.shift = shift

    self.total_window_size = input_width + shift

    self.input_slice = slice(0, input_width)
    self.input_indices = np.arange(self.total_window_size)[self.input_slice]

    self.label_start = self.total_window_size - self.label_width
    self.labels_slice = slice(self.label_start, None)
    self.label_indices = np.arange(self.total_window_size)[self.labels_slice]

  def __repr__(self):
    return '\n'.join([
        f'Total window size: {self.total_window_size}',
        f'Input indices: {self.input_indices}',
        f'Label indices: {self.label_indices}',
        f'Label column name(s): {self.label_columns}'])
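For a concrete sense of what these parameters do, here is a standalone sketch (not part of the tutorial) that reproduces only the index arithmetic above, using input_width=6, label_width=1, shift=1:

```python
import numpy as np

# Reproduce WindowGenerator's index arithmetic for one configuration.
input_width, label_width, shift = 6, 1, 1
total_window_size = input_width + shift            # 7

input_slice = slice(0, input_width)
input_indices = np.arange(total_window_size)[input_slice]    # [0 1 2 3 4 5]

label_start = total_window_size - label_width      # 6
labels_slice = slice(label_start, None)
label_indices = np.arange(total_window_size)[labels_slice]   # [6]

print(total_window_size, input_indices, label_indices)
```

So a window of 7 consecutive time steps is cut into 6 input steps and 1 label step, exactly what `__repr__` would print for this configuration.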

A window instance is then passed to a helper that calls ‘model.fit’:

def compile_and_fit(model, window, patience=2):
  early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                                    patience=patience,
                                                    mode='min')

  model.compile(loss=tf.losses.MeanSquaredError(),
                optimizer=tf.optimizers.Adam(),
                metrics=[tf.metrics.MeanAbsoluteError()])

  history = model.fit(window.train, epochs=MAX_EPOCHS,
                      validation_data=window.val,
                      callbacks=[early_stopping])
  return history

You can see this in the assignment to ‘history’, where ‘window.train’ is passed to model.fit as the training data.

From what I can make sense of, the input_width and label_width of the window do not affect what goes into the LSTM at all, as the only things being accessed are the training and validation sets.

Hi @Carl_Johnson,

Sorry for the delay in response.
The window parameters (input_width and label_width) are important for preparing the training data, even though they are not used directly in the model.fit() call. They are used to transform the raw time series into the format the model expects; in the case of an LSTM, that is input tensors of shape (batch_size, timesteps, features).
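As a quick illustration (the sizes here are hypothetical, not taken from the tutorial), input_width becomes the timesteps axis of the 3-D tensors an LSTM layer consumes:

```python
import numpy as np

# Hypothetical sizes: 32 windows per batch, input_width=6 time steps,
# 19 features per time step (a multivariate series).
batch_size, input_width, num_features = 32, 6, 19

# Shape of the input tensors an LSTM layer expects:
# (batch_size, timesteps, features).
example_inputs = np.zeros((batch_size, input_width, num_features),
                          dtype=np.float32)
print(example_inputs.shape)   # (32, 6, 19)
```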

In this tutorial, the make_dataset() method uses the timeseries_dataset_from_array() function to create sequences of length total_window_size. The split_window() method then explicitly separates each sequence, extracting input sequences of length input_width and label sequences of length label_width. The training dataset passed to model.fit() therefore already contains sequences prepared according to these window parameters, so the windows definitely affect the time series forecasting.
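Here is a minimal NumPy sketch of that pipeline (an illustration of the idea, not the tutorial's actual tf.data code), using input_width=3, label_width=1, shift=1 on a toy series:

```python
import numpy as np

input_width, label_width, shift = 3, 1, 1
total = input_width + shift                        # 4

series = np.arange(10, dtype=np.float32)           # toy univariate series

# Build overlapping windows of total_window_size steps, as
# timeseries_dataset_from_array would.
windows = np.stack([series[i:i + total]
                    for i in range(len(series) - total + 1)])
windows = windows[..., None]                       # (batch, total, features)

# What split_window does: cut each window into inputs and labels.
inputs = windows[:, :input_width, :]               # (7, 3, 1)
labels = windows[:, total - label_width:, :]       # (7, 1, 1)

print(inputs.shape, labels.shape)
print(inputs[0].ravel(), labels[0].ravel())        # [0. 1. 2.] [3.]
```

The first window pairs input steps [0, 1, 2] with label step [3]; changing input_width or label_width directly changes the shape of every batch the model is trained on.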

Hope this helps. Thank you.