The documentation for tf.keras.utils.timeseries_dataset_from_array
provides three examples. The second one is misleading: its slicing truncates the input data, so the resulting time series dataset contains fewer sequences than it should.
The original example is:
# Example 2: Temporal regression.
# Consider an array data of scalar values, of shape (steps,).
# To generate a dataset that uses the past 10 timesteps to predict the next timestep, you would use:
input_data = data[:-10]
targets = data[10:]
dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
    input_data, targets, sequence_length=10)
for batch in dataset:
    inputs, targets = batch
    assert np.array_equal(inputs[0], data[:10])  # First sequence: steps [0-9]
    assert np.array_equal(targets[0], data[10])  # Corresponding target: step 10
    break
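Say we set data = tf.range(20). Printing each batch of this dataset returns only a single sequence:
Input:[[0 1 2 3 4 5 6 7 8 9]], target:[10]
The dataset contains fewer sequences than it should because the slicing of input_data is misleading. data[:-10] drops the last 10 steps, which leaves only 10 input values, enough for exactly one window of length 10. To predict the next single step, only the last step needs to be dropped from the input, so the example should be: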
data = tf.range(20)
input_data = data[:-1]
targets = data[10:]
dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
    input_data, targets, sequence_length=10)
for batch in dataset:
    inputs, targets = batch
    assert np.array_equal(inputs[0], data[:10])  # First sequence: steps [0-9]
    assert np.array_equal(targets[0], data[10])  # Corresponding target: step 10
    break
# Print every batch to see all generated sequences and their targets.
for batch in dataset.as_numpy_iterator():
    inputs, labels = batch
    print(f"Input:{inputs}, target:{labels}")
It returns:
Input:[[ 0 1 2 3 4 5 6 7 8 9]
[ 1 2 3 4 5 6 7 8 9 10]
[ 2 3 4 5 6 7 8 9 10 11]
[ 3 4 5 6 7 8 9 10 11 12]
[ 4 5 6 7 8 9 10 11 12 13]
[ 5 6 7 8 9 10 11 12 13 14]
[ 6 7 8 9 10 11 12 13 14 15]
[ 7 8 9 10 11 12 13 14 15 16]
[ 8 9 10 11 12 13 14 15 16 17]
[ 9 10 11 12 13 14 15 16 17 18]], target:[10 11 12 13 14 15 16 17 18 19]
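To make the difference between the two slicings easier to check, here is a small pure-Python sketch of the window arithmetic. It assumes the default sampling_rate=1 and sequence_stride=1, and that the number of sequences is capped by the number of targets. The helper names count_windows and next_step_windows are made up for illustration and are not part of Keras.

```python
def count_windows(num_inputs, num_targets, sequence_length):
    """Number of (window, target) pairs, assuming stride 1 and sampling rate 1."""
    # A window of length `sequence_length` can start at positions
    # 0 .. num_inputs - sequence_length, and each window needs a target.
    return max(0, min(num_inputs - sequence_length + 1, num_targets))


def next_step_windows(data, sequence_length):
    """Pair each window of `sequence_length` values with the value that follows it."""
    input_data = data[:-1]            # drop only the last step: nothing follows it
    targets = data[sequence_length:]  # the window starting at i predicts data[i + sequence_length]
    n = count_windows(len(input_data), len(targets), sequence_length)
    return [(input_data[i:i + sequence_length], targets[i]) for i in range(n)]


data = list(range(20))

# Slicing from the documentation example: data[:-10] leaves 10 inputs.
docs_count = count_windows(len(data[:-10]), len(data[10:]), 10)
# Slicing proposed above: data[:-1] leaves 19 inputs.
fixed_count = count_windows(len(data[:-1]), len(data[10:]), 10)
print(docs_count, fixed_count)

pairs = next_step_windows(data, 10)
print(pairs[0])   # first window and its target
print(pairs[-1])  # last window and its target
```

Under these assumptions the documented slicing yields 1 window and the proposed slicing yields 10, which matches the two outputs shown above.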