Hello,
I want to run a 1D CNN on some time series data and have questions at several steps in the process. I include code and info on the data below, and am seeking help understanding how to set the shape of the input training data and the test data.
I would also like to know how to run the model on the test data, which has 10 examples of each of the 10 classes, and how to look at the predicted classes.
The code below runs, but it does not seem to work as I would expect, and I cannot interpret the predictions well enough to tell if I have it configured properly.
I have questions inserted at several steps below.
Any suggestions greatly appreciated.
First, some info on my data:
dftrainin and dftestin are the train and test data that come from .CSV files with
column 1 = sampleID (text) to identify the examples / rows
columns 2 to 128 = time series data (floating point numbers, scaled 0 to 100)
column 129 = labels with column name ch_id
the 10 class labels (ch_id) are coded 0 to 9
each of the time series examples (rows) belongs to one of 10 classes (ch_id)
in the training data there are 200 rows which are 20 examples each of 10 classes
in the test data there are 100 rows which are 10 examples each of 10 classes
after importing many tf, keras, and other libraries / utils
work on a copy of the input data
dftrain = dftrainin.copy()
dftest = dftestin.copy()
pop off the labels
train_labels = dftrain.pop('ch_id')
test_labels = dftest.pop('ch_id')
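A minimal sketch of separating the three kinds of columns described above, run on a synthetic frame because the real CSV files are not shown. The column names `sampleID` and `t0` to `t126` are assumptions for illustration; only `ch_id` comes from the post.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the training CSV: 200 rows, 127 time steps.
# Column names other than 'ch_id' are hypothetical.
rng = np.random.default_rng(0)
n_rows, n_steps, n_classes = 200, 127, 10
time_cols = [f"t{i}" for i in range(n_steps)]
dftrainin = pd.DataFrame(rng.uniform(0, 100, size=(n_rows, n_steps)), columns=time_cols)
dftrainin.insert(0, "sampleID", [f"s{i:03d}" for i in range(n_rows)])
dftrainin["ch_id"] = np.repeat(np.arange(n_classes), n_rows // n_classes)

# .copy() gives an independent frame; plain assignment would only alias it,
# so pop() would also remove the column from dftrainin.
dftrain = dftrainin.copy()
train_labels = dftrain.pop("ch_id")
train_ids = dftrain.pop("sampleID")  # text IDs are kept aside, not fed to the network

x_train = dftrain.to_numpy(dtype="float32")
print(x_train.shape)                 # (200, 127)
print("ch_id" in dftrainin.columns)  # True: the original frame is untouched
```

If the text ID column stays in the frame, the table has 128 columns per row instead of 127, so it is worth printing the shape before any reshape.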
questions about the shape of input training data and the test data?
how to ID and format the 20 example rows for each of the 10 classes?
how to ID and format the 10 example rows for each of the 10 classes?
try using 3rd dim as classes
dftrain_rs = tf.reshape(dftrain, [20, 127, 10])
dftest_rs = tf.reshape(dftest, [10, 127, 10])
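For reference, `Conv1D` takes input shaped (samples, time steps, channels). Assuming each row is one univariate series of 127 values, one possible layout is (200, 127, 1) for training and (100, 127, 1) for test, with the classes carried only by the labels. The sketch below uses NumPy and a stand-in array rather than the real data.

```python
import numpy as np

# Stand-in for the 200 x 127 feature table (sampleID and ch_id already removed).
x_train = np.arange(200 * 127, dtype="float32").reshape(200, 127)

# Add a trailing channel axis: (samples, time steps, channels).
x_train_rs = x_train[..., np.newaxis]
print(x_train_rs.shape)  # (200, 127, 1)

# Each row is still one example, so row i lines up with label i.
assert np.array_equal(x_train_rs[5, :, 0], x_train[5])

# By contrast, reshaping to (20, 127, 10) yields only 20 "examples",
# each one mixing values from 10 consecutive rows.
mixed = x_train.reshape(20, 127, 10)
print(mixed.shape[0])  # 20
```

With this layout the model's `input_shape` would be `(127, 1)` instead of `(127, 10)`.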
one hot encode the labels
train_hot = np_utils.to_categorical(train_labels)
test_hot = np_utils.to_categorical(test_labels)
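As a quick check on the one-hot step, the encoded labels should have one row per example and one column per class. The sketch below reproduces that with plain NumPy on stand-in labels, assuming 20 rows per class as described above; `tf.keras.utils.to_categorical` produces the same layout.

```python
import numpy as np

num_classes = 10
# Stand-in labels: 20 rows for each of the classes 0..9.
train_labels = np.repeat(np.arange(num_classes), 20)

# Row i of the identity matrix is the one-hot vector for class i.
train_hot = np.eye(num_classes, dtype="float32")[train_labels]
print(train_hot.shape)  # (200, 10)

# Decoding with argmax recovers the original integer labels.
assert (train_hot.argmax(axis=1) == train_labels).all()
```

The first dimension (200) has to match the first dimension of the input array passed to `model.fit()`. An alternative is to keep the integer labels and use the `sparse_categorical_crossentropy` loss.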
set up the model
this section runs without error
num_classes = 10
model = Sequential([
layers.Conv1D(filters=64, kernel_size=8, activation='relu', input_shape=(127, 10)),
layers.Conv1D(filters=64, kernel_size=8, activation='relu'),
layers.Dropout(0.5),
layers.MaxPooling1D(pool_size=2),
layers.Flatten(),
layers.Dense(96, activation='relu'),
layers.Dense(num_classes, activation='softmax')
])
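To sanity-check the architecture, the length of the time axis after each layer can be worked out by hand. This sketch assumes the Keras defaults for these layers (stride 1 and `'valid'` padding for `Conv1D`, stride equal to `pool_size` for `MaxPooling1D`); the helper names are made up for illustration.

```python
def conv1d_out_len(n, kernel_size, stride=1):
    """Output length of a 1D convolution with 'valid' padding."""
    return (n - kernel_size) // stride + 1


def pool1d_out_len(n, pool_size):
    """Output length of max pooling with stride == pool_size and 'valid' padding."""
    return (n - pool_size) // pool_size + 1


steps = 127
steps = conv1d_out_len(steps, kernel_size=8)  # 120
steps = conv1d_out_len(steps, kernel_size=8)  # 113
steps = pool1d_out_len(steps, pool_size=2)    # 56
flat_features = steps * 64                    # 64 filters in the last conv layer
print(steps, flat_features)  # 56 3584
```

These lengths depend only on the 127 time steps, not on the number of input channels, so the model builds without error for either `(127, 10)` or `(127, 1)`. `model.summary()` prints the same shapes.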
compile the model
this section runs without error
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
train the model
use 20% for validation via argument passed to model.fit()
this section runs without error, but the accuracy goes to 1 after 2 epochs
which seems too fast to get to 100% accuracy
epochs=20
history = model.fit(
dftrain_rs, train_hot,
validation_split=0.2,
epochs=epochs
)
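One thing that may be relevant here: `validation_split` holds out the last fraction of the rows, taken before any shuffling. If the rows happen to be sorted by class (an assumption, since the file order is not shown), the validation set would contain only the last classes. The sketch below shows that effect and a stratified alternative in NumPy.

```python
import numpy as np

rng = np.random.default_rng(42)
num_classes, per_class = 10, 20
# Assumed for illustration: rows sorted by class, 20 per class.
labels = np.repeat(np.arange(num_classes), per_class)

# What validation_split=0.2 would hold out in that case: the last 40 rows.
tail = labels[int(len(labels) * 0.8):]
print(np.unique(tail))  # [8 9]

# Stratified alternative: hold out 4 random rows from every class.
val_idx = np.concatenate([
    rng.choice(np.flatnonzero(labels == c), size=4, replace=False)
    for c in range(num_classes)
])
train_idx = np.setdiff1d(np.arange(len(labels)), val_idx)
print(len(train_idx), len(val_idx))  # 160 40
```

The held-out rows can then be passed explicitly, for example `model.fit(x[train_idx], y[train_idx], validation_data=(x[val_idx], y[val_idx]), epochs=epochs)`, where `x` and `y` stand for the reshaped inputs and one-hot labels.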
Visualize training results
Create plots of loss and accuracy on the training and validation sets.
this section runs without errors, but the graphs don’t look like typical
training curves
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)
plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
run the model for the test data
how to test with 10 examples of each of 10 classes and get predicted classes?
test_predictions = model.predict(dftest_rs).flatten()
test_scores = tf.nn.softmax(test_predictions)
how to see the predicted classes?
print(test_scores)
print(np.argmax(test_scores))
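On reading the predictions: with a softmax output layer, `model.predict()` already returns one row of 10 class probabilities per test example, so a second softmax is not needed, and `.flatten()` merges all rows into one long vector, which makes `np.argmax` return a single index. A sketch of per-row decoding follows, using a random stand-in for the `predict()` output and assuming the test rows are ordered 10 per class.

```python
import numpy as np

# Stand-in for model.predict(x_test): 100 rows, 10 class probabilities each.
rng = np.random.default_rng(1)
logits = rng.normal(size=(100, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Assumed ordering: 10 examples for each of the classes 0..9.
test_labels = np.repeat(np.arange(10), 10)

# argmax along axis 1 gives one predicted class per test row.
pred_classes = probs.argmax(axis=1)
print(pred_classes.shape)  # (100,)

# Confusion matrix: rows are true classes, columns are predicted classes.
confusion = np.zeros((10, 10), dtype=int)
np.add.at(confusion, (test_labels, pred_classes), 1)
accuracy = (pred_classes == test_labels).mean()
print(confusion)
print(accuracy)
```

Each row of the confusion matrix sums to 10 here, so it shows directly how the 10 examples of each class were assigned.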