We first evaluate the SVM trained on the original dataset so that we have a baseline to compare against the augmented dataset.
To evaluate the performance of the SVM on the original dataset, we use the predict() function to predict the class labels of the test data and then the accuracy_score() function from the scikit-learn library to calculate the accuracy of the classifier:
from sklearn.metrics import accuracy_score
# Predict the class labels of the test data
y_pred = clf.predict(x_test)
# Calculate the accuracy of the classifier
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy: %.2f%%" % (accuracy * 100.0))
The predict() function returns the predicted class labels for the test data, and the accuracy_score() function calculates the accuracy of the classifier by comparing the predicted class labels to the actual class labels.
The accuracy of the SVM model on the test dataset is around 47.97%, which is quite low. This suggests that the SVM model is unable to capture all the important features and patterns in the original dataset.
Implementing an SVM with an augmented dataset
To implement the SVM with data augmentation, we use the ImageDataGenerator class from the Keras library to generate new training data. We first create an instance of the ImageDataGenerator class and then use the flow() function to generate new batches of training data:
from keras.preprocessing.image import ImageDataGenerator
# Create an instance of the ImageDataGenerator class
datagen = ImageDataGenerator(rotation_range=20,
    width_shift_range=0.1, height_shift_range=0.1,
    shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
# Generate new batches of training data
gen_train = datagen.flow(x_train, y_train, batch_size=64)
The ImageDataGenerator() constructor creates an instance of the ImageDataGenerator class. The rotation_range, width_shift_range, height_shift_range, shear_range, zoom_range, and horizontal_flip arguments specify the kinds of augmentation to apply to the training data.
The flow() function generates new batches of augmented training data from the original training arrays and the ImageDataGenerator object, as the quick check below shows.
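If you want to sanity-check what flow() yields, you can pull a single batch and inspect it. This is a minimal sketch, assuming x_train is a four-dimensional array of images with shape (samples, height, width, channels), which is the layout ImageDataGenerator expects:
# Pull one augmented batch and inspect its shape
x_batch, y_batch = next(gen_train)
print(x_batch.shape)  # (64, height, width, channels)
print(y_batch.shape)  # (64,) for integer labels, or (64, num_classes) if one-hot encoded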
Training the SVM on augmented data
To train the SVM on the augmented data, we call the classifier's partial_fit() function on each batch of training data generated by the ImageDataGenerator object. Note that scikit-learn's SVC class does not provide partial_fit(); incremental training like this requires an estimator that supports it, such as a linear SVM trained with SGDClassifier and the hinge loss:
import numpy as np
# Train the classifier on each batch of augmented training data
for i in range(100):
    x_batch, y_batch = next(gen_train)
    # Flatten each image batch to 2D (samples, features) for scikit-learn
    x_batch = x_batch.reshape(len(x_batch), -1)
    clf.partial_fit(x_batch, y_batch, classes=np.unique(y_train))
The classes argument specifies the full set of class labels up front, because partial_fit() needs to know every possible class even though an individual batch may not contain all of them.
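For completeness, here is a minimal sketch of how the incremental classifier itself could be created and then evaluated after the training loop. The SGDClassifier settings (hinge loss, the random_state value) and the reshape of x_test are illustrative assumptions rather than the only possible choices:
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
# A linear SVM trained with stochastic gradient descent; unlike SVC,
# SGDClassifier implements partial_fit() for incremental learning
clf = SGDClassifier(loss="hinge", random_state=42)
# ... run the batch training loop shown above, then evaluate on the test set
# (flatten x_test to 2D in the same way as the training batches, if needed)
y_pred = clf.predict(x_test.reshape(len(x_test), -1))
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy: %.2f%%" % (accuracy * 100.0))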