Audio event classification with transfer learning
We are now ready to build our audio event classifier. We have our base feature maps, but they still need some additional feature engineering. You could build a CNN from scratch to ingest these images and connect it to a fully connected deep multilayer perceptron (MLP) to build a classifier. However, we will leverage the power of transfer learning here by using a pretrained model for feature extraction. To be more specific, we will use the VGG-16 model as a feature extractor and then train a fully connected deep network on the extracted features.
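To make the idea concrete, the following is a minimal sketch of wrapping a pretrained VGG-16 as a frozen feature extractor, assuming Keras with a TensorFlow backend. The (64, 64, 3) input shape is an assumption for illustration; it should match the dimensions of your actual base feature maps.

    # Sketch: pretrained VGG-16 (ImageNet weights, classifier head removed)
    # used as a frozen feature extractor.
    # The (64, 64, 3) input shape is an assumption; adjust to your data.
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras.layers import Flatten
    from tensorflow.keras.models import Model

    vgg = VGG16(include_top=False, weights='imagenet',
                input_shape=(64, 64, 3))
    flat = Flatten()(vgg.output)        # flatten conv feature maps to a vector
    extractor = Model(vgg.input, flat)  # maps images -> feature vectors
    extractor.trainable = False         # freeze weights: extraction only

Freezing the VGG-16 weights means only the downstream fully connected classifier is trained, which is the essence of using a pretrained network purely for feature extraction.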
Building datasets from base features
The first step is to load our base features and create our train, validation, and test datasets. To do this, we load the base features and labels from disk:
    # Load the precomputed base feature maps and their labels from disk
    import joblib
    import numpy as np

    features = joblib.load('base_features.pkl')
    labels = joblib.load('dataset_labels.pkl')
    # pair each feature map with its label (dtype=object since the
    # pairs mix arrays and label values)
    data = np.array(list(zip(features, labels)), dtype=object)
    features.shape, labels.shape
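With the features and labels loaded, a typical next step is to split them into train, validation, and test sets. Below is a minimal sketch using scikit-learn's train_test_split; the 60/20/20 proportions and the stratification on labels are assumptions for illustration, not values prescribed by the text.

    # Sketch: stratified 60/20/20 train/validation/test split (assumed ratios)
    from sklearn.model_selection import train_test_split

    train_X, rest_X, train_y, rest_y = train_test_split(
        features, labels, test_size=0.4, stratify=labels, random_state=42)
    val_X, test_X, val_y, test_y = train_test_split(
        rest_X, rest_y, test_size=0.5, stratify=rest_y, random_state=42)

    print(train_X.shape, val_X.shape, test_X.shape)

Stratifying on the labels keeps the class proportions roughly equal across the three splits, which matters when some audio event classes have fewer examples than others.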