Active learning is an incremental approach to training. We first annotate and train on a small subset of the unlabeled data pool, then query the model for the data it would most want to train on next. This repeats until business metrics are met.
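To make the loop concrete, here is a minimal sketch in plain Python. Everything in it is a stand-in chosen for illustration: the 1-D data, the threshold "model", the `oracle` function that plays the human annotator, and the distance-to-threshold query rule are all assumptions, not the setup used in the tutorial.

```python
import random

def oracle(x):
    # Hypothetical annotator: returns the true label for a sample.
    return 1 if x > 0.6 else 0

def fit(labeled):
    # Toy model: a threshold halfway between the largest labeled negative
    # and the smallest labeled positive.
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    if not pos or not neg:
        return 0.5  # fallback until both classes have been seen
    return (max(neg) + min(pos)) / 2

def accuracy(threshold, data):
    return sum((1 if x > threshold else 0) == y for x, y in data) / len(data)

def active_learning(pool, val, initial=10, batch=5, target=0.98, max_rounds=50):
    pool = list(pool)
    # Step 1: annotate a small initial subset and train on it.
    labeled = [(x, oracle(x)) for x in pool[:initial]]
    pool = pool[initial:]
    threshold = fit(labeled)
    for _ in range(max_rounds):
        # Stop once the target metric is met or the pool is exhausted.
        if accuracy(threshold, val) >= target or not pool:
            break
        # Step 2: query the samples the model is least sure about
        # (here: those closest to the decision threshold).
        pool.sort(key=lambda x: abs(x - threshold))
        queried, pool = pool[:batch], pool[batch:]
        # Step 3: annotate only the queried samples and retrain.
        labeled += [(x, oracle(x)) for x in queried]
        threshold = fit(labeled)
    return threshold, len(labeled)

random.seed(0)
pool = [random.random() for _ in range(1000)]
val = [(x, oracle(x)) for x in (random.random() for _ in range(200))]
threshold, n_labeled = active_learning(pool, val)
print(f"labeled {n_labeled} of {len(pool)} samples, "
      f"validation accuracy {accuracy(threshold, val):.3f}")
```

The point of the sketch is the shape of the loop: only the queried samples are ever sent for annotation, so the labeling cost grows with the number of rounds rather than with the size of the pool.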
In my recent tutorial on Keras, I implement a ratio-based sampling technique and demonstrate its usefulness on the IMDB dataset, used here as a toy example. Some highlights of the tutorial:
- Active learning achieved results similar to a standard training loop while avoiding the unnecessary annotation of ~10,000 labels!
- This sampling method slightly balances out false positives and false negatives, which benefits businesses that need balanced performance on both labels.
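One plausible reading of ratio-based sampling is to split each round's annotation budget between the two classes in proportion to the errors the model currently makes on them. The sketch below illustrates that idea only; the function name `ratio_based_budget`, its parameters, and the even split used when there are no errors are assumptions for illustration, and the tutorial itself is the reference for the actual implementation.

```python
def ratio_based_budget(false_negatives, false_positives, batch_size):
    """Split an annotation budget between the positive and negative class.

    More false negatives means the model misses positives, so more of the
    next batch goes to positive samples, and vice versa.
    Returns (n_positive, n_negative), which always sum to batch_size.
    """
    total_errors = false_negatives + false_positives
    if total_errors == 0:
        # Assumed fallback: no errors observed, so split evenly.
        n_positive = batch_size // 2
    else:
        n_positive = round(batch_size * false_negatives / total_errors)
    return n_positive, batch_size - n_positive

# Example: 30 false negatives vs 10 false positives on the validation set.
print(ratio_based_budget(30, 10, 1000))  # -> (750, 250)
```

Because the split follows the current error counts, whichever error type dominates gets more new training data in the next round, which is one way the two error types could drift toward balance.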
This tutorial serves as a comprehensive introduction to active learning, with plenty of resources for beginners. If you’re interested, check it out:
Review Classification using Active Learning