Majority Filtering

MD_SHARIFUL_ISLAM · March 27, 2024, 4:48am

I have image data for three classes, and some of it may be mislabeled. I want to filter out as many of the mislabeled images as possible, and I am thinking of applying the Majority Filtering method. The algorithm is:

Take a subset of the training data.
Train multiple classifiers on that same subset.
Predict on a different subset using all the classifiers.
If the majority of the classifiers predict a label different from the given one, tag the sample as mislabeled and exclude it from the next iteration.
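
The voting step above can be sketched in plain Python. This is only a minimal illustration, not a reference implementation: the function name flag_mislabeled and the toy labels and predictions are made up for the example, and it assumes each classifier's predictions are already available as a list of class indices in the same order as the given labels.

```python
from typing import List, Sequence


def flag_mislabeled(
    labels: Sequence[int],
    predictions: Sequence[Sequence[int]],
) -> List[int]:
    """Return indices of samples that a strict majority of classifiers get "wrong".

    labels[i] is the given (possibly noisy) label of sample i.
    predictions[k][i] is classifier k's predicted class for sample i.
    """
    n_classifiers = len(predictions)
    flagged = []
    for i, given in enumerate(labels):
        # Count how many classifiers disagree with the given label.
        wrong = sum(1 for preds in predictions if preds[i] != given)
        if wrong > n_classifiers / 2:
            flagged.append(i)
    return flagged


# Toy data (hypothetical): four samples, three classifiers.
labels = [0, 1, 2, 1]
predictions = [
    [0, 1, 0, 1],  # classifier A
    [0, 2, 0, 1],  # classifier B
    [0, 1, 0, 2],  # classifier C
]

suspect = flag_mislabeled(labels, predictions)
print(suspect)  # only sample 2 is rejected by all three classifiers
```

With an even number of classifiers a tie does not count as a majority here, so tied samples are kept. Whether ties should be flagged instead is a design choice.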

Now I am having trouble managing all that data. I used the image_dataset_from_directory method to import the images, but the data are shuffled during training, so I can't keep track of which images are mislabeled and which are correctly labeled. I am also not sure how to exclude the mislabeled ones in the next training loop.
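
One possible way around the shuffling problem is to identify each sample by its file path instead of its position in the dataset, and to keep a set of flagged paths between iterations. The sketch below uses only the standard library. The helpers list_labeled_images and drop_flagged are hypothetical names, and the code assumes one subdirectory per class, with label indices taken from the sorted subdirectory names. The filtered (path, label) list could then be handed to whatever input pipeline is used, for example tf.data.Dataset.from_tensor_slices followed by an image-loading map.

```python
import tempfile
from pathlib import Path
from typing import List, Set, Tuple

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def list_labeled_images(root: str) -> List[Tuple[str, int]]:
    """Build a deterministic list of (file_path, label_index) pairs.

    Assumes root contains one subdirectory per class; the label index
    is the position of the subdirectory in sorted order.
    """
    class_dirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    samples = []
    for label, class_dir in enumerate(class_dirs):
        for f in sorted(class_dir.iterdir()):
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((str(f), label))
    return samples


def drop_flagged(
    samples: List[Tuple[str, int]], flagged_paths: Set[str]
) -> List[Tuple[str, int]]:
    """Return the samples whose path has not been flagged as mislabeled."""
    return [(p, y) for p, y in samples if p not in flagged_paths]


# Tiny demonstration on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for cls, names in {"cat": ["a.jpg", "b.jpg"], "dog": ["c.jpg"]}.items():
        (Path(tmp) / cls).mkdir()
        for name in names:
            (Path(tmp) / cls / name).touch()

    samples = list_labeled_images(tmp)
    flagged = {samples[1][0]}  # pretend the ensemble flagged the second image
    kept = drop_flagged(samples, flagged)
    kept_names = [(Path(p).name, y) for p, y in kept]

print(kept_names)
```

Because the flagged set is keyed by path, it stays valid however the training data are shuffled, and the next iteration simply rebuilds its training list from the kept samples.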

Ajay_Krishna · March 27, 2024, 5:22am

Are you using Jupyter Notebook for this? I think I had the same issue while working in Jupyter Notebook. I can help if that is the case.
