If you are into self-supervised learning, you already know that “representation collapse” is a real problem and is difficult to get around with simple methods. But not anymore!
Barlow Twins introduces a simple training objective that naturally avoids representation collapse: it pushes the cross-correlation matrix between the embeddings of two distorted views of a batch toward the identity matrix.
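To make the objective concrete, here is a minimal NumPy sketch of the Barlow Twins loss (the function name and the `lambd` default are my own choices for illustration, not taken from the repo):

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lambd=5e-3):
    """z_a, z_b: (N, D) embeddings of two augmented views of the same batch."""
    n, d = z_a.shape
    # Standardize each feature dimension over the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + 1e-8)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + 1e-8)
    # D x D cross-correlation matrix between the two views.
    c = (z_a.T @ z_b) / n
    # Invariance term: push the diagonal toward 1.
    on_diag = np.sum((np.diag(c) - 1.0) ** 2)
    # Redundancy-reduction term: push off-diagonal entries toward 0.
    off_diag = np.sum((c - np.diag(np.diag(c))) ** 2)
    return on_diag + lambd * off_diag
```

If the two views produce identical embeddings, the diagonal of the cross-correlation matrix is already 1, so the loss is near zero; decorrelated embeddings of unrelated inputs are penalized heavily.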
Here’s my TensorFlow implementation:
With a ResNet20 trunk, a 3-layer MLP projection head (2048 units per layer), and 100 epochs of pre-training, I got 62.61% accuracy on the CIFAR10 test set. Pre-training takes ~23 minutes in total on a single Tesla V100. Note that this pre-training does not use any labeled samples.
There’s a Colab Notebook inside, so feel free to tweak it and experiment with it. I’m happy to address any feedback.