LaBSE is a good choice for high-quality, language-agnostic sentence embeddings. But because of its parameter count (BERT-base architecture, yet 471M parameters), it is hard to fine-tune or deploy on a small GPU/machine.
So I applied the method from the paper “Load What You Need: Smaller Versions of Multilingual BERT” to build a smaller version of LaBSE, reducing its parameters to 47% of the original without a big performance drop, using TF-Hub and tensorflow/models.
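In case it helps, the core of the vocabulary-reduction idea is just selecting the embedding-table rows for the tokens you keep and writing a matching vocab file. Below is a minimal sketch of that step in plain TensorFlow; the function names and the `keep_token_ids` list are placeholders, not the actual script:

```python
# Minimal sketch of the "Load What You Need" vocabulary reduction.
# `keep_token_ids` (ids of the tokens to retain, in their new order) is assumed
# to come from tokenizing your target-language corpora beforehand.
import tensorflow as tf


def shrink_word_embeddings(word_embeddings, keep_token_ids):
    """Keep only the embedding-table rows for the retained tokens."""
    return tf.gather(word_embeddings, keep_token_ids, axis=0)


def write_reduced_vocab(old_vocab_path, new_vocab_path, keep_token_ids):
    """Write a vocab file containing only the retained tokens, in their new order."""
    with open(old_vocab_path, encoding="utf-8") as f:
        tokens = [line.rstrip("\n") for line in f]
    with open(new_vocab_path, "w", encoding="utf-8") as f:
        f.write("\n".join(tokens[i] for i in keep_token_ids) + "\n")
```

All the other encoder weights (transformer layers, pooler) stay untouched; only the word-embedding table and the vocab file shrink, which is where most of LaBSE's 471M parameters live.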
The preprocessing model is exported using the modified vocab file, so this model must be used with the updated preprocessing model, not the original one. (You can check make_smaller_labse.py#L37.)
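To illustrate why the original preprocessor won't work: the token ids it produces are row indices into the now shrunken and reordered embedding table, so the tokenizer has to be built from the reduced vocab. A rough sketch with tensorflow_text (the file name is hypothetical, and this is not the exported TF-Hub preprocessing model itself):

```python
# Illustrative only: the tokenizer must read the *reduced* vocab so that its
# output ids line up with the rows of the shrunken embedding table.
import tensorflow_text as tf_text

new_vocab_path = "vocab_reduced.txt"  # hypothetical path to the modified vocab file
tokenizer = tf_text.BertTokenizer(new_vocab_path, lower_case=False)  # assuming a cased vocab
wordpiece_ids = tokenizer.tokenize(["LaBSE sentence embeddings"])
```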
I hadn’t planned on publishing this model, since I didn’t train it myself but only patched it. Is it okay to publish?