So I can load my model with model.load_weights() from either of these options without any error, and inference is fine. In short, everything works as I expect.

But if I start a new session and load my model again, inference is bad, as if the model had random weights instead of my saved weights.

I tried other ways of saving the model as well, but they do not work either. Is there perhaps a special way to save a model that contains a transformer layer?
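One way to check whether the weights really differ after reloading is to compare a fingerprint of them before saving and after loading in the new session. Below is a minimal sketch of that idea, not a definitive tool: weights_fingerprint is a hypothetical helper name, and it assumes each weight object has a tobytes() method (NumPy arrays, which Keras get_weights() returns, do). The stand-in data uses array.array so the sketch runs without TensorFlow.

```python
import hashlib
from array import array


def weights_fingerprint(weights):
    """Return a short SHA-256 digest over a sequence of weight arrays.

    Each item only needs a tobytes() method (NumPy arrays and array.array
    both have one); a shape attribute is mixed in when present.
    """
    digest = hashlib.sha256()
    for w in weights:
        digest.update(repr(getattr(w, "shape", None)).encode())
        digest.update(w.tobytes())
    return digest.hexdigest()[:16]


# Stand-in data so the sketch runs without TensorFlow.
saved = [array("f", [0.1, 0.2, 0.3]), array("f", [1.0])]
restored = [array("f", [0.1, 0.2, 0.3]), array("f", [1.0])]
reinitialised = [array("f", [0.5, 0.2, 0.3]), array("f", [1.0])]

print(weights_fingerprint(saved) == weights_fingerprint(restored))       # same values
print(weights_fingerprint(saved) == weights_fingerprint(reinitialised))  # different values

# With a Keras model (assumption: get_weights() returns NumPy arrays):
#   before = weights_fingerprint(model.get_weights())  # right after training
#   after = weights_fingerprint(model.get_weights())   # new session, after load_weights()
```

If the two fingerprints differ in the new session, the weights were not restored; if they match and inference is still bad, the cause is more likely somewhere else, such as preprocessing.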

In fact your code is OK. Just follow my first post and you will get reproducible results every time you run on Colab, even after your session has expired. If you don't want a function, you can just do it this way, right after importing these two modules:

from tensorflow.python.framework import ops
import tensorflow as tf

See, the problem is that TensorFlow reinitializes variables when your session expires, so to solve this you have to manually set your own random seed after importing tensorflow and numpy.

Create a function this way:

from tensorflow.python.framework import ops
import numpy as np
import tensorflow as tf

SEED = 42  # choose any seed you like

def reproducibleResult(seed: int):
    ops.reset_default_graph()  # clear the default graph
    tf.random.set_seed(seed)   # seed TensorFlow's random ops
    np.random.seed(seed)       # seed NumPy's global generator

# call your function and problem solved
reproducibleResult(SEED)
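If you want to confirm that seeding actually makes the random draws repeat, here is a rough sketch of a check. draws_repeat is a hypothetical helper, not part of TensorFlow or NumPy, and the example uses Python's standard random module only so that it runs anywhere; the NumPy line in the comment is an assumption about how you would swap it in.

```python
import random


def draws_repeat(set_seed, draw, seed=42):
    """Seed twice with the same value and report whether the draws match."""
    set_seed(seed)
    first = draw()
    set_seed(seed)
    second = draw()
    return first == second


# Standard-library generator as a stand-in for the framework's RNG.
print(draws_repeat(random.seed, lambda: [random.random() for _ in range(3)]))

# With NumPy (assumption, not run here):
#   draws_repeat(np.random.seed, lambda: np.random.rand(3).tolist())
```

Note that this only checks the random number generator. It says nothing about whether saved weights were restored correctly.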