Hi,
I want to check whether XLA compilation can be applied when running inference with a loaded SavedModel graph. What I am doing is wrapping the inference call in @tf.function(jit_compile=True) after loading the SavedModel, and I am seeing a ~7x improvement in throughput.
Is this the correct way to do it, or am I doing something wrong? Please suggest.
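For reference, here is a minimal, self-contained sketch of what I mean. The tiny `Dense` module and the temporary export directory are placeholders standing in for my real model; the relevant part is wrapping the loaded SavedModel call in a @tf.function with jit_compile=True:

```python
import tempfile
import tensorflow as tf

# Hypothetical tiny model standing in for the real SavedModel.
class Dense(tf.Module):
    def __init__(self):
        self.w = tf.Variable(tf.random.normal([4, 2]))

    @tf.function(input_signature=[tf.TensorSpec([None, 4], tf.float32)])
    def __call__(self, x):
        return tf.matmul(x, self.w)

export_dir = tempfile.mkdtemp()
tf.saved_model.save(Dense(), export_dir)

loaded = tf.saved_model.load(export_dir)

# Wrap the loaded graph function so XLA compiles it on the first call.
@tf.function(jit_compile=True)
def infer(x):
    return loaded(x)

out = infer(tf.ones([8, 4]))
print(out.shape)
```

The first call triggers XLA compilation (so it is slow); subsequent calls with the same input shapes reuse the compiled executable, which is where the throughput gain shows up.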