How does TensorFlow take a model graph and turn it into code that runs on a backend such as a GPU or CPU?

Since finding the “XLA architecture” page on the OpenXLA Project site, I’ve been curious about what happens before XLA in TensorFlow. Does XLA get its initial graph representation simply by traversing the graph built from TensorFlow Python objects? Additionally, is XLA always run before any graph code executes on a backend?

As I understand it, when you decorate a function with tf.function, Python runs the function once and ‘watches’ it to build a graph, similar to how you would build one for a tf.Session in v1. So in this session object, there’s a graph object containing the operator, variable, and placeholder objects, and together they form a graph that can be traversed. Is this where XLA starts? Are there any ‘modes’ of TensorFlow where XLA isn’t run?
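To make the tracing step concrete, here is a minimal sketch of what I mean, assuming TensorFlow 2.x is installed. The function name `scaled_sum` and the input shapes are made up for illustration; `get_concrete_function` and `get_operations` are the calls I know of for forcing a trace and listing the resulting ops. It only shows the graph that tracing produces, not whatever happens to that graph afterwards.

```python
import tensorflow as tf


@tf.function
def scaled_sum(x, y):
    # Python side effect: this print runs only while the function is traced,
    # not each time the resulting graph is executed.
    print("tracing scaled_sum")
    return tf.reduce_sum(x * 2.0 + y)


# Force a trace for float32 vectors of unknown length (hypothetical shapes).
concrete = scaled_sum.get_concrete_function(
    tf.TensorSpec(shape=[None], dtype=tf.float32),
    tf.TensorSpec(shape=[None], dtype=tf.float32),
)

# The traced function carries a graph object that can be traversed.
graph = concrete.graph
op_types = [op.type for op in graph.get_operations()]

for op in graph.get_operations():
    print(op.type, op.name)
```

As far as I can tell, this list of ops is the graph I’m asking about: is it what gets handed to XLA, or is there another step in between?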

I could check the source code, but TensorFlow is a huge codebase, so I thought I’d save myself some time by asking here first.

Poke? Anybody have any information about this?