How to generate a GPU kernel and execute it during the HLO optimization pass

JueonPark · May 20, 2022, 10:38am

I am trying to do autotuning during the graph optimization phase, inspired by "A Flexible Approach to Autotuning Multi-Pass Machine Learning Compilers". However, I am having trouble generating a GPU kernel from a single HloInstruction and executing it. Judging from gemm_algorithm_picker.cc, it seems possible to execute a kernel during the graph optimization phase, but I cannot find a straightforward way to do it.
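To make the goal concrete, the selection step I have in mind looks roughly like the sketch below. It is plain C++ and not XLA code: `Candidate`, `PickBestCandidate`, and the `measure` callback are hypothetical names used only for illustration. In a real pass, `measure` would presumably compile and run the kernel for one candidate and return its runtime, which is exactly the part I do not know how to do.

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical stand-in for one tuning choice (e.g. an algorithm or tile size).
struct Candidate {
  int id;
};

// Returns the index of the candidate with the lowest measured cost, or
// std::nullopt if there are no candidates or every measurement failed.
// `measure` returns a cost (e.g. elapsed milliseconds), or std::nullopt
// when the candidate could not be compiled or executed.
std::optional<std::size_t> PickBestCandidate(
    const std::vector<Candidate>& candidates,
    const std::function<std::optional<double>(const Candidate&)>& measure) {
  std::optional<std::size_t> best;
  double best_cost = std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < candidates.size(); ++i) {
    std::optional<double> cost = measure(candidates[i]);
    if (!cost.has_value()) continue;  // Skip candidates that failed to run.
    if (*cost < best_cost) {
      best_cost = *cost;
      best = i;
    }
  }
  return best;
}
```

Keeping the measurement behind a callback means the selection logic can be checked with fake costs before any real kernel execution is wired in.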

My questions are:

1. Is there a convenient way to generate a GPU kernel from a single HloInstruction?
2. Aside from ExecuteKernelOnStream, is there an easier way to run the kernel?
3. At what level of abstraction does the stream executor run the kernel?

Thank you!
