Profiling Multi-Process TF Sessions

We have a multi-process application built on TensorFlow. Separate sessions are launched from separate processes. Each session executes a graph composed of custom ops that represent the steps of an image processing algorithm. We would like to profile our application. Specifically, we would like to analyze the performance of each op and each session, and to see whether ops from different sessions overlap in the overall execution. Can this be accomplished with TensorBoard, or do we have to rely on Nsight Systems from NVIDIA?

Hi @dl_xiaocaiji ,

Here are some suggestions for analyzing this use case:

  • Start with TensorBoard for TensorFlow-specific insights
  • Use Nsight Systems for more comprehensive system and GPU analysis
  • Consider combining both tools for a complete picture
  • Implement custom profiling if needed, especially for custom ops.
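
For the custom profiling option, here is a minimal sketch in plain Python (no TensorFlow calls) of one way to check whether ops from different sessions overlap. It assumes each process records wall-clock start and end timestamps around each op and that the records are later merged in one place, for example by writing them to per-process files. The helper names `timed_op` and `cross_session_overlaps`, and the session and op names in the sample, are hypothetical and only serve the illustration.

```python
import time
from contextlib import contextmanager
from itertools import combinations


@contextmanager
def timed_op(records, session, op):
    """Append (session, op, start_ns, end_ns) to records around the wrapped block."""
    # time.time_ns() is wall-clock time, so values from different
    # processes on the same host can be compared with each other.
    start = time.time_ns()
    try:
        yield
    finally:
        records.append((session, op, start, time.time_ns()))


def cross_session_overlaps(records):
    """Return (op_a, op_b, overlap_ns) for ops of different sessions that overlap in time."""
    overlaps = []
    for a, b in combinations(records, 2):
        if a[0] == b[0]:
            continue  # same session: not a cross-session overlap
        # Two intervals overlap when the later start precedes the earlier end.
        overlap = min(a[3], b[3]) - max(a[2], b[2])
        if overlap > 0:
            overlaps.append((f"{a[0]}/{a[1]}", f"{b[0]}/{b[1]}", overlap))
    return overlaps


# Synthetic merged records (hypothetical sessions and ops, times in ns).
sample = [
    ("session_0", "blur", 0, 100),
    ("session_0", "resize", 100, 180),
    ("session_1", "blur", 60, 150),
]
for op_a, op_b, ns in cross_session_overlaps(sample):
    print(f"{op_a} overlaps {op_b} for {ns} ns")
```

In a real run, each process would wrap its op or session call in `with timed_op(records, "session_0", "blur"):` and save `records` before exiting. Timestamps taken on the host only show when the call was issued and returned, so for actual GPU kernel overlap a timeline from Nsight Systems is still the more reliable source.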

Additionally, you can explore this documentation for a better understanding.

Thank you