I am trying to profile the ResNet50 model from Keras using TensorBoard. I am able to visualize the graph, but TensorBoard does not allow me to click on “compute time”.
I previously used the TF Estimator API and was able to get compute times using a custom profiler hook. Can you help me get a similar setup in Keras?
This is my code:
import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.callbacks import TensorBoard
model = ResNet50(weights='imagenet')
batch_size = 16
input = np.random.randint(0, 256, size=(batch_size, 224, 224, 3)).astype('float32')
output = np.random.randint(0, 1000, size=(batch_size, 1000))
model.compile(optimizer=SGD(learning_rate=0.0001, momentum=0.9), loss='categorical_crossentropy')
model.fit(input, output, epochs=1, callbacks=[
    TensorBoard(log_dir='./logs', histogram_freq=1, profile_batch=1)
])
Hi richard_wwu.
If I understand correctly what you want to do, you could write your own callback and override the methods that best suit your requirements.
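As a rough illustration of that idea, here is a minimal sketch of a callback that records how long each training batch takes. The class name `BatchTimer` and the `batch_times` attribute are made up for this example. It measures wall-clock time per batch only, not per-op compute time, and on a GPU the numbers may be approximate because execution can be asynchronous.

```python
import time

import tensorflow as tf


class BatchTimer(tf.keras.callbacks.Callback):
    """Hypothetical helper: records the wall-clock duration of each training batch."""

    def __init__(self):
        super().__init__()
        self.batch_times = []  # seconds per batch, in the order they ran
        self._start = None

    def on_train_batch_begin(self, batch, logs=None):
        # Keras calls this right before a training batch is processed.
        self._start = time.perf_counter()

    def on_train_batch_end(self, batch, logs=None):
        # Keras calls this right after the batch; store the elapsed time.
        self.batch_times.append(time.perf_counter() - self._start)
```

You would pass an instance in the `callbacks` list of `model.fit` and read `batch_times` afterwards.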
Thanks for your reply.
Yes, I think I want to implement some kind of callback, but Keras’ callback interface seems to give me access only to the metrics, not to the compute times.
I also tried following the steps exactly as described here:
but there is no “Profile” tab when I open TensorBoard and load the generated logs.
This is the console output:
2023-02-08 13:47:12.996073: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-08 13:48:01.640024: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-08 13:48:10.043065: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1613] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14611 MB memory: -> device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:8a:00.0, compute capability: 7.0
2023-02-08 13:48:14.065682: I tensorflow/core/profiler/lib/profiler_session.cc:101] Profiler session initializing.
2023-02-08 13:48:14.065753: I tensorflow/core/profiler/lib/profiler_session.cc:116] Profiler session started.
2023-02-08 13:48:14.065909: I tensorflow/core/profiler/backends/gpu/cupti_tracer.cc:1664] Profiler found 1 GPUs
2023-02-08 13:48:15.063702: I tensorflow/core/profiler/lib/profiler_session.cc:128] Profiler session tear down.
2023-02-08 13:48:15.067027: I tensorflow/core/profiler/backends/gpu/cupti_tracer.cc:1798] CUPTI activity buffer flushed
2023-02-08 13:48:15.911886: I tensorflow/core/profiler/lib/profiler_session.cc:101] Profiler session initializing.
2023-02-08 13:48:15.911959: I tensorflow/core/profiler/lib/profiler_session.cc:116] Profiler session started.
2023-02-08 13:48:36.286849: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:428] Loaded cuDNN version 8100
2023-02-08 13:48:48.905045: I tensorflow/compiler/xla/service/service.cc:173] XLA service 0x2ad864029aa0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-02-08 13:48:48.905152: I tensorflow/compiler/xla/service/service.cc:181] StreamExecutor device (0): Tesla V100-SXM2-16GB, Compute Capability 7.0
2023-02-08 13:48:50.809879: I tensorflow/compiler/jit/xla_compilation_cache.cc:477] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
2023-02-08 13:49:02.076718: I tensorflow/core/profiler/lib/profiler_session.cc:67] Profiler session collecting data.
2023-02-08 13:49:02.115324: I tensorflow/core/profiler/backends/gpu/cupti_tracer.cc:1798] CUPTI activity buffer flushed
2023-02-08 13:49:02.229502: I tensorflow/core/profiler/backends/gpu/cupti_collector.cc:522] GpuTracer has collected 14784 callback api events and 12368 activity events.
2023-02-08 13:49:02.303411: I tensorflow/core/profiler/lib/profiler_session.cc:128] Profiler session tear down.
2023-02-08 13:49:02.306547: I tensorflow/core/profiler/rpc/client/save_profile.cc:164] Collecting XSpace to repository: ./logs/plugins/profile/2023_02_08_13_49_02/r12n65.palma.wwu.xplane.pb
1/1 [==============================] - 51s 51s/step - loss: 3838640.7500
I was able to get the Profile tab to show up. When instantiating the TensorBoard callback, I passed profile_batch=1, but I am only passing data for a single batch. Since indices start at 0, I have to pass profile_batch=0 to profile the first batch.
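For anyone who would rather not depend on batch indexing at all, one possible alternative is to start and stop the profiler by hand around the training call with `tf.profiler.experimental.start` and `tf.profiler.experimental.stop`. This is only a sketch: the wrapper name `fit_with_profile` is invented for this example, and it profiles the entire `fit` call rather than a single batch, so it is best kept to short runs.

```python
import tensorflow as tf


def fit_with_profile(model, x, y, logdir, **fit_kwargs):
    """Hypothetical wrapper: run model.fit with the profiler active throughout."""
    tf.profiler.experimental.start(logdir)
    try:
        return model.fit(x, y, **fit_kwargs)
    finally:
        # Stopping the profiler saves the trace under <logdir>/plugins/profile/.
        tf.profiler.experimental.stop()
```

With the code from the question, this would be called as `fit_with_profile(model, input, output, './logs', epochs=1)`. Viewing the result still needs the Profile tab in TensorBoard, which as far as I know comes from the separate `tensorboard-plugin-profile` package.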