I am writing a pluggable device and have so far tested it only with eager execution, where I was able to run a few models, including ResNet50.
Now that I am trying graph mode, I run into this warning:
2022-08-08 09:51:26.143724: W tensorflow/core/grappler/utils/graph_view.cc:849] No registered '_FusedBatchNormEx' OpKernel for PPU devices compatible with node {{node resnet50/conv5_block3_2_bn/FusedBatchNormV3}}
. Registered: device='XLA_CPU_JIT'; U in [DT_FLOAT]; T in [DT_FLOAT, DT_BFLOAT16, DT_HALF]
device='CPU'; T in [DT_BFLOAT16]; U in [DT_FLOAT]
device='CPU'; T in [DT_FLOAT]; U in [DT_FLOAT]
(PPU is the name of my device).
To me this looks like grappler applies optimisations such as fusion, and my plugin doesn’t implement the kernels for the resulting ops.
A few questions come to me:
- What kernels should I implement, or what should I refer to to implement the kernels resulting from the grappler? In particular and as an example, `_FusedBatchNormEx` doesn’t seem to have any documentation.
- How can I deactivate some optimisations in order to get it to run?
Referring to the tutorial on pluggable device, it seems that I can deactivate some optimisations in `TF_InitGraph`, for example `params->optimizer_configs->remapping = TF_TriState_Off;`. However, when implementing `TF_InitGraph` I have to provide an optimizer `optimize` function, and at this stage of my development I don’t have any custom optimisation to provide and I don’t know how to write a dummy function that would be correct.
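To make the question concrete, here is a minimal sketch of the kind of dummy function I have in mind: a pass-through that copies the serialized input graph to the output unchanged. It is self-contained and does not include any TensorFlow header. `DemoBuffer`, `DemoGrapplerItem`, `DemoStatus`, `demo_free` and `passthrough_optimize` are hypothetical names of mine; `DemoBuffer` is assumed to mirror the layout of `TF_Buffer` (data, length, deallocator), and the parameter order is assumed to match the optimize callback. Whether TensorFlow really takes ownership of the output buffer and calls its deallocator is also an assumption on my part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for TF_Buffer (assumed layout: data, length, data_deallocator). */
typedef struct DemoBuffer {
  const void* data;
  size_t length;
  void (*data_deallocator)(void* data, size_t length);
} DemoBuffer;

/* Stand-ins for the opaque grappler item and status types (hypothetical). */
typedef struct DemoGrapplerItem DemoGrapplerItem;
typedef struct DemoStatus { int ok; } DemoStatus;

/* Deallocator stored in the output buffer so its owner can release the copy. */
static void demo_free(void* data, size_t length) {
  (void)length;
  free(data);
}

/* Pass-through "optimizer": copies the serialized input graph unchanged. */
static void passthrough_optimize(void* optimizer, const DemoBuffer* graph_in,
                                 const DemoGrapplerItem* item,
                                 DemoBuffer* graph_out, DemoStatus* status) {
  (void)optimizer; /* no optimizer state is needed for a pass-through */
  (void)item;      /* the grappler item is not inspected */
  assert(graph_in != NULL && graph_out != NULL && status != NULL);

  void* copy = NULL;
  if (graph_in->length > 0) {
    copy = malloc(graph_in->length);
    if (copy == NULL) {
      status->ok = 0; /* report the allocation failure and leave output untouched */
      return;
    }
    memcpy(copy, graph_in->data, graph_in->length);
  }

  /* Hand the copy to the caller together with a matching deallocator. */
  graph_out->data = copy;
  graph_out->length = graph_in->length;
  graph_out->data_deallocator = demo_free;
  status->ok = 1;
}
```

In a real plugin I assume the same body, written against `TF_Buffer` and `TF_Status`, would be assigned to the optimizer's optimize callback inside `TF_InitGraph`, next to the `remapping = TF_TriState_Off` setting. I would like to know whether that is a correct dummy.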
Thank you for your interest.