Hi,
I would like some clarification on how/when/why tensors should be allocated when implementing plugin kernels.
The kernel C API provides:

- `TF_NewTensor`
- `TF_AllocateTensor`
- `TF_AllocateOutput`
- `TF_SetOutput`
- `TF_ForwardInputOrAllocateOutput`
It seems that `TF_ForwardInputOrAllocateOutput` is meant to cover all of these needs; however, I've seen it consistently allocate instead of forward in cases that were trivially forwardable (input → Reshape → Dense: the Reshape could forward its input instead of allocating).
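For reference, this is roughly how I'm calling it in my Reshape-like kernel. The `TF_*` calls are from `tensorflow/c/kernels.h` / `tf_tensor.h`; the function name, the flatten-to-1-D shape, and the copy comment are just placeholders for my actual code:

```c
#include "tensorflow/c/kernels.h"
#include "tensorflow/c/tf_status.h"
#include "tensorflow/c/tf_tensor.h"

/* Sketch of my Compute function (MyReshapeCompute is my name, not API). */
static void MyReshapeCompute(void* kernel, TF_OpKernelContext* ctx) {
  TF_Status* status = TF_NewStatus();

  TF_Tensor* input = NULL;
  TF_GetInput(ctx, 0, &input, status);

  /* Desired output shape; here just flattening to 1-D as an example. */
  int64_t out_dims[1] = {TF_TensorElementCount(input)};

  /* Ask TF to reuse input 0's buffer for output 0 if it can,
     otherwise allocate a fresh tensor. */
  int candidate_inputs[1] = {0};
  int forwarded = -1; /* set to the forwarded input's index, or -1 */
  TF_Tensor* output = TF_ForwardInputOrAllocateOutput(
      ctx, candidate_inputs, /*num_candidate_input_indices=*/1,
      /*output_index=*/0, out_dims, /*output_num_dims=*/1,
      &forwarded, status);

  if (forwarded == -1) {
    /* Forwarding failed: 'output' is freshly allocated, so the data
       has to be copied over from 'input' here. This is the branch I
       always end up in, even in the Reshape case above. */
  }

  TF_DeleteTensor(input);
  TF_DeleteTensor(output);
  TF_DeleteStatus(status);
}
```

With this, `forwarded` comes back as `-1` every time in the example above.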
What are the general guidelines for kernel implementations? What is possible and what isn't?
Should every kernel allocate a new output tensor? Is a kernel allowed to reuse its input tensor by simply calling `TF_SetOutput` on it? How does `TF_ForwardInputOrAllocateOutput` decide when to forward and when to allocate? And why don't I see it forwarding the input in a simple example like the one above?