Hi,
I am running a model through a PluggableDevice plugin I am developing for a new inference accelerator.
When tracing the calls into the plugin, I see TensorFlow keeping layers' intermediate results on the device longer than they are needed, so bigger models end up running out of device memory.
To be clear, I am talking about intermediate results that are consumed exactly once and never referenced again.
Is this expected behaviour?
Why doesn't TensorFlow free intermediate results as soon as they have been consumed?
In my kernels' compute functions, I tend to use `TF_ForwardInputOrAllocateOutput` whenever possible, and `TF_AllocateOutput` otherwise.
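Concretely, a compute function of mine looks roughly like the sketch below (names such as `MyOpCompute`, the single input/output at index 0, and the rank cap of 8 are just illustrative):

```c
#include "tensorflow/c/kernels.h"
#include "tensorflow/c/tf_status.h"
#include "tensorflow/c/tf_tensor.h"

static void MyOpCompute(void* kernel, TF_OpKernelContext* ctx) {
  (void)kernel;  // unused in this sketch
  TF_Status* status = TF_NewStatus();

  TF_Tensor* input = NULL;
  TF_GetInput(ctx, 0, &input, status);

  if (TF_GetCode(status) == TF_OK) {
    // Output has the same shape as the input (illustrative; assumes rank <= 8).
    int64_t dims[8];
    int num_dims = TF_NumDims(input);
    for (int i = 0; i < num_dims; ++i) dims[i] = TF_Dim(input, i);

    // Try to reuse input 0's buffer for output 0; the runtime falls back
    // to a fresh device allocation when it cannot forward the input.
    int candidates[] = {0};
    int forwarded = -1;
    TF_Tensor* output = TF_ForwardInputOrAllocateOutput(
        ctx, candidates, /*num_candidate_input_indices=*/1,
        /*output_index=*/0, dims, num_dims, &forwarded, status);

    if (TF_GetCode(status) == TF_OK) {
      // ... launch the device computation writing into TF_TensorData(output) ...
      TF_DeleteTensor(output);
    }
  }

  if (input != NULL) TF_DeleteTensor(input);
  TF_DeleteStatus(status);
}
```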
Should I force reuse of the input by calling `TF_SetOutput` on the input `TF_Tensor*`?
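That would look something like the following (again a hypothetical sketch, using the same headers as above; it unconditionally overwrites input 0's device buffer and publishes it as output 0):

```c
static void MyOpComputeInPlace(void* kernel, TF_OpKernelContext* ctx) {
  (void)kernel;  // unused in this sketch
  TF_Status* status = TF_NewStatus();

  TF_Tensor* input = NULL;
  TF_GetInput(ctx, 0, &input, status);

  if (TF_GetCode(status) == TF_OK) {
    // ... launch the device computation writing in place into
    //     TF_TensorData(input) ...

    // Publish the mutated input tensor as output 0, so no new device
    // buffer is ever allocated for this op's result.
    TF_SetOutput(ctx, 0, input, status);
  }

  if (input != NULL) TF_DeleteTensor(input);
  TF_DeleteStatus(status);
}
```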