Calling model inference in C/C++ with inputs already allocated in GPU memory

Dear all,

I’d like to use a TF model in scientific simulation code written in C++. This code runs simulations on the GPU, so all the necessary input data may already reside in GPU memory.

In order to call the TF model, I’m planning to use TF_NewTensor.

Now my question is: is it possible to control where a TF_Tensor is placed? Can I just wrap it around an existing on-GPU array to avoid a CPU-to-GPU memory transfer?
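
For context, this is roughly how I would wrap a host-side buffer today (the function name, shape, and buffer are just illustrative); as far as I can tell, TF_NewTensor never asks where the pointer actually lives:

```cpp
#include <cstdint>
#include <tensorflow/c/c_api.h>

// The buffer is owned by the simulation code, so TensorFlow must not free it.
static void NoOpDeallocator(void* data, size_t len, void* arg) {}

// Wraps an existing host buffer without copying. The question is whether the
// same can be done with a pointer that lives in GPU memory.
TF_Tensor* WrapHostBuffer(float* host_data, int64_t n) {
  const int64_t dims[1] = {n};
  return TF_NewTensor(TF_FLOAT, dims, 1,
                      host_data, static_cast<size_t>(n) * sizeof(float),
                      NoOpDeallocator, nullptr);
}
```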

Thank you very much in advance!


Hi @Yury_Lysogorskiy,

  • TF_NewTensor doesn’t allow direct control over tensor placement (CPU vs. GPU).
  • The standard TensorFlow C API doesn’t provide a way to wrap existing GPU arrays without a memory transfer.

Possible alternatives:

  • Create a TensorFlow custom device
  • Use CUDA-aware TensorFlow builds for CUDA integration
  • Develop a custom TensorFlow operation

For your use case, CUDA integration or a custom op might be the most promising way to interface your GPU-resident data with TensorFlow; a rough sketch of the custom-op route is shown below.
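
To make the custom-op route more concrete, here is a minimal sketch of what such a kernel could look like. The op name, the idea of passing the raw device pointer as an int64 attribute, and the blocking cudaMemcpy are illustrative assumptions for this sketch, not an established TensorFlow interface:

```cpp
#include <cstdint>
#include <cuda_runtime.h>

#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/op_kernel.h"

using namespace tensorflow;

// Graph-level definition: the device pointer and element count are baked in
// as attributes for simplicity.
REGISTER_OP("ImportDeviceBuffer")
    .Attr("ptr: int")            // raw CUDA device pointer, as an int64 value
    .Attr("num_elements: int")
    .Output("output: float");

class ImportDeviceBufferOp : public OpKernel {
 public:
  explicit ImportDeviceBufferOp(OpKernelConstruction* ctx) : OpKernel(ctx) {
    OP_REQUIRES_OK(ctx, ctx->GetAttr("ptr", &ptr_));
    OP_REQUIRES_OK(ctx, ctx->GetAttr("num_elements", &n_));
  }

  void Compute(OpKernelContext* ctx) override {
    // The output is allocated by TensorFlow in GPU memory, because the kernel
    // is registered for DEVICE_GPU below.
    Tensor* out = nullptr;
    OP_REQUIRES_OK(ctx, ctx->allocate_output(0, TensorShape({n_}), &out));

    // Device-to-device copy from the simulation's buffer; no host round trip.
    // A production kernel should enqueue the copy on the op's CUDA stream and
    // check errors, rather than using a blocking default-stream cudaMemcpy.
    const void* src = reinterpret_cast<const void*>(ptr_);
    cudaMemcpy(out->flat<float>().data(), src,
               static_cast<size_t>(n_) * sizeof(float),
               cudaMemcpyDeviceToDevice);
  }

 private:
  int64_t ptr_ = 0;
  int64_t n_ = 0;
};

REGISTER_KERNEL_BUILDER(Name("ImportDeviceBuffer").Device(DEVICE_GPU),
                        ImportDeviceBufferOp);
```

With this kind of op the data is still copied once, but the copy stays on the device, so the CPU-to-GPU transfer the question is trying to avoid never happens.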

Thank you.