Hello, everybody!
I use the ModifyGraphWithDelegate() function to accelerate inference on the GPU. The first time I enable the application it works fine, but after disabling it and then enabling it again, two issues happen:
+> Memory is not freed. I suspect the GPU memory is not released, but so far I haven't found a way to check that.
+> When I enable the application again, it hangs when the interpreter calls the Invoke() function.
Does anyone know why?
Thanks a lot!
Below is a code snippet from my application:
#include <cstdio>
#include <cstring>
#include <memory>
#include <stdexcept>
#include <string>

#include <opencv2/opencv.hpp>
#include "tensorflow/lite/delegates/gpu/delegate.h"
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main()
{
    std::unique_ptr<tflite::FlatBufferModel> m_model;
    std::unique_ptr<tflite::Interpreter> interpreter;
    std::string modelFileName = "Path to model.tflite";

    m_model = tflite::FlatBufferModel::BuildFromFile(modelFileName.c_str());
    if (m_model == nullptr) {
        fprintf(stderr, "Failed to load model\n");
        exit(EXIT_FAILURE);
    }

    TfLiteGpuDelegateOptionsV2 options = TfLiteGpuDelegateOptionsV2Default();
    options.inference_priority1 = TFLITE_GPU_INFERENCE_PRIORITY_MIN_LATENCY;
    options.inference_priority2 = TFLITE_GPU_INFERENCE_PRIORITY_AUTO;
    options.inference_priority3 = TFLITE_GPU_INFERENCE_PRIORITY_AUTO;
    options.inference_preference = TFLITE_GPU_INFERENCE_PREFERENCE_FAST_SINGLE_ANSWER;
    options.experimental_flags |= TFLITE_GPU_EXPERIMENTAL_FLAGS_ENABLE_QUANT;
    options.experimental_flags |= TFLITE_GPU_EXPERIMENTAL_FLAGS_CL_ONLY;

    auto theGpuDelegate = tflite::Interpreter::TfLiteDelegatePtr(
        TfLiteGpuDelegateV2Create(&options), TfLiteGpuDelegateV2Delete);

    tflite::ops::builtin::BuiltinOpResolver resolver;
    tflite::InterpreterBuilder(*m_model, resolver)(&interpreter);
    if (interpreter->ModifyGraphWithDelegate(theGpuDelegate.get()) != kTfLiteOk)
        throw std::runtime_error("Failed to modify graph with GPU delegate");

    while (1) {
        cv::Mat screenImage = cv::imread(imageFilePath);
        memcpy(interpreter->typed_input_tensor<float>(0), screenImage.data,
               screenImage.total() * screenImage.elemSize());
        interpreter->Invoke();

        TfLiteTensor* scores = interpreter->output_tensor(0);
        int rows = scores->dims->data[1];
        int columns = scores->dims->data[2];

        float* data_scores = new float[rows * columns];
        float* data_geometry = new float[5 * rows * columns];
        memcpy(data_scores, interpreter->typed_output_tensor<float>(0),
               rows * columns * sizeof(float));
        memcpy(data_geometry, interpreter->typed_output_tensor<float>(1),
               5 * rows * columns * sizeof(float));

        // Decode data
        decode(data_scores, data_geometry, rows, columns);

        delete[] data_scores;
        delete[] data_geometry;
    }
    return 0;
}
Deployment environment:
- TF-Lite version: 2.65
- OS: Ubuntu 18.04
- Chip: Arm Mali-G series GPU