I am building a C++ application that includes TensorFlow 2.6, with the aim of doing classification or detection.
I managed to build the C++ API from source using Bazel.
I then started with classification. I managed to do training and prediction. It can tell whether an image is a dog or a cat most of the time, which was already a great achievement starting from zero.
My problem now is to save and load this model.
I have tried to use WriteBinaryProto and ReadBinaryProto on a .pb file, but from my understanding that only saves the "architecture" of the model, i.e. its composition?
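For reference, here is roughly what I am doing (a sketch; the file name is just an example and "root" is the tensorflow::Scope holding my graph). As far as I can tell this only serializes the GraphDef, not the variable values:

#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/platform/env.h"

// Serialize the graph structure to disk. This captures the ops/architecture only.
tensorflow::GraphDef graph_def;
TF_CHECK_OK(root.ToGraphDef(&graph_def));
TF_CHECK_OK(tensorflow::WriteBinaryProto(tensorflow::Env::Default(), "model.pb", graph_def));

// Read it back later: the ops are restored, but the trained weights are not.
tensorflow::GraphDef loaded_def;
TF_CHECK_OK(tensorflow::ReadBinaryProto(tensorflow::Env::Default(), "model.pb", &loaded_def));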
I have read about freezing a model to save the weights, or I don't really know what precisely… The trained part, I assume. (If someone can clarify, it would be much appreciated.)
But freezing a model seems to be a thing of the past, at least for TF2 with Python.
So I am not sure: is freezing a model still the way to go in TF2 with the C++ API? If not, can someone describe what I should do, or at least give me ideas/explanations of how it works now? I am a bit dry on this one.
I also read about checkpoints or something like that, but did not manage to grasp them or use them.
To conclude, I also saw references to tensorflow::ops::Restore and tensorflow::ops::Save, but again no example, and I am having trouble making them work.
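For what it's worth, here is roughly what I was attempting with those two ops (only a sketch reconstructed from TF1.x-style examples; the variable, the stored name "w" and the file path are just placeholders, and I am not sure this is the intended usage):

#include "tensorflow/cc/client/client_session.h"
#include "tensorflow/cc/ops/standard_ops.h"

using namespace tensorflow;
using namespace tensorflow::ops;

Scope root = Scope::NewRootScope();
auto w = Variable(root, {2, 2}, DT_FLOAT);                            // stand-in for a trained weight
auto init_w = Assign(root, w, Const(root, {{1.f, 2.f}, {3.f, 4.f}}));  // stand-in for training

// 1-D string tensor listing the names under which the values are stored in the file.
Tensor names(DT_STRING, TensorShape({1}));
names.vec<tstring>()(0) = "w";

// Save writes the listed tensors to the checkpoint; Restore reads one back by name.
auto save = Save(root, std::string("weights.ckpt"), Input(names), {Output(w)});
auto restored = Restore(root, std::string("weights.ckpt"), std::string("w"), DT_FLOAT);
auto put_back = Assign(root, w, restored);

ClientSession session(root);
TF_CHECK_OK(session.Run({}, {}, {init_w.operation}, nullptr));    // give w a value
TF_CHECK_OK(session.Run({}, {}, {save.operation}, nullptr));      // run the save op
TF_CHECK_OK(session.Run({}, {}, {put_back.operation}, nullptr));  // restore into w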
In the end, I find myself with three ideas but nothing that I managed to use, haha.
I need to do as much as possible in C++ because we need a compiled solution, so end users can't play with the code.
I have found examples of CNNs trained and saved with the C++ API in TF1.x using a frozen graph and checkpoints. From my understanding, frozen graphs are not the way to go in TF2.x, but checkpoints might be. Anyway, I don't see why the possibility to save a model and its parameters would have been totally removed. There has to be a solution.
Yes, I know that! Except it is… I managed it this morning by cheating a bit, eheh.
As I said, I read about freezing models in earlier TF1.x versions. This no longer works, as it appears the feature is not even included in the "build from source" path. BUT I changed the header and the corresponding source file I found in the git repository a bit (it seems they are not built anymore but are still there in the TF2.x repo), and instead of trying to build them from source… I added those two files directly to my C++ project.
Surprise surprise, it works. I can train, save, load and run inference without any trouble using only C++.
Still, as a developer, it feels a lot like cheating, and that can't be good practice… Like, no way. I can't believe they removed such an important feature that was working, and certainly not without providing another way to do it… That would be very odd.
Of course, I am still open to any proposition/solution that could load and save using the TF2.x C++ library without tricks!
Anyway, thanks for your help, Bhack. It was not what I was looking for in this case, but it is very nice of you to suggest other possibilities. Oh, and… yes, obfuscation is a possibility, but we are not very confident in the security it provides…
If you can load a SavedModel and run a particular signature, that should be all you need to make this work in a less hacky way. Follow the "On-Device Training" tutorial and just skip the "convert to TensorFlow Lite" part. In Python you build a model with signatures like "initialize", "train_step", "save", "load", and "inference", and then in your target environment you call those as needed.
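On the C++ side that looks roughly like this (a sketch; the export directory, the "train_step" signature key and the "x"/"loss" keys are placeholders that have to match whatever you exported from Python):

#include "tensorflow/cc/saved_model/loader.h"
#include "tensorflow/cc/saved_model/tag_constants.h"

// Load the SavedModel exported from Python.
tensorflow::SavedModelBundle bundle;
TF_CHECK_OK(tensorflow::LoadSavedModel(tensorflow::SessionOptions(), tensorflow::RunOptions(),
                                       "/path/to/saved_model",
                                       {tensorflow::kSavedModelTagServe}, &bundle));

// Look up one of the exported signatures and resolve its graph tensor names.
const auto& sig = bundle.meta_graph_def.signature_def().at("train_step");
const std::string input_name = sig.inputs().at("x").name();
const std::string output_name = sig.outputs().at("loss").name();

// Build an input tensor with whatever shape/dtype your signature expects.
tensorflow::Tensor x(tensorflow::DT_FLOAT, tensorflow::TensorShape({1, 224, 224, 3}));

std::vector<tensorflow::Tensor> outputs;
TF_CHECK_OK(bundle.session->Run({{input_name, x}}, {output_name}, {}, &outputs));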
From what I am seeing in the tutorial, it is using checkpoints. That is a solution I tried without success… I did not manage to make it work in C++.
I found something like this during my research, which seems pretty close to the tutorial…
- saver_def.filename_tensor_name is supposed to be the name of the tensor you must feed with a filename when saving/restoring.
- saver_def.restore_op_name is supposed to be the name of the target operation you must run when restoring.
- saver_def.save_tensor_name is supposed to be the name of the target operation you must run when saving.
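To make that concrete, here is roughly how I understand those names would be used from C++ (only a sketch; "session" is my already-created tensorflow::Session with the graph loaded, "saver_def" is the SaverDef I pulled out of the exported MetaGraphDef, and the checkpoint path is just an example):

#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/protobuf/saver.pb.h"
#include "tensorflow/core/public/session.h"

// Scalar string tensor holding the checkpoint prefix, fed into the filename tensor.
tensorflow::Tensor ckpt_path(tensorflow::DT_STRING, tensorflow::TensorShape());
ckpt_path.scalar<tensorflow::tstring>()() = "my_model.ckpt";

// Saving: feed the filename tensor and fetch the save target.
std::vector<tensorflow::Tensor> unused;
TF_CHECK_OK(session->Run({{saver_def.filename_tensor_name(), ckpt_path}},
                         {saver_def.save_tensor_name()}, {}, &unused));

// Restoring: feed the filename tensor and run the restore target op.
TF_CHECK_OK(session->Run({{saver_def.filename_tensor_name(), ckpt_path}},
                         {}, {saver_def.restore_op_name()}, nullptr));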
But something was not working: no .ckpt files were created. Maybe I should try to replace the op names with the signatures you suggested… I don't know, because I did not find any tips on this matter. I will try, but with little hope, haha.
Dear markdaoust,
I am trying to implement on-device training by invoking my train.tflite using tensorflowlite_jni.so.
I added these two libraries in the CMakeLists.txt file:
add_library( tensorflowlite_jni SHARED IMPORTED )
set_target_properties( tensorflowlite_jni PROPERTIES IMPORTED_LOCATION ${JNI_DIR}/${ANDROID_ABI}/libtensorflowlite_jni.so )
I used the following call to invoke the train signature in my train.tflite file:
TfLiteSignatureRunnerInvoke(train_model_info.signature_info[2].runner);
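For context, the way I obtain that signature runner is roughly the following (a sketch; the model path, the "train" signature key and the input name "x" are specific to my model, and depending on the TFLite version the signature-runner functions may be declared in c_api_experimental.h instead):

#include "tensorflow/lite/c/c_api.h"

// Load the flatbuffer model and build an interpreter.
TfLiteModel* model = TfLiteModelCreateFromFile("train.tflite");
TfLiteInterpreterOptions* options = TfLiteInterpreterOptionsCreate();
TfLiteInterpreter* interpreter = TfLiteInterpreterCreate(model, options);

// Get the runner for the "train" signature and allocate its tensors.
TfLiteSignatureRunner* runner = TfLiteInterpreterGetSignatureRunner(interpreter, "train");
TfLiteSignatureRunnerAllocateTensors(runner);

// Copy a training batch into the signature's input tensor, then invoke.
float batch[28 * 28] = {0};  // dummy buffer; size/shape must match the exported signature
TfLiteTensor* x = TfLiteSignatureRunnerGetInputTensor(runner, "x");
TfLiteTensorCopyFromBuffer(x, batch, sizeof(batch));
TfLiteSignatureRunnerInvoke(runner);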
However, I encountered the following error:
Select TensorFlow op(s), included in the given model, is(are) not supported by this interpreter.
Make sure you apply/link the Flex delegate before inference.
For the Android, it can be resolved by adding “org.tensorflow:tensorflow-lite-select-tf-ops” dependency.
Node number 1409 (FlexBroadcastGradientArgs) failed to prepare.
I am unable to use libtensorflowlite_flex_jni.so to support this operation.
If I directly remove libtensorflowlite_jni.so from my project, I encounter the following error:
undefined reference to `TfLiteSignatureRunnerInvoke’
I would like to know how to use these two libraries together when using the C API for on-device training. How can I make libtensorflowlite_flex_jni.so provide the operations that libtensorflowlite_jni.so does not support?