Describe the bug
While using TensorFlow Object Detection API, I’m experiencing an issue with a pre-trained Mask R-CNN Inception ResNet V2 1024x1024 model. When attempting to fine-tune this model for my custom task, I receive an error regarding missing variables even though the specified checkpoint seems to contain the appropriate parameters for this model.
To Reproduce
Steps to reproduce the behavior:
- Download the pre-trained Mask R-CNN Inception ResNet V2 1024x1024 model from the TensorFlow Model Zoo.
- Set up a custom training pipeline configuration, specifying the path to the downloaded checkpoint in the fine_tune_checkpoint field.
- Run the model training script (model_main_tf2.py).
- The error appears indicating some variables from the checkpoint are not found in the model.
Traceback (most recent call last): File "/content/models/research/object_detection/model_main_tf2.py", line 114, in <module> tf.compat.v1.app.run() File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/platform/app.py", line 36, in run _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef) File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 308, in run _run_main(main, args) File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 254, in _run_main sys.exit(main(argv)) File "/content/models/research/object_detection/model_main_tf2.py", line 105, in main model_lib_v2.train_loop( File "/usr/local/lib/python3.10/dist-packages/object_detection/model_lib_v2.py", line 605, in train_loop load_fine_tune_checkpoint( File "/usr/local/lib/python3.10/dist-packages/object_detection/model_lib_v2.py", line 398, in load_fine_tune_checkpoint raise ValueError('Checkpoint version should be V2') ValueError: Checkpoint version should be V2
Expected behavior
I expect the model training to begin by loading weights from the specified pre-trained model. The error seems to suggest a mismatch between the model architecture defined in my pipeline and the architecture of the pre-trained model. Still, my pipeline configuration appears to be correctly set up for the Mask R-CNN Inception ResNet V2 1024x1024 model.
Desktop (please complete the following information):
- OS: MacOS 13.4 (22F66)
- Browser Safari
- Version 16.5 (18615.2.9.11.4)
N.B: I am using Google Colab Pro
Additional context
Upon inspecting the checkpoint file with inspect_checkpoint.py, it does appear to contain all the expected variables for a Mask R-CNN Inception ResNet V2 1024x1024 model. I also confirmed that the downloaded files include ckpt-0.index, ckpt-0.data-00000-of-00001, and checkpoint. Yet, the issue persists. Any guidance or solutions to this problem would be greatly appreciated.