I am using the few_shot_od_training example (Google Colab) to try to detect multiple objects in a picture.
I am using the default rubber duck images, plus one extra image containing 6 rubber ducks. After labeling all the ducks, converting them into a TFRecord, and loading the TFRecord file into my training script, I try to train the model.
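For context, my training loop follows the colab and hands the ground truth to the model roughly like this (a simplified sketch; the variable names and dummy data are illustrative, `detection_model` is the restored EfficientDet-D2 model, and in my real code the arrays come from the decoded TFRecord):

```python
import numpy as np
import tensorflow as tf

num_classes = 1  # only the rubber-duck class

# Illustrative dummy data: one image with 1 duck, one image with 6 ducks.
train_image_boxes = [np.zeros((1, 4), np.float32), np.zeros((6, 4), np.float32)]
train_image_classes = [np.zeros((1,), np.int32), np.zeros((6,), np.int32)]

# One list entry per training image, as in the colab.
gt_boxes_list = [tf.convert_to_tensor(b, dtype=tf.float32) for b in train_image_boxes]
gt_classes_list = [
    tf.one_hot(tf.convert_to_tensor(c, dtype=tf.int32), num_classes)
    for c in train_image_classes
]
detection_model.provide_groundtruth(
    groundtruth_boxes_list=gt_boxes_list,
    groundtruth_classes_list=gt_classes_list)
```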
Training works if I leave out the image with 6 ducks and stick to 1 duck per image. But as soon as I add the 6-duck image, I get the following error:
```
./custom_training.py:225 train_step_fn *
    losses_dict = model.loss(prediction_dict, shapes)
/home/hoster/.local/lib/python3.8/site-packages/object_detection/meta_architectures/ssd_meta_arch.py:842 loss *
    (batch_cls_targets, batch_cls_weights, batch_reg_targets,
/home/hoster/.local/lib/python3.8/site-packages/object_detection/meta_architectures/ssd_meta_arch.py:1044 _assign_targets *
    groundtruth_boxlists = [
/home/hoster/.local/lib/python3.8/site-packages/object_detection/core/box_list.py:56 __init__ **
    raise ValueError('Invalid dimensions for box data: {}'.format(
ValueError: Invalid dimensions for box data: (1, 6, 4)
```
Here are the label and bounding-box lists:
```
Labels: [<tf.Tensor: shape=(1, 6), dtype=float32, numpy=array([[1., 0., 0., 0., 0., 0.]], dtype=float32)>, <tf.Tensor: shape=(1, 6), dtype=float32, numpy=array([[1., 0., 0., 0., 0., 0.]], dtype=float32)>, <tf.Tensor: shape=(1, 6), dtype=float32, numpy=array([[1., 1., 1., 1., 1., 1.]], dtype=float32)>]
Bounding boxes: [<tf.Tensor 'groundtruth_boxes_list:0' shape=(1, 6, 4) dtype=float32>, <tf.Tensor 'groundtruth_boxes_list_1:0' shape=(1, 6, 4) dtype=float32>, <tf.Tensor 'groundtruth_boxes_list_2:0' shape=(1, 6, 4) dtype=float32>]
```
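Looking at `box_list.py` from the traceback, `BoxList` seems to require each boxes tensor to be rank-2 with shape `[N, 4]`, while my tensors carry an extra leading batch dimension, e.g. `(1, 6, 4)`. A minimal reproduction of the check (assuming the standard `object_detection` package):

```python
import tensorflow as tf
from object_detection.core import box_list

boxes = tf.zeros([1, 6, 4], dtype=tf.float32)  # same rank-3 shape as my tensors
# box_list.BoxList(boxes)  # ValueError: Invalid dimensions for box data: (1, 6, 4)
box_list.BoxList(tf.squeeze(boxes, axis=0))  # rank-2 (6, 4) is accepted
```

So I suspect the extra leading dimension is the problem, but I am not sure how the colab expects ground truth for images with multiple objects to be shaped.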
How can I make the model accept multiple labels in an image?
I am using the EfficientDet-D2 model.
Thanks for any help!