My two development environments are (1) Windows 10 with native tf 2.10-gpu and (2) Ubuntu 22.04 with tf 2.12-gpu. The symptoms are the same on both.
As far as I know, a custom loss can be written either as a class or as a function. With the class, I can easily debug the code inside the loss in eager mode by setting 'tf.config.run_functions_eagerly(True)' (I set it back to 'False' for actual training). With the function, however, I could not debug the code inside the loss even with 'tf.config.run_functions_eagerly(True)'; it seems that @tf.function is still active. So for now I design the loss as a class and then convert the class-based code to a function-based loss. The two versions are shown below; the additional input, anchorboxes_xy, is a tf tensor.
class ssd_loss(tf.keras.losses.Loss):
    def __init__(self, anchorboxes_xy, name="ssd_loss"):
        super(ssd_loss, self).__init__(name=name)
        self.anchorboxes_xy = anchorboxes_xy  # external tf tensor
    …
    def call(self, y_true, y_pred):
        y_true_cls_batch, y_true_boxes_batch = y_true[:, :, :1], y_true[:, :, 1:]
        y_pred_cls_batch, y_pred_boxes_batch = y_pred[:, :, :param['n_classes']], y_pred[:, :, param['n_classes']:]
        true_class = tf.zeros([param['batch_size'], param['n_anchors'], 1])
        true_boxes = tf.zeros([param['batch_size'], param['n_anchors'], 4])
        for i in range(param['batch_size']):
            # IoU between y_true_boxes_batch and self.anchorboxes_xy
            obj_loc_idx = tf.where(y_true_cls_batch[i] != 0)[:, 0]
            labels = tf.gather(y_true_cls_batch[i], obj_loc_idx)
            boxes = tf.gather(y_true_boxes_batch[i], obj_loc_idx)
            iou = calc_iou_2D(boxes, self.anchorboxes_xy)
            ...
        # calc loss code
        ...
        total_loss = func(code...)
        return total_loss

    def get_config(self):
        return {'anchorboxes_xy': self.anchorboxes_xy}

    @classmethod
    def from_config(cls, config):
        return cls(**config)
def ssd_loss_func(anchorboxes_xy):
    anchorboxes_xy = anchorboxes_xy  # external tf tensor
    …
    def ssd_loss(y_true, y_pred):
        y_true_cls_batch, y_true_boxes_batch = y_true[:, :, :1], y_true[:, :, 1:]
        y_pred_cls_batch, y_pred_boxes_batch = y_pred[:, :, :param['n_classes']], y_pred[:, :, param['n_classes']:]
        true_class = tf.zeros([param['batch_size'], param['n_anchors'], 1])
        true_boxes = tf.zeros([param['batch_size'], param['n_anchors'], 4])
        for i in range(param['batch_size']):
            # IoU between y_true_boxes_batch and anchorboxes_xy
            obj_loc_idx = tf.where(y_true_cls_batch[i] != 0)[:, 0]
            labels = tf.gather(y_true_cls_batch[i], obj_loc_idx)
            boxes = tf.gather(y_true_boxes_batch[i], obj_loc_idx)
            iou = calc_iou_2D(boxes, anchorboxes_xy)
            ...
        # calc loss code
        ...
        total_loss = func(code...)
        return total_loss
    return ssd_loss
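To narrow down the debugging difference between the two versions, a small check like the one below might help. It is only a sketch: make_loss, eager_flags, the toy Sequential model and the zero inputs are hypothetical stand-ins, not the SSD code above. It records tf.executing_eagerly() from inside a closure-based loss while the model is compiled with run_eagerly=True, which is the Keras-level option for running the training step eagerly.

```python
import tensorflow as tf

eager_flags = []  # filled from inside the loss; hypothetical helper for this check


def make_loss(scale):
    """Closure-based loss with the same outer shape as ssd_loss_func, but a toy body."""
    def loss_fn(y_true, y_pred):
        # Record whether this call runs eagerly or is being traced into a graph.
        eager_flags.append(tf.executing_eagerly())
        return scale * tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)
    return loss_fn


model = tf.keras.Sequential([tf.keras.Input(shape=(3,)), tf.keras.layers.Dense(1)])
# run_eagerly=True asks Keras not to wrap the training step in a tf.function.
model.compile(optimizer="sgd", loss=make_loss(1.0), run_eagerly=True)
model.fit(tf.zeros([4, 3]), tf.zeros([4, 1]), epochs=1, verbose=0)
print(eager_flags)
```

If the last recorded flag is True, the loss body ran eagerly in the training step, so a breakpoint inside loss_fn should be reachable there. If it is False, something else is still putting the loss into graph mode.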
With the class-based custom loss, the following error appears when saving the model. I need to call "model.predict" on the reconstructed model, so I have to use model.save(…).
-> model.save(param['results_dir'] + 'model.h5')
(Pdb) n
TypeError: Unable to serialize [[0. 0. 0.14285715 0.14285715]
[0.14285715 0. 0.2857143 0.14285715]…
[0.71428573 0.85714287 0.85714287 1. ]
[0.85714287 0.85714287 1. 1. ]] to JSON. Unrecognized type <class 'tensorflow.python.framework.ops.EagerTensor'>.
The displayed tensor value is exactly the same as the external "anchorboxes_xy".
Summary
- With the class-based loss, debugging works but model.save() fails
- With the function-based loss, debugging does not work but model.save() works
I would like to use the class-based loss because it is easy to debug, but I'm not sure how to fix the saving error.
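Since the TypeError says that an EagerTensor cannot be written to JSON, one direction that might be worth trying is to have get_config return a plain nested list and to rebuild the tensor in __init__. The following is only a sketch under that assumption: SSDLossSketch and its mean-squared-error call body are hypothetical placeholders, and it is not confirmed here that this resolves model.save() for the full model.

```python
import json

import numpy as np
import tensorflow as tf


class SSDLossSketch(tf.keras.losses.Loss):
    """Hypothetical stand-in for ssd_loss; only the config handling matters here."""

    def __init__(self, anchorboxes_xy, name="ssd_loss_sketch"):
        super().__init__(name=name)
        # Accept either a tensor or the nested list produced by get_config.
        self.anchorboxes_xy = tf.cast(anchorboxes_xy, tf.float32)

    def call(self, y_true, y_pred):
        # Placeholder body (plain MSE), not the SSD loss.
        return tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)

    def get_config(self):
        # Return only JSON-native types: the tensor becomes a nested Python list.
        return {
            "anchorboxes_xy": self.anchorboxes_xy.numpy().tolist(),
            "name": self.name,
        }

    @classmethod
    def from_config(cls, config):
        return cls(**config)


anchors = tf.constant([[0.0, 0.0, 0.5, 0.5], [0.5, 0.5, 1.0, 1.0]])
loss = SSDLossSketch(anchors)
config = loss.get_config()
json_text = json.dumps(config)  # raises TypeError if a tensor is left in the config
restored = SSDLossSketch.from_config(json.loads(json_text))
print(np.allclose(restored.anchorboxes_xy.numpy(), anchors.numpy()))
```

When reloading, the class would presumably still have to be passed through the custom_objects argument of tf.keras.models.load_model. If the reloaded model is only needed for model.predict, loading with compile=False may avoid needing the loss at all.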