Hello, I have been stuck on my project for more than a month. I hope I can get some guidance from experienced people.
I. My hardware
CPU: Intel Xeon E5-2678 v3 @ 2.50 GHz
RAM: 256 GB
GPU: 2 x GeForce RTX 2080 Ti, 11 GB each
II. Model overview
    x1                  x2
     |                   |
  backbone           backbone
     \                   /
      -- concatenation --
              |
    Fully Connected Layer
              |
           output
III. My code
Data preparation:
import os
import platform

import cv2
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

class Data:
    data = []
    init = False
    datagen = ImageDataGenerator(rescale=1./255.)

    # initialize: each subfolder of `path` holds one image pair plus a label.txt
    def __init__(self, path, img_size=(320, 320)):
        all_file = os.listdir(path)  # take all pair folders
        # load the image pairs and labels
        data1 = []
        data2 = []
        label = []
        for i in all_file:
            # skip hidden files on macOS
            if platform.system() == 'Darwin' and i.startswith('.'):
                continue
            temp_path = os.listdir(path + '/' + i)
            temp_path.pop(temp_path.index('label.txt'))
            f = open(path + '/' + i + '/label.txt', "r")
            label.append(int(f.read()))
            data1.append(cv2.resize(cv2.imread(path + '/' + i + '/' + temp_path[0]), img_size))
            data2.append(cv2.resize(cv2.imread(path + '/' + i + '/' + temp_path[1]), img_size))
        self.data = np.array([data1, data2])
        self.label = np.array(label)
        self.init = True

    def load_data_generator(self, b_size):
        if not self.init:
            raise Exception('Data needs to be initialized first')
        # two synchronized generators (same seed, no shuffle) so the pairs stay aligned
        genX1 = self.datagen.flow(x=self.data[0], y=self.label,
                                  batch_size=b_size, shuffle=False, seed=7)
        genX2 = self.datagen.flow(x=self.data[1], y=self.label,
                                  batch_size=b_size, shuffle=False, seed=7)
        while True:
            X1i = genX1.next()
            X2i = genX2.next()
            yield ([X1i[0], X2i[0]], X2i[1])
I have experience with ImageDataGenerator for a single input, but with two inputs I am still confused about how to prepare the data for this model. I would appreciate any advice on this problem.
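One idea I am considering instead of the paired generators (a minimal sketch; make_dataset and the explicit /255.0 rescale are my own names, mirroring the ImageDataGenerator above) is to build a single tf.data.Dataset that yields ((x1, x2), y), which Keras accepts directly for a two-input model:

import tensorflow as tf

# sketch: one dataset yielding ((x1, x2), y) batches for the two-input model
def make_dataset(pairs, labels, batch_size):
    # pairs has shape (2, N, H, W, 3), as built by the Data class above
    x1 = pairs[0].astype('float32') / 255.0  # same rescale as the generator
    x2 = pairs[1].astype('float32') / 255.0
    ds = tf.data.Dataset.from_tensor_slices(((x1, x2), labels))
    return ds.shuffle(len(labels)).batch(batch_size).prefetch(tf.data.AUTOTUNE)

Unlike the infinite Python generator, this dataset is finite, so model.fit(make_dataset(data.data, data.label, 16), epochs=20) would not need steps_per_epoch.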
Model
My Python: 3.11.4
My TensorFlow: 2.13 (WSL: Ubuntu)
I installed TensorFlow following https://www.tensorflow.org/install/pip
My model code:
import tensorflow as tf
from tensorflow.keras.applications import ResNet101
from tensorflow.keras.layers import concatenate, Dense, Flatten
from tensorflow.keras.models import Model

if __name__ == '__main__':
    print(tf.__version__)
    strategy = tf.distribute.MirroredStrategy()
    print('Number of devices: {}'.format(strategy.num_replicas_in_sync))
    print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
    with strategy.scope():
        resnet_1 = ResNet101(input_shape=(320, 320, 3),
                             include_top=False,
                             weights=None)
        resnet_2 = ResNet101(input_shape=(320, 320, 3),
                             include_top=False,
                             weights=None)
        x = resnet_1.layers[-2].output
        y = resnet_2.layers[-2].output
        # fix duplicate layer names between the two branches
        for layer in resnet_1.layers:
            layer._name = layer.name + '_1'
        for layer in resnet_2.layers:
            layer._name = layer.name + '_2'
        # combine the output of the two branches
        combined = concatenate([x, y])
        # apply an FC layer and then a regression prediction on the
        # combined outputs
        z = Flatten()(combined)
        z = Dense(8, activation="relu")(z)
        z = Dense(1, activation="sigmoid")(z)
        # the model accepts the inputs of the two branches and
        # outputs a single value
        model = Model(inputs=[resnet_1.input, resnet_2.input], outputs=z)
        model.compile(loss=tf.keras.losses.BinaryCrossentropy(), optimizer='adam')
    data = Data('dataset')  # placeholder path to the folder of pair subfolders
    # note: batch_size here is ignored because the input is a generator;
    # the effective batch size is the 100 passed to load_data_generator
    model.fit(data.load_data_generator(100), batch_size=16, epochs=20)
And it threw these errors:
3 root error(s) found.
(0) RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[50,256,80,80] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node model/conv2_block3_3_conv_2/Conv2D}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
[[update_0/AssignAddVariableOp/_927]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
[[div_no_nan/ReadVariableOp_3/_912]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
(1) RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[50,256,80,80] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node model/conv2_block3_3_conv_2/Conv2D}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
[[update_0/AssignAddVariableOp/_927]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
(2) RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[50,256,80,80] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node model/conv2_block3_3_conv_2/Conv2D}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_217696]
2023-09-14 01:54:14.566063: W tensorflow/core/kernels/data/generator_dataset_op.cc:108] Error occurred when finalizing GeneratorDataset iterator: FAILED_PRECONDITION: Python interpreter state is not initialized. The process may be terminated.
[[{{node PyFunc}}]]
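From the shape [50, 256, 80, 80] in the error, it looks like the global batch of 100 from load_data_generator(100) is split into 50 samples per GPU, and the batch_size=16 passed to model.fit is ignored for generator input. A minimal sketch of what I could try, assuming the Data class and model above (steps_per_epoch is needed because the generator loops forever):

# enable on-demand GPU memory allocation; must run before any other GPU op
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

# use a much smaller global batch: with 2 replicas, each GPU then
# processes 8 samples per step instead of 50
batch = 16
model.fit(data.load_data_generator(batch),
          steps_per_epoch=len(data.label) // batch,
          epochs=20)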
I hope I can get some help; this project is important to me. Thanks for reading my post.