So far I have been using a Keras ImageDataGenerator with flow_from_directory() to train my Keras model on all images from the image class input folders. Now I want to train on multiple GPUs, so it seems I need to use a TensorFlow Dataset object.
Thus I came up with this solution:
keras_model = build_model()

train_datagen = ImageDataGenerator()
training_img_generator = train_datagen.flow_from_directory(
    input_path,
    target_size=(image_size, image_size),
    batch_size=batch_size,
    class_mode="categorical",
)

train_dataset = tf.data.Dataset.from_generator(
    lambda: training_img_generator,
    output_types=(tf.float32, tf.float32),
    output_shapes=([None, image_size, image_size, 3], [None, len(image_classes)]),
)
# similar for validation_dataset = ...

keras_model.fit(
    train_dataset,
    steps_per_epoch=train_steps_per_epoch,
    epochs=epoch_count,
    validation_data=validation_dataset,
    validation_steps=validation_steps_per_epoch,
)
This seems to work: the model trains as usual. However, when using a mirrored strategy, I get the following warning during training:
AUTO sharding policy will apply DATA sharding policy as it failed to apply FILE sharding policy because of the following reason: Did not find a shardable source, walked to a node which is not a dataset
So I added the following lines between creating the datasets and calling fit():
options = tf.data.Options()
options.experimental_distribute.auto_shard_policy = tf.data.experimental.AutoShardPolicy.DATA
train_dataset.with_options(options)
validation_dataset.with_options(options)
However, I still get the same warning.
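One thing that may be worth checking here: with_options() returns a new dataset instead of modifying the existing one in place, so the two calls above discard their result. The following is only a minimal sketch of that behavior, using a toy range dataset as a stand-in for the generator-based dataset (the stand-in and the variable applied_policy are assumptions for illustration, not part of my actual pipeline):

```python
import tensorflow as tf

# Toy stand-in for the generator-backed dataset (assumption for illustration).
train_dataset = tf.data.Dataset.range(8).batch(2)

options = tf.data.Options()
options.experimental_distribute.auto_shard_policy = (
    tf.data.experimental.AutoShardPolicy.DATA
)

# with_options() returns a new dataset and leaves the original unchanged,
# so the result has to be assigned back to the variable passed to fit().
train_dataset = train_dataset.with_options(options)

# Read the policy back from the dataset to confirm it was applied.
applied_policy = train_dataset.options().experimental_distribute.auto_shard_policy
```

The same reassignment would apply to validation_dataset. This sketch only confirms that the option ends up on the dataset; it does not verify whether the warning disappears under a mirrored strategy.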
This leads me to these two questions:
- What do I need to do in order to get rid of this warning?
- Even more important: Why is TF not able to split the dataset with the default AutoShardPolicy.FILE policy, since I am using thousands of images per class in the input folder?