How can I use multiprocessing with a Keras Sequence as training data?
I tried just passing use_multiprocessing=True and workers > 1 to model.fit(), but that fails with errors.
What I want is the following:
Read a huge TFRecords dataset on multiple CPU cores and perform the training on the GPU.
I cannot use tf.data, because as far as I can tell that approach requires reading the whole training dataset into memory.
So I have to pass a generator or a keras.utils.Sequence. Without multiprocessing I got both of these to work.
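Stripped down, my Sequence looks roughly like this (the feature spec and the one-shard-per-batch layout are placeholders standing in for my real setup):

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Placeholder feature spec -- my real records are more complex.
FEATURE_SPEC = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

class TFRecordSequence(keras.utils.Sequence):
    """Yields one batch per TFRecord shard file; nothing is preloaded."""

    def __init__(self, shard_paths):
        self.shard_paths = shard_paths

    def __len__(self):
        return len(self.shard_paths)

    def __getitem__(self, idx):
        images, labels = [], []
        # Read and parse the serialized examples of one shard on demand.
        for raw in tf.compat.v1.io.tf_record_iterator(self.shard_paths[idx]):
            example = tf.io.parse_single_example(raw, FEATURE_SPEC)
            images.append(tf.io.decode_raw(example["image"], tf.float32).numpy())
            labels.append(int(example["label"].numpy()))
        return np.stack(images), np.array(labels)
```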
Whenever I pass use_multiprocessing=True to model.fit(), I get an error.
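The failing call looks roughly like this (model, epoch count, and worker count are placeholders):

```python
model.fit(
    TFRecordSequence(shard_paths),
    epochs=10,
    workers=4,                 # number of processes reading batches
    use_multiprocessing=True,  # removing this makes it train fine
)
```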
I also tried the approach from Speed Up your Keras Sequence Pipeline | by Mudit Bachhawat | Medium, which basically builds a generator from a keras.utils.Sequence running on multiple processes with shared memory. But there as well I get an error when calling process.start().
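My reading of that article's pattern, condensed to the core idea (I'm sketching it with plain queues here rather than the article's shared memory, and the queue sizes and worker count are arbitrary):

```python
import multiprocessing as mp

def _worker(sequence, index_queue, batch_queue):
    # Each worker process computes batches by index and puts them on a queue.
    while True:
        idx = index_queue.get()
        if idx is None:  # poison pill: stop the worker
            break
        batch_queue.put(sequence[idx])

def parallel_generator(sequence, num_workers=4):
    index_queue = mp.Queue()
    batch_queue = mp.Queue(maxsize=2 * num_workers)
    workers = [
        mp.Process(target=_worker, args=(sequence, index_queue, batch_queue))
        for _ in range(num_workers)
    ]
    for p in workers:
        p.start()  # <- this is where my version blows up
    try:
        while True:  # loop forever; fit() stops after steps_per_epoch
            for idx in range(len(sequence)):
                index_queue.put(idx)
            for _ in range(len(sequence)):
                yield batch_queue.get()
    finally:
        for _ in workers:
            index_queue.put(None)
        for p in workers:
            p.join()
```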
Is there a working example of a data pipeline that reads batches from a TFRecords dataset in parallel?
An ideal example would run on Ubuntu with a GPU-enabled TensorFlow build and would let me choose how many CPU cores run the data pipeline.