```python
from multiprocessing import Process

processes = []
for _ in range(3):
    p = Process(target=test_parallel, args=args, kwargs=kwargs)
    p.start()
    processes.append(p)
for process in processes:
    process.join()
```
with the expected output:

```
here1
here1
here1
here2
here2
here2
```
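For context, here is a self-contained version of the snippet above. The real `test_parallel`, `args`, and `kwargs` are not shown in this issue, so the function body below is a hypothetical stand-in that just prints the two markers around a short sleep:

```python
import time
from multiprocessing import Process

def test_parallel(*args, **kwargs):
    # Hypothetical stand-in: print a marker, do some "work", print again.
    print("here1", flush=True)
    time.sleep(1)
    print("here2", flush=True)

if __name__ == "__main__":
    processes = []
    for _ in range(3):
        p = Process(target=test_parallel)
        p.start()
        processes.append(p)
    for process in processes:
        process.join()
```

Because the three processes run concurrently, all `here1` lines appear before any `here2` line.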
## Issue
Both multiprocessing and joblib seem to have issues in combination with TensorFlow, so I'm looking for a TensorFlow-native alternative or another working solution.
## More extensive explanation
I have a few objects of a class that holds a TensorFlow model as the property `self.model`. Assuming I want to run a certain method, `simulate`, on each object, this results in something like the following.
### Using joblib

```python
# raises an error because obj is not pickle-able and appears in the args
Parallel(n_jobs=-1)(delayed(obj.simulate)(obj, args, kwargs) for obj in list_of_objects)
```
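A common workaround for the pickling error (an assumption on my part, not from the original post) is to avoid shipping the object at all: pass only picklable state to a module-level worker function and rebuild the object, with its model, inside each worker. A minimal sketch using the standard library's `ProcessPoolExecutor`; `Simulator` and `simulate_worker` are hypothetical names, and the same pattern applies to joblib's `Parallel`/`delayed`:

```python
from concurrent.futures import ProcessPoolExecutor

class Simulator:
    """Hypothetical stand-in for the class holding an un-picklable self.model."""
    def __init__(self, config):
        self.config = config
        # A lambda is not picklable, much like a live TensorFlow model.
        self.model = lambda x: x * 2

    def simulate(self):
        return self.model(self.config)

def simulate_worker(config):
    # Only the picklable config crosses the process boundary; the object
    # (and its model) is rebuilt inside the worker process.
    return Simulator(config).simulate()

if __name__ == "__main__":
    configs = [1, 2, 3]
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(simulate_worker, configs))
    print(results)  # [2, 4, 6]
```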
### Using multiprocessing

I've run into deadlocks when using TensorFlow functionality with more than a single process. A more extensive explanation of the issue is given here.
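A frequently suggested mitigation for such deadlocks (my assumption, not from the original post) is to use the `spawn` start method and import TensorFlow only inside the worker, so each child gets a fresh interpreter rather than a forked copy of the parent's threads and locks:

```python
import multiprocessing as mp

def worker(i):
    # Import heavyweight, thread-using libraries here, inside the child:
    # import tensorflow as tf  # would go here in the real workflow
    return i * i

if __name__ == "__main__":
    ctx = mp.get_context("spawn")  # fresh interpreter per child process
    with ctx.Pool(processes=3) as pool:
        print(pool.map(worker, range(3)))
```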
## Context / first attempt
I have a workflow in which I use TensorFlow functions, for example the `@tf.function` decorator, to speed up my matrix computations. Now I want to extend this workflow to parallel computation. A small example and (bad) first attempt using the `test_parallel` function defined earlier:
```python
strategy = tf.distribute.MirroredStrategy()
idx = [i for i in range(3)]
dataset = tf.data.Dataset.from_tensor_slices(idx)
dist_dataset = strategy.experimental_distribute_dataset(dataset)

with strategy.scope():
    for x in dist_dataset:
        strategy.run(test_parallel, args=(x, args, kwargs))
```
However, the output here is of the form

```
here1
here2
here1
here2
here1
here2
```

instead of

```
here1
here1
here1
here2
here2
here2
```
and hence my attempt clearly failed.
## Conclusion
In general, I want to run tasks that use TensorFlow functionality in parallel across multiple processes. I was wondering whether TensorFlow offers something like this that I have overlooked. Another solution would be a working setup with joblib or Python's multiprocessing library.
Thank you for working on TensorFlow,
In this issue, the mirrored strategy in TensorFlow runs sequentially: `strategy.run` is invoked once per element, and the elements are processed one after another. The other example, with Python multiprocessing, is much closer to true parallel processing.

To achieve this in TensorFlow, please configure multiple GPUs and multiple workers for the desired output. Here is the gist.
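For reference, multi-worker setup in TensorFlow is driven by the `TF_CONFIG` environment variable, which `tf.distribute.MultiWorkerMirroredStrategy` reads at construction time. A sketch of a two-worker configuration on one machine; the ports are illustrative:

```python
import json
import os

# Hypothetical two-worker cluster on localhost; each worker process
# sets its own task index before creating the strategy.
os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {"worker": ["localhost:12345", "localhost:12346"]},
    "task": {"type": "worker", "index": 0},  # use index 1 on the second worker
})

# strategy = tf.distribute.MultiWorkerMirroredStrategy()  # reads TF_CONFIG
```

Each worker runs the same program with only the `task.index` differing, and the strategy coordinates them over the listed addresses.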