I need to implement LayerDrop in a TensorFlow Transformer. Can someone guide me on how to do that?
Reference paper: https://arxiv.org/pdf/1909.11556.pdf
Thanks!!
You can take a look at the Hugging Face implementation in various TF models.
E.g. TFBert
Hello @Bhack,
thanks for your reply. I have tried implementing it this way, but the implementation does not work when the train step is decorated with tf.function(...),
since the variables for some layers are not created during the 1st step, and tf.function(...)
doesn’t allow creating variables in later steps.
Just a side note: this solution works perfectly in eager mode, though.
Do you get an explicit error in graph mode (tf.function)?
Yes. I will share the complete message here in a minute.
Here is the error message:
Traceback (most recent call last):
File "main.py", line 93, in <module>
main(args)
File "main.py", line 82, in main
verbose="auto",
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py", line 1183, in fit
tmp_logs = self.train_function(iterator)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 889, in __call__
result = self._call(*args, **kwds)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 950, in _call
return self._stateless_fn(*args, **kwds)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 3022, in __call__
filtered_flat_args) = self._maybe_define_function(args, kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 3444, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 3289, in _create_graph_function
capture_by_value=self._capture_by_value),
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py", line 999, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 672, in wrapped_fn
out = weak_wrapped_fn().__wrapped__(*args, **kwds)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py", line 986, in wrapper
raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:855 train_function *
return step_function(self, iterator)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:845 step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:1285 run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:2833 call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:3608 _call_for_each_replica
return fn(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:838 run_step **
outputs = model.train_step(data)
/content/gsoc-wav2vec2/src/wav2vec2/modeling.py:236 train_step
self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:636 apply_gradients
self._create_all_weights(var_list)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:823 _create_all_weights
self._create_slots(var_list)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/optimizer_v2/adam.py:124 _create_slots
self.add_slot(var, 'm')
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:913 add_slot
initial_value=initial_value)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/variables.py:262 __call__
return cls._variable_v2_call(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/variables.py:256 _variable_v2_call
shape=shape)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/variables.py:67 getter
return captured_getter(captured_previous, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:3523 creator
return next_creator(**kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/variables.py:67 getter
return captured_getter(captured_previous, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:3523 creator
return next_creator(**kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/variables.py:67 getter
return captured_getter(captured_previous, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:3523 creator
return next_creator(**kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/variables.py:67 getter
return captured_getter(captured_previous, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py:769 invalid_creator_scope
"tf.function-decorated function tried to create "
ValueError: tf.function-decorated function tried to create variables on non-first call.
This PR will probably have some impact on your case:
https://github.com/tensorflow/tensorflow/pull/49310
But I haven’t tested your specific case with this new flag.
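For anyone hitting the same error, here is one possible workaround, as a minimal sketch that has not been tested against the wav2vec2 code from the traceback. The idea is to avoid a Python-level `random` check (which is evaluated only once, at trace time) and instead make the drop decision inside the graph with `tf.random.uniform` and `tf.cond`, after all layer variables have been created up front. `LayerDropStack`, its residual `Dense` blocks, and `train_step` are hypothetical names chosen for illustration; they are not part of TensorFlow or of the Hugging Face models.

```python
import tensorflow as tf


class LayerDropStack(tf.keras.layers.Layer):
    """Hypothetical stack of residual Dense blocks with LayerDrop."""

    def __init__(self, num_layers=4, units=8, layerdrop=0.5, **kwargs):
        super().__init__(**kwargs)
        self.layerdrop = layerdrop
        self.blocks = [
            tf.keras.layers.Dense(units, activation="relu")
            for _ in range(num_layers)
        ]

    def build(self, input_shape):
        # Create the variables of every block up front, so none of them
        # has to be created in a later tf.function call.
        # Assumption: the last input dimension equals `units` (residual add).
        for block in self.blocks:
            block.build(input_shape)
        super().build(input_shape)

    def call(self, x, training=False):
        for block in self.blocks:
            if training:
                # The drop decision is a tensor, re-sampled on every step.
                drop = tf.random.uniform([]) < self.layerdrop
                x = tf.cond(
                    drop,
                    lambda h=x: h,                       # skip the block
                    lambda h=x, b=block: h + b(h),       # apply the block
                )
            else:
                x = x + block(x)
        return x


layer = LayerDropStack()
optimizer = tf.keras.optimizers.Adam(1e-3)

# Build the layer eagerly once (training=False runs every block).
layer(tf.zeros([2, 8]))


@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        pred = layer(x, training=True)
        loss = tf.reduce_mean(tf.square(pred - y))
    grads = tape.gradient(loss, layer.trainable_variables)
    optimizer.apply_gradients(zip(grads, layer.trainable_variables))
    return loss


x = tf.random.normal([2, 8])
y = tf.random.normal([2, 8])
losses = [float(train_step(x, y)) for _ in range(5)]
```

Because both branches of `tf.cond` are traced on the first call, every block should be part of the graph from the start, so the optimizer can create all of its slot variables on the first call instead of on a later one. In a `tf.keras.Model` with a custom `train_step`, the same pattern would apply to the encoder layers, assuming `training` is passed as a Python bool.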