The MultiHeadAttention documentation states:
> When using `MultiHeadAttention` inside a custom layer, the custom layer must implement its own `build()` method and call `MultiHeadAttention`'s `_build_from_signature()` there.
Is this guidance up to date? I don't see this advice followed in any of the examples that use this layer in the TensorFlow documentation, like this one.
If it is up to date, can anyone share an example? The signature of the build method is `def build(self, input_shape)`, whereas the multi-head attention layer has `def _build_from_signature(self, query, value, key=None)`. How do I get values for `query`, `value`, and `key` from `input_shape`?
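For context, here is a minimal sketch of what I *think* the advice might mean, assuming TF 2.x (where `_build_from_signature` is a private method that accepts either tensors or `TensorShape`s) and a self-attention setup, so that `query` and `value` both come from the same `input_shape`:

```python
import tensorflow as tf

class SelfAttentionBlock(tf.keras.layers.Layer):
    """Custom layer wrapping MultiHeadAttention for self-attention."""

    def __init__(self, num_heads=2, key_dim=16, **kwargs):
        super().__init__(**kwargs)
        self.mha = tf.keras.layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=key_dim
        )

    def build(self, input_shape):
        # Self-attention: query, value (and key, which defaults to value)
        # all share the same shape, so input_shape is passed for both.
        self.mha._build_from_signature(query=input_shape, value=input_shape)
        super().build(input_shape)

    def call(self, x):
        # Query and value are the same tensor in self-attention.
        return self.mha(x, x)
```

I'm not sure this is the intended usage, though, since the method is private (leading underscore), and newer Keras versions may have removed or renamed it.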