I have fitted a model with a fairly standard transformer encoder-decoder architecture built from keras_nlp layers. When training it, I provide the data as ([input, target], offset_target): target is the sequence fed into the decoder, and offset_target is the target sequence offset by one step. This is a standard NLP seq2seq model.
How do I correctly call the predict method on the fitted model, given that I have no target sequences at inference time?
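For reference, the setup looks roughly like the following sketch; the layer sizes, names, and hyperparameters here are illustrative placeholders, not my exact code:

```python
import keras
import keras_nlp

VOCAB_SIZE, SEQ_LEN, EMBED_DIM = 5000, 64, 128  # illustrative sizes

# Encoder branch
encoder_inputs = keras.Input(shape=(SEQ_LEN,), dtype="int32", name="encoder_inputs")
x = keras_nlp.layers.TokenAndPositionEmbedding(VOCAB_SIZE, SEQ_LEN, EMBED_DIM)(encoder_inputs)
encoder_outputs = keras_nlp.layers.TransformerEncoder(intermediate_dim=256, num_heads=4)(x)

# Decoder branch (teacher forcing: the un-shifted target is fed in during training)
decoder_inputs = keras.Input(shape=(SEQ_LEN,), dtype="int32", name="decoder_inputs")
y = keras_nlp.layers.TokenAndPositionEmbedding(VOCAB_SIZE, SEQ_LEN, EMBED_DIM)(decoder_inputs)
y = keras_nlp.layers.TransformerDecoder(intermediate_dim=256, num_heads=4)(y, encoder_outputs)
outputs = keras.layers.Dense(VOCAB_SIZE, activation="softmax")(y)

model = keras.Model([encoder_inputs, decoder_inputs], outputs)
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

# Training: x = [input, target], y = target shifted by one step
# model.fit([input_seqs, target_seqs], offset_target_seqs, ...)
```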
Input preparation: For prediction, you only need to provide the encoder input. The decoder will generate the output sequence step by step.
Using the predict method: You’ll need to change how you call predict. Instead of passing the ground-truth target as the decoder input, you pass the encoder input together with the tokens generated so far.
Handling the decoder input: Your model should be set up to handle the absence of decoder input during inference. This is typically done by using a special start token and then feeding back the previous output as the next input.
Generating sequences: If your model doesn’t handle sequence generation automatically, you might need to implement a custom prediction loop; see the sketch after these steps.
Using keras_nlp specific methods: The TransformerEncoder and TransformerDecoder layers themselves don’t generate text for you, but the higher-level keras_nlp.models task models and the keras_nlp.samplers utilities do; check the KerasNLP documentation for a generate method or sampler that handles the decoding loop automatically.
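If none of the built-in utilities fit your model, here is a minimal greedy-decoding sketch. It assumes a two-input model like the one described in the question, a fixed SEQ_LEN matching the training sequence length, and START_TOKEN/END_TOKEN ids from your tokenizer; substitute your own values for all of these:

```python
import numpy as np

START_TOKEN, END_TOKEN = 1, 2  # placeholder ids -- use your tokenizer's special tokens
SEQ_LEN = 64                   # must match the sequence length used during training

def greedy_decode(model, encoder_input):
    """Generate one output sequence for a single encoder input of shape (1, SEQ_LEN)."""
    # Start the decoder input with only the start token, padded to SEQ_LEN.
    decoder_input = np.zeros((1, SEQ_LEN), dtype="int32")
    decoder_input[0, 0] = START_TOKEN

    for i in range(SEQ_LEN - 1):
        # The model was trained on ([input, target], offset_target), so predict
        # takes the same two-input structure; the decoder's causal mask means the
        # zero padding after position i does not affect the prediction at position i.
        probs = model.predict([encoder_input, decoder_input], verbose=0)
        # The output at position i is the distribution over the token at position i + 1.
        next_token = int(np.argmax(probs[0, i, :]))
        decoder_input[0, i + 1] = next_token
        if next_token == END_TOKEN:
            break
    return decoder_input[0]
```

Calling the model directly (model([encoder_input, decoder_input], training=False)) inside the loop is usually faster than repeated model.predict calls, and you can replace the argmax with sampling or beam search; the keras_nlp.samplers utilities can also drive the loop if you wrap the per-step prediction as a next-token callback.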
Additionally, you can also refer to this article.
Hope this helps.