Predict method with transformer encoder decoder

I have fitted a model with a fairly standard transformer encoder-decoder architecture built from keras_nlp layers. When fitting the model, I provide the data as ([input, target], offset_target): target is the sequence fed into the decoder, and offset_target is that same sequence offset by one step. This is a standard NLP seq2seq model.
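To make the layout concrete, here is a minimal sketch of the teacher-forcing setup described above (the token IDs are made up for illustration; 1 and 2 stand in for hypothetical [START] and [END] tokens):

```python
# Decoder input: [START] followed by the target tokens.
target        = [1, 17, 42, 99]
# Labels: the same tokens shifted left by one step, ending with [END].
offset_target = [17, 42, 99, 2]

# Each training example pairs the encoder input with the decoder input,
# and the labels are the decoder input shifted by one position:
#   model.fit([encoder_input, target], offset_target)
```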

How do I correctly call the predict method on the fitted model, given that I have no target sequences at inference time?

Hi @acraev ,

Here’s how you can approach this:

  1. Input preparation: For prediction, you only need to provide the encoder input. The decoder will generate the output sequence step by step.

  2. Using the predict method: You’ll need to modify how you call the predict method. Instead of providing both input and target, you’ll only provide the encoder input.

  3. Handling the decoder input: Your model should be set up to handle the absence of decoder input during inference. This is typically done by using a special start token and then feeding back the previous output as the next input.

  4. Generating sequences: If your model doesn’t automatically handle the sequence generation, you might need to implement a custom prediction loop.

  5. Using keras_nlp-specific methods: If you're using keras_nlp layers such as TransformerEncoder and TransformerDecoder, check the documentation for a generate or similar method that handles the decoding process automatically.
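The steps above can be sketched as a greedy autoregressive decoding loop. This is not the keras_nlp API itself; decode_step is a hypothetical stand-in for model.predict([encoder_input, decoder_input]), and the token IDs and vocabulary size are assumptions for illustration:

```python
import numpy as np

START_ID, END_ID, MAX_LEN = 1, 2, 16

def decode_step(encoder_input, decoder_input):
    """Stand-in for model.predict([encoder_input, decoder_input]).
    A real model would return per-position logits of shape
    (batch, seq_len, vocab_size); here we fake them deterministically."""
    vocab_size = 8
    rng = np.random.default_rng(decoder_input.shape[1])
    return rng.random((1, decoder_input.shape[1], vocab_size))

def greedy_decode(encoder_input):
    decoded = [START_ID]                          # seed with the start token
    for _ in range(MAX_LEN):
        logits = decode_step(encoder_input, np.array([decoded]))
        next_id = int(np.argmax(logits[0, -1]))   # token for the last position
        decoded.append(next_id)                   # feed it back as next input
        if next_id == END_ID:                     # stop at the end token
            break
    return decoded

print(greedy_decode(np.array([[5, 6, 7]])))
```

The key point is that predict is called once per generated token, each time with the sequence produced so far as the decoder input, rather than once with a full target sequence.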

Additionally, you can refer to this article.
Hope this helps.

Thank you.