MedGemma finetuning - padding and labels' masking

Hi, I’ve got a few questions about the MedGemma finetuning Colab:

1- In the collate function, when creating the labels tensor, only the image tokens are masked, not the entire input prompt. This means that instead of training on the completion only, the model is finetuned on both the prompt and the answer. What is the technical reason behind this?
2- In generation mode, the tokenizer’s padding side is set to the left. Why does this change between training and inference, since padding tokens are masked in the attention mask anyway? Should we do the same for validation?

Thank you for your help!

1- The notebook follows the default behavior of the SFTTrainer, where the model is trained on both the prompt and the answer, which is a standard approach for language modeling. You could also decide to train on completions only by using a prompt-completion dataset. Note that as of TRL v0.22.0 there is native support for training vision-language models with both of these approaches, without the need for a custom data collator. You can learn more in the SFT Trainer docs.
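
If you do want completion-only loss with a custom collator, the key change is to also mask the prompt positions in the labels with -100. Here is a minimal, text-only sketch of the two labeling strategies; it ignores image tokens, and the checkpoint name and prompt/completion strings are placeholders, not the notebook's exact code:

```python
from transformers import AutoTokenizer

# Placeholder checkpoint; substitute the tokenizer/processor used in the notebook.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

prompt = "Describe the finding in this chest X-ray.\n"
completion = "There is a right lower lobe consolidation."

full_ids = tokenizer(prompt + completion, return_tensors="pt")["input_ids"]

# Default SFT-style labels: loss on both prompt and completion
# (what the notebook does, apart from the image-token masking omitted here).
labels_full = full_ids.clone()

# Completion-only labels: mask the prompt positions with -100 so the loss
# is computed on the answer tokens only.
prompt_len = len(tokenizer(prompt)["input_ids"])  # includes BOS; assumes no token merge at the prompt/completion boundary
labels_completion_only = full_ids.clone()
labels_completion_only[:, :prompt_len] = -100
```

With the prompt-completion dataset route mentioned above, the trainer takes care of this masking for you, which is why it is the simpler option.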

2- During inference, the padding side needs to be set to the left because the model isn’t trained to continue generating from pad tokens. During training (including validation that runs during training), the padding side should be set to the right when using the SFTTrainer, due to an observed issue: a potential overflow when training a model in half precision that results in zero loss.
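
In practice this just means flipping the tokenizer’s `padding_side` depending on the mode. A minimal sketch (the checkpoint name is a placeholder for the one used in the notebook):

```python
from transformers import AutoProcessor

# Placeholder checkpoint; substitute the MedGemma checkpoint from the notebook.
processor = AutoProcessor.from_pretrained("google/medgemma-4b-it")

# Training, and validation that runs during training with the SFTTrainer:
# right padding, to avoid the half-precision overflow issue mentioned above.
processor.tokenizer.padding_side = "right"

# Batched generation: left padding, so every sequence ends with real tokens and
# the model continues from them rather than from pad tokens (which are masked in
# attention but would still occupy the final positions with right padding).
processor.tokenizer.padding_side = "left"
```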

Hope this helps!