Question on pre-training for medical tasks

NicoleMcNally · November 11, 2025, 1:22pm

In your technical report, you state that mixing in general-purpose data during the ‘Vision Encoder Enhancement’ and ‘Multimodal Decoder Pretraining’ stages was crucial ‘to preserve the visual-language reasoning capabilities’ and to ensure MedGemma retained ‘strong general-purpose capabilities’. I’m curious whether, during development, you explored or considered training the models exclusively on medical data. If so, what trade-offs did you observe or anticipate?

Specifically, would such an approach lead to a broader degradation of ‘overall reasoning capabilities,’ or would the impact be more nuanced, primarily affecting the model’s ability to reason about ‘general’ visual-language tasks while potentially enhancing ‘specialized’ medical reasoning?
