Integrating MedGemma with LangChain: Challenges and Solutions

Hi everyone, this is Zeerak from Spryt.

We’re building a multi-agent healthcare system for appointment scheduling and patient support using LangChain and LangGraph. Specialized agents (scheduling, rescheduling, cancellation, FAQ) collaborate to handle patient interactions for cervical screening appointments.

Our Setup:

  • Multi-agent system using LangChain/LangGraph
  • Currently supports Anthropic (Claude), Google (Gemini), and Anthropic Vertex models
  • Deployed on Google Cloud using dedicated Vertex AI endpoints
  • Production system handling real patient interactions

The Challenge:

We want to integrate MedGemma into our system, but we’re facing several technical hurdles:

1. Dedicated Endpoint Issue

We’ve discovered that LangChain’s existing Gemma integration (GemmaChatVertexAIModelGarden) doesn’t work with dedicated Vertex AI endpoints. The current implementation is designed for the Model Garden but not for custom DNS endpoints like mg-endpoint-xxxx.europe-west4-xxxx.prediction.vertexai.goog.
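For reference, the request shape we’re experimenting with looks roughly like this. All identifiers below are placeholders, and the `/chat/completions` suffix assumes the serving container exposes an OpenAI-compatible route (some deployments only expose `:rawPredict`), so treat the path as an assumption to verify against your endpoint:

```python
from urllib.parse import urlunsplit

def dedicated_endpoint_url(endpoint_dns: str, project: str,
                           region: str, endpoint_id: str) -> str:
    """Compose a chat-completions URL for a dedicated Vertex AI endpoint.

    Assumes an OpenAI-compatible /chat/completions route on the serving
    container; verify this against your deployment before relying on it.
    """
    path = (f"/v1/projects/{project}/locations/{region}"
            f"/endpoints/{endpoint_id}/chat/completions")
    return urlunsplit(("https", endpoint_dns, path, "", ""))

# Placeholder project/endpoint values -- substitute your own.
url = dedicated_endpoint_url(
    "mg-endpoint-1234.europe-west4-5678.prediction.vertexai.goog",
    "my-project", "europe-west4", "1234")
```

The point is that the client must target the endpoint’s own DNS name rather than the regional `aiplatform.googleapis.com` host that the Model Garden integration assumes.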

2. System Message Support

We’ve confirmed that Gemma models don’t support system messages; they only accept a chat format with alternating user/model turns. Our agents rely heavily on system prompts to define behavior, context, and response formats.
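One workaround we’re considering (a common pattern for Gemma-family models, sketched here with plain role/content dicts rather than LangChain message classes) is to fold the system prompt into the first user turn:

```python
def fold_system_message(messages: list[dict]) -> list[dict]:
    """Merge a leading 'system' message into the first 'user' turn,
    since Gemma chat templates only accept alternating user/model roles.

    Messages are plain {'role': ..., 'content': ...} dicts for illustration.
    """
    if not messages or messages[0]["role"] != "system":
        return messages
    system, rest = messages[0], messages[1:]
    if rest and rest[0]["role"] == "user":
        merged = {"role": "user",
                  "content": f"{system['content']}\n\n{rest[0]['content']}"}
        return [merged] + rest[1:]
    # No user turn to merge into: demote the system prompt to a user turn.
    return [{"role": "user", "content": system["content"]}] + rest

msgs = fold_system_message([
    {"role": "system", "content": "You are a scheduling assistant."},
    {"role": "user", "content": "Book me a screening slot."},
])
# msgs is now a single user turn whose content starts with the system text.
```

How well the model keeps following folded instructions over long conversations is exactly the coherence question we’re asking about below.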

3. Tool Calling

Our agents use LangChain’s tool calling framework extensively. Since MedGemma doesn’t have native tool/function calling support, we need to implement a text-based workaround where:

  • Tool definitions are injected into prompts
  • Tool calls are wrapped in special markers (e.g., ```tool_code```)
  • Responses are parsed to extract tool calls and convert them to LangChain’s expected format
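As a sketch of the parsing step, here is one way to extract `tool_code` blocks from model output. The marker name and the Python-style call syntax (e.g. `get_slots(date="2025-07-01")`) are our own conventions from the prompt, not anything MedGemma emits natively:

```python
import ast
import re

TOOL_BLOCK = re.compile(r"```tool_code\s*(.*?)```", re.DOTALL)

def extract_tool_calls(text: str) -> list[dict]:
    """Find ```tool_code``` fenced blocks and parse each Python-style call
    like get_slots(date="2025-07-01") into {'name': ..., 'args': {...}}."""
    calls = []
    for block in TOOL_BLOCK.findall(text):
        for line in block.strip().splitlines():
            line = line.strip()
            if not line:
                continue
            node = ast.parse(line, mode="eval").body
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                calls.append({
                    "name": node.func.id,
                    "args": {kw.arg: ast.literal_eval(kw.value)
                             for kw in node.keywords},
                })
    return calls

reply = 'Let me check.\n```tool_code\nget_slots(date="2025-07-01")\n```'
print(extract_tool_calls(reply))
# → [{'name': 'get_slots', 'args': {'date': '2025-07-01'}}]
```

Using `ast` rather than a second regex keeps the argument parsing robust to quoting and nesting; the resulting dicts would still need to be wrapped into LangChain `ToolCall`/`AIMessage` objects by the model wrapper.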

What We’re Looking For:

  1. Has anyone successfully integrated MedGemma with LangChain using dedicated Vertex AI endpoints?
  2. Are there existing patterns or wrappers for handling tool calling with MedGemma in a LangChain context?
  3. Any best practices for converting system messages + tool definitions into MedGemma’s chat format while maintaining conversation coherence?

We’re planning to build a custom wrapper that:

  • Connects to our dedicated endpoint using the OpenAI client approach
  • Converts LangChain messages (including system and tool messages) to MedGemma format
  • Implements text-based tool calling with reliable parsing
  • Maintains compatibility with LangChain’s agent framework
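As a starting point for the message-conversion step, here is a dependency-free sketch that renders role/content dicts into Gemma’s chat template, merging consecutive same-role turns so the alternation rule holds. The `<start_of_turn>` markers follow the published Gemma chat template, but the exact template should be verified against the deployed MedGemma tokenizer:

```python
def to_gemma_prompt(messages: list[dict]) -> str:
    """Render {'role', 'content'} dicts into Gemma's chat template.

    'assistant' maps to 'model'; 'system' and 'tool' are demoted to 'user'
    turns, and consecutive turns with the same role are merged so the
    user/model alternation that Gemma expects is preserved.
    """
    role_map = {"user": "user", "assistant": "model",
                "system": "user", "tool": "user"}
    turns: list[list[str]] = []  # each entry is [role, content]
    for msg in messages:
        role = role_map[msg["role"]]
        if turns and turns[-1][0] == role:
            turns[-1][1] += "\n" + msg["content"]  # merge same-role turns
        else:
            turns.append([role, msg["content"]])
    rendered = "".join(
        f"<start_of_turn>{role}\n{content}<end_of_turn>\n"
        for role, content in turns)
    return rendered + "<start_of_turn>model\n"  # cue the model's reply

prompt = to_gemma_prompt([
    {"role": "system", "content": "Be concise."},
    {"role": "user", "content": "Hi"},
])
```

In the full wrapper this rendering would sit behind a LangChain chat-model subclass, with the tool-definition injection and response parsing layered on either side of it.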

Any insights, code examples, or similar experiences would be greatly appreciated! Happy to share our solution once we get it working.

Thanks!