Background
I’m building an automated visual book-to-graphic-novel pipeline using n8n workflow automation and the Gemini 3 Pro Image API. I write books and want to convert them into 20-30 page visual graphic novels with consistent characters, style, and visual continuity.
What Works (AI Studio)
In AI Studio, the workflow is seamless:
-
I provide the full book context/storyline upfront
-
I say: “Generate the cover for this book” → AI generates it
-
I say: “Generate page 1” → AI generates it with consistent style and characters
-
I say: “Generate page 2” → AI maintains context of previous pages and style
-
This continues for 20+ pages with perfect consistency
The key: AI Studio seems to maintain persistent visual context throughout the entire conversation without me needing to re-upload reference images or previous pages.
What Doesn’t Work (API + n8n Automation)
When building this via the Gemini API with n8n automation, I cannot replicate this behavior. Here’s what I’ve tried:
Attempt 1: Multi-turn conversation with contents array
-
Following the multi-turn image editing docs
-
ive tried Manually building a
conversationHistoryarray with all previous messages -
also tried attaching each image problem: Hitting the 14-image attachment limit around page 12-14 when including all generated pages in history
Attempt 2: Context Caching
-
Attempted to use context caching to store character references and style
-
Problem: Context caching is not available for
gemini-3-pro-image-previewmodel (only text models )
My Questions
-
How does AI Studio maintain visual context across 20+ image generations? Is it using a different API endpoint or method that’s not documented in the public API docs?
-
Is there a way to use the Chat/multi-turn functionality that maintains visual memory without hitting attachment limits?
-
Should I be using a different model or approach for this use case? The goal is: upload context once → generate 20-30 sequential images with consistent characters/style.
What I Need
A method to replicate AI Studio’s behavior programmatically:
-
Provide full story context + character references once
-
Generate pages 1-30 sequentially via API calls
-
Each page maintains visual consistency with previous pages
-
No need to re-upload references or regenerated pages with each request
Any guidance, code examples, or documentation pointers would be incredibly helpful!