Critical: gemini-3-pro-image-preview returning 400 INVALID_ARGUMENT and intermittent text-only responses

Hi Gemini API Team,

We are running a production image-restyling SaaS and are seeing critical reliability issues with the gemini-3-pro-image-preview model. Because the primary API calls are failing, our system is currently forced into a 6-step fallback chain.

1. 100% Failure Rate (HTTP 400) via Direct API

Every request to the direct generativelanguage endpoint fails when it includes an image, even though text-only calls with the same key succeed.

  • Endpoint: https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent

  • Payload: contents[0].parts = [{text: ~19KB}, {inlineData: {mimeType: "image/png", data: ~1-2MB}}]

  • Result: 400 INVALID_ARGUMENT on 20 of our 20 most recent attempts.

  • Observation: The same payload succeeds ~75% of the time when routed through a third-party gateway (OpenAI-compatible proxy).

  • Question: Is there a strict (undocumented) token or byte limit for the combined text+image payload on this specific preview model?
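For reproduction, here is a minimal sketch of how we build the failing request body (field names follow the public generateContent REST schema; the prompt text and image bytes below are stand-ins for our real ~19KB / ~1-2MB payloads):

```python
import base64
import json

API_URL = ("https://generativelanguage.googleapis.com/v1beta/"
           "models/gemini-3-pro-image-preview:generateContent")

def build_body(prompt_text: str, png_bytes: bytes) -> dict:
    """Build a generateContent body with one text part and one
    inline PNG part, matching the payload described above."""
    return {
        "contents": [{
            "parts": [
                {"text": prompt_text},
                {"inlineData": {
                    "mimeType": "image/png",
                    # The REST API expects base64-encoded image bytes.
                    "data": base64.b64encode(png_bytes).decode("ascii"),
                }},
            ],
        }],
    }

# Base64 inflates the image by ~4/3, so a ~1.5MB PNG becomes ~2MB of
# JSON on the wire -- relevant if there is an undocumented byte limit.
body = build_body("restyle this image ...", b"\x89PNG" + b"\x00" * 1024)
wire_size = len(json.dumps(body).encode("utf-8"))
```

If the limit is on the encoded JSON rather than the raw image, that 4/3 inflation alone could push us over it.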

2. Intermittent “Text-only” Responses (HTTP 200, but no image)

When the model does respond via the gateway, it frequently fails to produce an image modality.

  • Behavior: HTTP 200 OK, but the response contains only text reasoning (describing what it would have generated).

  • Frequency: ~25% of successful calls.

  • Question: Is there a parameter to force image output? We’ve tested responseModalities in different orders (TEXT, IMAGE vs IMAGE, TEXT) without consistent results. Is there a required_modalities or similar flag?
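For context, this is how we classify a call as "text-only" (the candidates / content / parts / inlineData names follow the REST response schema; the generationConfig shown is what we have been sending, in both modality orders):

```python
def has_image_part(response_json: dict) -> bool:
    """Return True if any candidate part carries inline image data,
    i.e. the model actually emitted an IMAGE modality."""
    for candidate in response_json.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData") or part.get("inline_data")
            if inline and inline.get("mimeType", "").startswith("image/"):
                return True
    return False

# The config we send; neither modality ordering gives consistent results.
generation_config = {"responseModalities": ["IMAGE", "TEXT"]}

# The ~25% failure case: HTTP 200, but only a text part comes back.
text_only = {"candidates": [{"content": {"parts": [
    {"text": "I would generate an image showing ..."}]}}]}
with_image = {"candidates": [{"content": {"parts": [
    {"inlineData": {"mimeType": "image/png", "data": "..."}}]}}]}
```

Today we treat a text-only response as a retryable failure, which is what drives most of our fallback latency.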

3. Vertex AI Availability & Consistency

We use gemini-3.1-pro-preview successfully on Vertex AI for analysis, but image generation for gemini-3-pro-image-preview seems unavailable or broken on Vertex.

  • Question: Is this model officially supported for image generation on aiplatform.googleapis.com? If so, what is the exact payload structure?

  • Latency: We are seeing validation timeouts (~35s) on Vertex. Is there a recommended timeout setting for multimodal validation tasks?
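For reference, the Vertex request we are attempting simply mirrors the generativelanguage body; the project/location values below are placeholders, and the payload structure itself is our assumption (which is exactly what we would like confirmed):

```python
PROJECT = "your-project"   # placeholder
LOCATION = "us-central1"   # placeholder; actual region may differ
MODEL = "gemini-3-pro-image-preview"

VERTEX_URL = (
    f"https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT}"
    f"/locations/{LOCATION}/publishers/google/models/{MODEL}:generateContent"
)

# Assumed body, mirroring the generativelanguage schema.
vertex_body = {
    "contents": [{"role": "user", "parts": [
        {"text": "restyle this image ..."},
        {"inlineData": {"mimeType": "image/png", "data": "<base64>"}},
    ]}],
    "generationConfig": {"responseModalities": ["IMAGE", "TEXT"]},
}
```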

4. Specific Technical Requests

To stabilize our production environment, we need clarification on:

  • Token Budget: How is a ~2MB inline image counted against the token limit for this model?

  • Temperature: Is there a recommended temperature (e.g., 0.8) to maximize image generation success vs. text reasoning?

  • Rate Limits: What are the specific RPM/TPM limits for the image-preview model?

  • Error Verbosity: Can you provide more detail for INVALID_ARGUMENT? The current error body is non-actionable.
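On the error-verbosity point, this sketch shows everything we can currently extract from the 400 body (the sample body is representative of what we receive, not a verbatim capture):

```python
import json

def describe_error(raw_body: bytes) -> str:
    """Pull whatever actionable detail exists out of an error body.
    Structured google.rpc detail messages, when present, live under
    error.details -- but for us that list is absent or empty."""
    try:
        err = json.loads(raw_body).get("error", {})
    except json.JSONDecodeError:
        return raw_body.decode("utf-8", errors="replace")
    details = err.get("details", [])
    return (f"{err.get('code')} {err.get('status')}: {err.get('message')}"
            + (f" | details: {details}" if details else " | no details"))

# Representative of the non-actionable body we get back today:
sample = (b'{"error": {"code": 400, "status": "INVALID_ARGUMENT", '
          b'"message": "Request contains an invalid argument."}}')
summary = describe_error(sample)
```

Even one structured detail entry (e.g. which field or limit was violated) would let us fix payloads instead of blind-retrying.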

Impact: Our current workaround takes 50-60s due to multiple fallback steps. We need a reliable “Image In → Image Out” flow to maintain our production SLA.

We can provide Request IDs and full JSON payloads for your internal debugging.

Best regards,

Nichlas

We are also hitting this issue with gemini-3.1-pro-preview.