I need urgent clarification on a docs/API mismatch for Veo 3.1.

**The Problem:**

Your official documentation at https://ai.google.dev/gemini-api/docs/video#using-reference-images shows reference_images as a standard, publicly available feature with complete working examples. However, when I use the EXACT code from your docs, I get:

```
400 INVALID_ARGUMENT
"Your use case is currently not supported. Please refer to Gemini API documentation for current model offering."
```

**My Verification:**
- I have access to veo-3.1-generate-preview (confirmed via models.list())
- Basic text-to-video works perfectly
- I’m using YOUR exact code pattern from the docs
- My code creates proper VideoGenerationReferenceImage objects
- ✗ ANY request with reference_images gets rejected

**Code I’m Using (from YOUR docs):**

```python
from google import genai

# client, ref_img, and source are assumed to be defined earlier
ref_img_obj = genai.types.VideoGenerationReferenceImage(
    image=ref_img,
    reference_type="asset",
)
config = genai.types.GenerateVideosConfig(
    duration_seconds=8,
    resolution="1080p",
    aspect_ratio="9:16",
    reference_images=[ref_img_obj],  # ← THIS CAUSES ERROR
)
operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",
    source=source,
    config=config,
)
```

**Documentation Shows:**

- Parameters table lists `referenceImages` for Veo 3.1
- Complete working examples with NO access restrictions mentioned
- Section “Using reference images” shows it as a public feature

**But API Rejects It:**

- Error code: 400 INVALID_ARGUMENT
- Message: “use case not supported”

**Questions:**

1. Is reference_images actually available or is the documentation wrong?
2. If it requires special access, why isn’t this documented?
3. How do I request access if it’s restricted?
4. When will docs match API reality?

This is blocking production development. My use case: UGC video generation where we need product photos to appear in generated videos (exactly what reference_images is designed for).

**Details:**

- API Key: [last 4 digits: XXXX]
- SDK: google-genai 1.46.0
- Region: Croatia (HR)
- Error occurs even with 1 reference image
- Error occurs with 720p and 1080p
- Error occurs with all durations (4s, 6s, 8s)

Please either:

- A) Fix the documentation to mention access restrictions
- B) Enable reference_images for my account as shown in docs
- C) Explain what’s actually required to use this documented feature

Thank you.
Hi @Ivan_Duvnjak,

Thank you for bringing this to our attention.
Apologies for the delayed response. Could you please confirm if you are still facing the INVALID_ARGUMENT error?
I’m encountering the same error here. After a few successful requests, I’m now receiving a 400 error indicating that the model is not available for my use case: `{'error': {'code': 400, 'message': 'Your use case is currently not supported. Please refer to Gemini API documentation for current model offering.', 'status': 'INVALID_ARGUMENT'}}`
Hello! It looks like you might be trying to use the 9:16 aspect ratio. `referenceImages` only supports 16:9 at the moment. I will update the API parameters table to make this clearer. We’re working on getting 9:16 supported very soon.
Please let me know if you experience any other issues with your API call.
Hi @Alisa_Fortin,
Is there already a specific timeline for when “very soon” will be? Our current use case is blocked by this missing feature.
Regards.
Nils
I was running into a lot of issues using images with REST API calls, and the documentation wasn’t very helpful. I think I got it all worked out though, and created an overview of how it’s working for me with some code snippets for guidance. It includes things like the errors you might be seeing and how we fixed them. Hopefully it’s helpful!
Developer Guide
Google Veo 3.1 API: Complete Guide
A comprehensive guide to using the Google Veo 3.1 video generation API via the Gemini API endpoint (v1beta). This document covers correct request formats for all video generation modes after extensive trial and error.
Why This Guide Exists
The Veo video generation endpoint (`predictLongRunning`) is available at generativelanguage.googleapis.com but uses the Vertex AI request format, not the standard Gemini format. This causes significant confusion.
Overview
Different Google APIs use different formats:
- Gemini API (`generateContent`) - uses `inlineData` format
- Vertex AI (`predictLongRunning`) - uses `bytesBase64Encoded` format
- Files API - uses `fileUri` format

Key insight: Use `bytesBase64Encoded` with `mimeType` for all image data.
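
As a minimal sketch of that key insight, the Python helper below reads an image file and returns it in the `bytesBase64Encoded` + `mimeType` shape. The name `encode_image` is hypothetical and not part of any SDK, and the MIME type is guessed from the file extension, which is an assumption that may not hold for unusual files.

```python
import base64
import mimetypes
from pathlib import Path


def encode_image(path: str) -> dict:
    """Return an image as {"mimeType", "bytesBase64Encoded"} (hypothetical helper)."""
    # Assumption: the file extension correctly reflects the image type.
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type from {path!r}")
    raw = Path(path).read_bytes()
    return {
        "mimeType": mime_type,
        "bytesBase64Encoded": base64.b64encode(raw).decode("ascii"),
    }
```

The returned dict can be placed wherever the guide expects an image object, for example as the value of `instances[0].image`.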
Model IDs
The Gemini API and Vertex AI use different model ID suffixes:
| Model | Gemini API | Vertex AI |
|---|---|---|
| Veo 3.1 Standard | `veo-3.1-generate-preview` | `veo-3.1-generate-001` |
| Veo 3.1 Fast | `veo-3.1-fast-generate-preview` | `veo-3.1-fast-generate-001` |
| Veo 3.0 Standard | `veo-3.0-generate-001` | `veo-3.0-generate-001` |
| Veo 3.0 Fast | `veo-3.0-fast-generate-001` | `veo-3.0-fast-generate-001` |
Using the Veo 3.1 `-001` model IDs with the Gemini API returns 404 errors.
Common Errors
Error 1: Model not found (404)
```json
{
  "error": {
    "code": 404,
    "message": "models/veo-3.1-generate-001 is not found"
  }
}
```
Cause: Using Vertex AI model IDs (-001) with Gemini API. Use -preview suffix instead.
Error 2: inlineData not supported (400)
```json
{
  "error": {
    "code": 400,
    "message": "`inlineData` isn't supported by this model."
  }
}
```
Cause: Using Gemini’s `inlineData` format with a `data` field. Use **bytesBase64Encoded** instead.
Error 3: fileUri not supported (400)
```json
{
  "error": {
    "code": 400,
    "message": "`fileUri` isn't supported by this model."
  }
}
```
Cause: Uploading to Files API and using fileUri reference. Use inline base64 instead.
Error 4: Unknown fields (400)
```json
{
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unknown name \"image\": Cannot find field."
  }
}
```
Cause: Using flat request body instead of instances + parameters structure.
Error 5: Invalid lastFrame (400)
```json
{
  "error": {
    "code": 400,
    "message": "Invalid value at 'parameters.lastFrame'"
  }
}
```
Cause: Placing lastFrame in parameters instead of instances[0], or using nested image wrapper.
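
The five errors above can be turned into a small lookup, sketched here in Python. The function name `diagnose` and the substring matching are assumptions for illustration only; the API's actual wording may change, so treat this as a debugging aid rather than a reliable classifier.

```python
# Substrings of the error messages listed above, paired with the fix described
# for each. The matching is a heuristic; real messages may be worded differently.
KNOWN_ERRORS = [
    ("is not found", "Use the -preview model ID with the Gemini API."),
    ("`inlineData` isn't supported", "Send images as bytesBase64Encoded + mimeType."),
    ("`fileUri` isn't supported", "Send inline base64 instead of a Files API reference."),
    ("Unknown name", "Wrap the body in instances + parameters."),
    ("parameters.lastFrame", "Move lastFrame into instances[0] with no image wrapper."),
]


def diagnose(message: str) -> str:
    """Return the suggested fix for a known error message, or a fallback."""
    for needle, fix in KNOWN_ERRORS:
        if needle in message:
            return fix
    return "Unrecognized error; check the request against the format rules."
```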
API Endpoint
```
POST https://generativelanguage.googleapis.com/v1beta/models/{model}:predictLongRunning
```

Headers:

```
x-goog-api-key: YOUR_API_KEY
Content-Type: application/json
```
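
A minimal Python sketch of that endpoint and header pair, using only the standard library. `build_request` is a hypothetical helper name; it only constructs the request object and does not send it, and the shape of the response is not covered in this guide.

```python
import json
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_request(model: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a predictLongRunning POST request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/models/{model}:predictLongRunning",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-goog-api-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would issue the call; any error handling or polling of the returned operation is left out here.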
Request Structure
All requests use the instances + parameters structure:
```json
{
  "instances": [
    {
      "prompt": "...",
      // image data goes here
    }
  ],
  "parameters": {
    "aspectRatio": "16:9",
    "resolution": "720p",
    "durationSeconds": 8,
    "sampleCount": 1
  }
}
```
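
One way to keep every request in this shape is to build it from a single function. The sketch below is an illustration only: `build_payload` and its keyword names are hypothetical, and the defaults simply mirror the values used in this guide.

```python
from typing import Optional


def build_payload(
    prompt: str,
    *,
    image: Optional[dict] = None,
    aspect_ratio: str = "16:9",
    resolution: str = "720p",
    duration_seconds: int = 8,
    sample_count: int = 1,
) -> dict:
    """Wrap a prompt (and optional first-frame image) in instances + parameters."""
    instance = {"prompt": prompt}
    if image is not None:
        # Image data lives inside the instance, never at the top level.
        instance["image"] = image
    return {
        "instances": [instance],
        "parameters": {
            "aspectRatio": aspect_ratio,
            "resolution": resolution,
            "durationSeconds": duration_seconds,
            "sampleCount": sample_count,
        },
    }
```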
Video Generation Modes
1. Text-to-Video (No Images)
```json
{
  "instances": [
    {
      "prompt": "A serene mountain landscape at golden hour with clouds drifting slowly"
    }
  ],
  "parameters": {
    "aspectRatio": "16:9",
    "resolution": "720p",
    "durationSeconds": 8,
    "sampleCount": 1
  }
}
```
2. First Frame Only (Image-to-Video)
```json
{
  "instances": [
    {
      "prompt": "Camera slowly pans across the scene as light shifts",
      "image": {
        "mimeType": "image/jpeg",
        "bytesBase64Encoded": "/9j/4AAQSkZJRgABAQAA..."
      }
    }
  ],
  "parameters": {
    "aspectRatio": "16:9",
    "resolution": "720p",
    "durationSeconds": 8,
    "sampleCount": 1
  }
}
```
3. First + Last Frame Interpolation
Critical: `lastFrame` must be in `instances[0]`, NOT in `parameters`. No nested `image` wrapper.

```json
{
  "instances": [
    {
      "prompt": "Smooth cinematic transition between the two scenes",
      "image": {
        "mimeType": "image/jpeg",
        "bytesBase64Encoded": "/9j/4AAQSkZJRgABAQAA..."
      },
      "lastFrame": {
        "mimeType": "image/jpeg",
        "bytesBase64Encoded": "/9j/4AAQSkZJRgABAQAA..."
      }
    }
  ],
  "parameters": {
    "aspectRatio": "16:9",
    "resolution": "720p",
    "durationSeconds": 8,
    "sampleCount": 1
  }
}
```
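
The placement rule for `lastFrame` can be enforced in code before a request is sent. This Python sketch uses a hypothetical helper, `interpolation_instance`, which rejects the nested-wrapper and `inlineData` shapes described as wrong in this guide and returns the instance with both frames at the same level.

```python
def interpolation_instance(prompt: str, first: dict, last: dict) -> dict:
    """Build instances[0] for first + last frame interpolation."""
    for name, img in (("image", first), ("lastFrame", last)):
        # Both frames must be flat objects, not wrapped in image/inlineData.
        if "image" in img or "inlineData" in img:
            raise ValueError(
                f"{name} must be a flat {{mimeType, bytesBase64Encoded}} object"
            )
    # lastFrame sits next to image inside the instance, not in parameters.
    return {"prompt": prompt, "image": first, "lastFrame": last}
```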
4. Reference Images (Style/Content Guidance)
Reference images guide the style and content of generated video. Only supported on Veo 3.1.
```json
{
  "instances": [
    {
      "prompt": "A woman in a red dress walking through a garden",
      "referenceImages": [
        {
          "referenceType": "asset",
          "image": {
            "bytesBase64Encoded": "/9j/4AAQSkZJRgABAQAA...",
            "mimeType": "image/jpeg"
          }
        }
      ]
    }
  ],
  "parameters": {
    "aspectRatio": "16:9",
    "resolution": "720p",
    "durationSeconds": 8,
    "sampleCount": 1
  }
}
```
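
A short Python sketch of how the `referenceImages` array might be assembled from already-encoded images. The helper name `reference_images_instance` is hypothetical; the limit of three images comes from the capability table in this guide, and the lowercase `"asset"` type follows the example above.

```python
def reference_images_instance(prompt: str, images: list) -> dict:
    """Build instances[0] carrying one to three reference images."""
    if not 1 <= len(images) <= 3:
        raise ValueError("expected between 1 and 3 reference images")
    return {
        "prompt": prompt,
        "referenceImages": [
            # Each entry wraps the flat image object and uses lowercase "asset".
            {"referenceType": "asset", "image": img}
            for img in images
        ],
    }
```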
5. Video Extension
Extend an existing video by providing the video URI from a previous generation.
Extension Rules:
- Each extension adds 7 seconds to the video
- Can chain up to 20 times (max ~148 seconds total)
- Videos stored on server for 2 days - must extend within this window
- `aspectRatio` and `resolution` must match the original video
```json
{
  "instances": [
    {
      "prompt": "The action continues as the character walks forward",
      "video": {
        "uri": "https://generativelanguage.googleapis.com/v1beta/..."
      }
    }
  ],
  "parameters": {
    "aspectRatio": "16:9",
    "resolution": "720p",
    "sampleCount": 1
  }
}
```
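
The ~148-second ceiling in the extension rules follows from an 8-second base clip plus 20 extensions of 7 seconds each (8 + 20 × 7 = 148). A small Python sketch of that arithmetic, with hypothetical names:

```python
EXTENSION_SECONDS = 7   # each extension adds 7 seconds
MAX_EXTENSIONS = 20     # chain limit stated in the extension rules


def total_duration(base_seconds: int, extensions: int) -> int:
    """Return the total clip length after a number of chained extensions."""
    if not 0 <= extensions <= MAX_EXTENSIONS:
        raise ValueError(f"extensions must be between 0 and {MAX_EXTENSIONS}")
    return base_seconds + EXTENSION_SECONDS * extensions
```

For example, `total_duration(8, 20)` gives 148, and `total_duration(8, 1)` gives 15.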
Image Placement Reference
| Image Type | Location | Structure |
|---|---|---|
| First frame | `instances[0].image` | `{ mimeType, bytesBase64Encoded }` |
| Last frame | `instances[0].lastFrame` | `{ mimeType, bytesBase64Encoded }` |
| Reference images | `instances[0].referenceImages[]` | `[{ referenceType: "asset", image: {...} }]` |
| Extension video | `instances[0].video` | `{ uri }` |
Key Points & Gotchas
1. Use bytesBase64Encoded, NOT inlineData
Wrong (Gemini format):

```json
{
  "image": {
    "inlineData": {
      "mimeType": "image/jpeg",
      "data": "base64..."
    }
  }
}
```

Correct (Vertex AI format):

```json
{
  "image": {
    "bytesBase64Encoded": "base64...",
    "mimeType": "image/jpeg"
  }
}
```
2. Use lowercase "asset" for referenceType
The API is case-sensitive:

- `"referenceType": "ASSET"` - Wrong
- `"referenceType": "asset"` - Correct
3. lastFrame has NO nested image wrapper
Wrong:

```json
{
  "lastFrame": {
    "image": {
      "mimeType": "image/jpeg",
      "bytesBase64Encoded": "..."
    }
  }
}
```

Correct:

```json
{
  "lastFrame": {
    "mimeType": "image/jpeg",
    "bytesBase64Encoded": "..."
  }
}
```
4. Additional Tips
- Use `16:9` aspect ratio for reference images until you confirm everything works
- Keep images under 1MB each - large payloads can cause gateway errors
- Use `instances` + `parameters` structure, NOT flat request body
Format Comparison
| Format | Field | Structure | Supported by Veo? |
|---|---|---|---|
| Gemini | `inlineData` | `{ data, mimeType }` | NO |
| Files API | `fileUri` | `{ fileUri }` | NO |
| Vertex AI | `bytesBase64Encoded` | `{ bytesBase64Encoded, mimeType }` | YES |
Model Capabilities
| Model | First Frame | Last Frame | Reference Images | Video Extension | Max Duration |
|---|---|---|---|---|---|
| Veo 3.1 Standard | Yes | Yes | Yes (up to 3) | Yes | 8s |
| Veo 3.1 Fast | Yes | Yes | No | Yes | 8s |
| Veo 3.0 Standard | Yes | No | No | Yes | 8s |
| Veo 3.0 Fast | Yes | No | No | Yes | 8s |
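
The capability table can also be kept as data, so a client can check a feature before building a request. This Python sketch transcribes the table using the Gemini API model IDs listed earlier in this guide; the `CAPABILITIES` dict, its feature keys, and `supports` are hypothetical names, and the values reflect the table as written rather than any official source.

```python
# Feature flags transcribed from the capability table (Gemini API model IDs).
CAPABILITIES = {
    "veo-3.1-generate-preview": {
        "first_frame": True, "last_frame": True,
        "reference_images": True, "extension": True,
    },
    "veo-3.1-fast-generate-preview": {
        "first_frame": True, "last_frame": True,
        "reference_images": False, "extension": True,
    },
    "veo-3.0-generate-001": {
        "first_frame": True, "last_frame": False,
        "reference_images": False, "extension": True,
    },
    "veo-3.0-fast-generate-001": {
        "first_frame": True, "last_frame": False,
        "reference_images": False, "extension": True,
    },
}


def supports(model: str, feature: str) -> bool:
    """Look up a feature flag; unknown models or features count as unsupported."""
    return CAPABILITIES.get(model, {}).get(feature, False)
```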
Summary
- Use `-preview` model IDs for Gemini API (`veo-3.1-generate-preview`)
- Use `bytesBase64Encoded` format for all images, not `inlineData`
- Wrap requests in `instances` + `parameters` structure
- Place `lastFrame` in instance level, not in parameters
- No nested `image` wrapper for lastFrame
- Use lowercase `"asset"` for reference image type
- For video extension, place video URI in `instances[0].video.uri`
wow - this was super super super helpful buddy. Thanks for sharing
Couple more points I observed:
- For both Veo 3.1 and Veo 3.1 Fast, `lastFrame` only works when `image` is present as well. `lastFrame` doesn’t work solo.
- For both Veo 3.1 and Veo 3.1 Fast, the pair `image` and `lastFrame` only works for the 8-second duration, not for 6 or 4 seconds.
- `referenceImages` only works for the 8-second duration, not for 6 or 4 seconds.
- `referenceImages` only works when both `image` and `lastFrame` are absent.
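
These observations, together with the 16:9 limitation for `referenceImages` mentioned earlier in the thread, can be sketched as a pre-flight check in Python. The function `check_instance` is a hypothetical helper; the rules are user-reported behavior at the time of writing, not documented guarantees, so they may change.

```python
def check_instance(instance: dict, duration_seconds: int,
                   aspect_ratio: str = "16:9") -> list:
    """Return a list of rule violations; an empty list means no known conflict."""
    has_image = "image" in instance
    has_last = "lastFrame" in instance
    has_refs = bool(instance.get("referenceImages"))
    problems = []
    if has_last and not has_image:
        problems.append("lastFrame requires image")
    if has_image and has_last and duration_seconds != 8:
        problems.append("image + lastFrame requires an 8-second duration")
    if has_refs and duration_seconds != 8:
        problems.append("referenceImages requires an 8-second duration")
    if has_refs and (has_image or has_last):
        problems.append("referenceImages cannot be combined with image or lastFrame")
    if has_refs and aspect_ratio != "16:9":
        problems.append("referenceImages currently supports only 16:9")
    return problems
```

Running it on the request from the first post (reference images with `9:16`) would flag the aspect ratio before the API returns a 400.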