How do I correctly structure the request object when using first and last frames and reference images with the Veo 3.1 endpoint?

I need a raw request example for creating a video. I couldn’t find a description of the request fields in the documentation.

Could you please share the documentation or a working payload example for this Veo 3.1 API? Any help would be greatly appreciated.

I'd also like to ask the developers to add a description of the request fields to the documentation. Ideally, each model would have a link to its request structure. The current description for Veo is far from user-friendly.

I solved the problem. Maybe someone will find this solution useful.

Image to video generation

{
  "instances": [
    {
      "prompt": "Panning wide shot of a calico kitten sleeping in the sunshine",
      "image": {
        "bytesBase64Encoded": "image",
        "mimeType": "image/png"
      }
    }
  ],
  "parameters": {
    "aspectRatio": "16:9",
    "resolution": "720p",
    "durationSeconds": 4,
    "sampleCount": 1
  }
}
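
The "image" value above is a placeholder for the Base64-encoded file contents. Here is a minimal Python sketch of building that payload, assuming only the field names shown in the JSON above. The helper names `encode_image` and `build_image_to_video_payload` are made up for illustration, and actually sending the request (URL, authentication) is left out because it depends on your setup.

```python
import base64
import json


def encode_image(data: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes in the {bytesBase64Encoded, mimeType} object."""
    return {
        "bytesBase64Encoded": base64.b64encode(data).decode("ascii"),
        "mimeType": mime_type,
    }


def build_image_to_video_payload(prompt: str, image_bytes: bytes,
                                 duration_seconds: int = 4) -> dict:
    """Hypothetical helper: mirrors the image-to-video JSON above."""
    return {
        "instances": [{"prompt": prompt, "image": encode_image(image_bytes)}],
        "parameters": {
            "aspectRatio": "16:9",
            "resolution": "720p",
            "durationSeconds": duration_seconds,
            "sampleCount": 1,
        },
    }


if __name__ == "__main__":
    # Stand-in bytes; in practice use open("your_image.png", "rb").read()
    payload = build_image_to_video_payload(
        "Panning wide shot of a calico kitten sleeping in the sunshine",
        b"stand-in image bytes",
    )
    print(json.dumps(payload, indent=2))
```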

Using first and last frames

Working request:

{
  "instances": [
    {
      "prompt": "A cinematic, haunting video. A ghostly woman with long white hair and a flowing dress swings gently on a rope swing beneath a massive, gnarled tree in a foggy, moonlit clearing. The fog thickens and swirls around her, and she slowly fades away, vanishing completely. The empty swing is left swaying rhythmically on its own in the eerie silence.",
      "image": {
        "bytesBase64Encoded": "first_image",
        "mimeType": "image/png"
      },
      "lastFrame": {
        "bytesBase64Encoded": "last_image",
        "mimeType": "image/png"
      }
    }
  ],
  "parameters": {
    "aspectRatio": "16:9",
    "resolution": "720p",
    "durationSeconds": 8,
    "sampleCount": 1
  }
}
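
The same body can be assembled in Python. This is only a sketch that copies the structure of the JSON above: `inline_image` and `build_first_last_frame_payload` are hypothetical helper names, and the parameter values are simply the ones from my request, not a statement of what the API requires.

```python
import base64


def inline_image(data: bytes, mime_type: str = "image/png") -> dict:
    """Base64-encode raw image bytes into the inline image object."""
    return {
        "bytesBase64Encoded": base64.b64encode(data).decode("ascii"),
        "mimeType": mime_type,
    }


def build_first_last_frame_payload(prompt: str, first_frame: bytes,
                                   last_frame: bytes) -> dict:
    """Hypothetical helper: first frame goes in "image", last in "lastFrame"."""
    return {
        "instances": [
            {
                "prompt": prompt,
                "image": inline_image(first_frame),
                "lastFrame": inline_image(last_frame),
            }
        ],
        "parameters": {
            "aspectRatio": "16:9",
            "resolution": "720p",
            "durationSeconds": 8,
            "sampleCount": 1,
        },
    }
```

The first frame reuses the same "image" key as plain image-to-video generation; only "lastFrame" is added.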

Using reference images

Working request:

{
  "instances": [
    {
      "prompt": "The video opens with a medium, eye-level shot of a beautiful woman with dark hair and warm brown eyes. She wears a magnificent, high-fashion flamingo dress with layers of pink and fuchsia feathers, complemented by whimsical pink, heart-shaped sunglasses. She walks with serene confidence through the crystal-clear, shallow turquoise water of a sun-drenched lagoon. The camera slowly pulls back to a medium-wide shot, revealing the breathtaking scene as the dress's long train glides and floats gracefully on the water's surface behind her. The cinematic, dreamlike atmosphere is enhanced by the vibrant colors of the dress against the serene, minimalist landscape, capturing a moment of pure elegance and high-fashion fantasy.",
      "referenceImages": [
        {
          "image": {
            "bytesBase64Encoded": "dress_image",
            "mimeType": "image/png"
          },
          "referenceType": "asset"
        },
        {
          "image": {
            "bytesBase64Encoded": "glasses_image",
            "mimeType": "image/png"
          },
          "referenceType": "asset"
        },
        {
          "image": {
            "bytesBase64Encoded": "woman_image",
            "mimeType": "image/png"
          },
          "referenceType": "asset"
        }
      ]
    }
  ],
  "parameters": {
    "aspectRatio": "16:9",
    "resolution": "720p",
    "durationSeconds": 8,
    "sampleCount": 1
  }
}
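
A rough Python equivalent for building the reference-image body from a list of images, again assuming only the fields shown in the JSON above. `build_reference_images_payload` is a hypothetical name, and "asset" is just the `referenceType` value used in my request; I have not checked which other values are accepted.

```python
import base64


def build_reference_images_payload(prompt: str, images: list,
                                   mime_type: str = "image/png") -> dict:
    """Hypothetical helper: one referenceImages entry per raw image (bytes)."""
    reference_images = [
        {
            "image": {
                "bytesBase64Encoded": base64.b64encode(data).decode("ascii"),
                "mimeType": mime_type,
            },
            # Value copied from the request above (assumption: same for all)
            "referenceType": "asset",
        }
        for data in images
    ]
    return {
        "instances": [{"prompt": prompt, "referenceImages": reference_images}],
        "parameters": {
            "aspectRatio": "16:9",
            "resolution": "720p",
            "durationSeconds": 8,
            "sampleCount": 1,
        },
    }
```

Note that in this variant the instance has no top-level "image" key; the images are passed only through "referenceImages".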

But I get this error: "Your prompt conflicted with our safety policies, so the video was not created. Please modify your request and resubmit; you have not been charged."

It’s funny that you can’t even use the example as is.