Hello! I need REST API for image to video generation in IOS application. But in doc it’s not mentioned. Only text to video REST api is available any other way how we could integrated image to video using veo model.
Hey! You can pass image references through the API, here are the docs: https://ai.google.dev/gemini-api/docs/video?example=dialogue#reference-images
Is this what you are looking for?
Yes. But REST Api only supports text to video. Not supports image to video. In doc also it’s not mentioned.
Wouldn’t this snippet from the docs make the trick? Here you would be sending three images, and generating a video based on them. You can modify it to only use one if that’s what you want
# Note: This script uses jq to parse the JSON response.
# It assumes dress_image_base64, glasses_image_base64, and woman_image_base64
# contain base64-encoded image data.
# GEMINI API Base URL
BASE_URL="https://generativelanguage.googleapis.com/v1beta"
# Send request to generate video and capture the operation name into a variable.
operation_name=$(curl -s "${BASE_URL}/models/veo-3.1-generate-preview:predictLongRunning" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-X "POST" \
-d '{
"instances": [{
"prompt": "The video opens with a medium, eye-level shot of a beautiful woman with dark hair and warm brown eyes. She wears a magnificent, high-fashion flamingo dress with layers of pink and fuchsia feathers, complemented by whimsical pink, heart-shaped sunglasses. She walks with serene confidence through the crystal-clear, shallow turquoise water of a sun-drenched lagoon. The camera slowly pulls back to a medium-wide shot, revealing the breathtaking scene as the dress'\''s long train glides and floats gracefully on the water'\''s surface behind her. The cinematic, dreamlike atmosphere is enhanced by the vibrant colors of the dress against the serene, minimalist landscape, capturing a moment of pure elegance and high-fashion fantasy."
}],
"parameters": {
"referenceImages": [
{
"image": {"inlineData": {"mimeType": "image/png", "data": "'"$dress_image_base64"'"}},
"referenceType": "asset"
},
{
"image": {"inlineData": {"mimeType": "image/png", "data": "'"$glasses_image_base64"'"}},
"referenceType": "asset"
},
{
"image": {"inlineData": {"mimeType": "image/png", "data": "'"$woman_image_base64"'"}},
"referenceType": "asset"
}
]
}
}' | jq -r .name)
# Poll the operation status until the video is ready
while true; do
# Get the full JSON status and store it in a variable.
status_response=$(curl -s -H "x-goog-api-key: $GEMINI_API_KEY" "${BASE_URL}/${operation_name}")
# Check the "done" field from the JSON stored in the variable.
is_done=$(echo "${status_response}" | jq .done)
if [ "${is_done}" = "true" ]; then
# Extract the download URI from the final response.
video_uri=$(echo "${status_response}" | jq -r '.response.generateVideoResponse.generatedSamples[0].video.uri')
echo "Downloading video from: ${video_uri}"
# Download the video using the URI and API key and follow redirects.
curl -L -o veo3.1_with_reference_images.mp4 -H "x-goog-api-key: $GEMINI_API_KEY" "${video_uri}"
break
fi
# Wait for 10 seconds before checking again.
sleep 10
done