I found an error in the Google AI Studio documentation for multimodal Gemini 1.5 models with images or video using curl

Hi,

I’m building an application with the AI Studio and Gemini APIs, and I found that the documentation for uploading images to the Google API has an error.

When I select “Get Code” in the upper right corner, set the language to curl, and add a sample image, a prompt, and a system prompt, it gives the following example code:

API_KEY="YOUR_API_KEY"

# TODO: Make the following files available on the local file system.
FILES=("image_animal1.jpeg")
MIME_TYPES=("image/jpeg")
for i in "${!FILES[@]}"; do
  NUM_BYTES=$(wc -c < "${FILES[$i]}")
  curl "https://generativelanguage.googleapis.com/upload/v1beta/files?key=${API_KEY}" \
    -H "X-Goog-Upload-Command: start, upload, finalize" \
    -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
    -H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPES[$i]}" \
    -H "Content-Type: application/json" \
    -d "{'file': {'display_name': '${FILES[$i]}'}}" \
    --data-binary "@${FILES[$i]}"
  # TODO: Read the file.uri from the response, store it as FILE_URI_${i}
done

curl \
  -X POST https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=${API_KEY} \
  -H 'Content-Type: application/json' \
  -d @<(echo '{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "fileData": {
            "fileUri": "${FILE_URI_0}",
            "mimeType": "image/jpeg"
          }
        }
      ]
    },
    {
      "role": "user",
      "parts": [
        {
          "text": "Analyze this image"
        }
      ]
    }
  ],
  "systemInstruction": {
    "role": "user",
    "parts": [
      {
        "text": "My system prompt here"
      }
    ]
  },
  "generationConfig": {
    "temperature": 1,
    "topK": 40,
    "topP": 0.95,
    "maxOutputTokens": 8192,
    "responseMimeType": "text/plain"
  }
}')

But this code doesn’t work. The upload step is still a TODO, just like the comment says. When I upload a JPEG image with it, the response reports application/json as the mimeType, which is wrong! The Gemini model only works if the returned mimeType is image/jpeg.
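For anyone hitting the same thing, here is a minimal sketch of how the bad upload could be caught before calling the model. It assumes the upload response is JSON with the stored type at file.mimeType and the URI at file.uri, and that jq is installed. check_upload_response is a hypothetical helper name of mine, not something AI Studio generates.

```shell
# Hypothetical helper: check_upload_response RESPONSE_JSON EXPECTED_MIME
# Assumes the response shape {"file": {"uri": ..., "mimeType": ...}} and needs jq.
# Prints the file URI on success; returns 1 if the stored mimeType differs.
check_upload_response() {
  local response="$1" expected="$2"
  local actual uri
  actual="$(printf '%s' "$response" | jq -r '.file.mimeType')"
  uri="$(printf '%s' "$response" | jq -r '.file.uri')"
  if [ "$actual" != "$expected" ]; then
    echo "upload stored mimeType '${actual}', expected '${expected}'" >&2
    return 1
  fi
  printf '%s\n' "$uri"
}

# Usage (RESPONSE holds the body returned by the upload curl call):
#   FILE_URI_0="$(check_upload_response "$RESPONSE" "image/jpeg")" || exit 1
```

With the generated example, this check would fail right away with application/json instead of letting the later generateContent call fail with a bare 400.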

With this code I could not upload any image to the Google APIs and use it with a Gemini model. When I tried to use the uploaded images, I got a plain 400 INVALID_ARGUMENT error with nothing more specific. I had to look at the code here to see how it does the image uploading: generative-ai-js/src/requests/request.ts at 2df2af03bb07dcda23b07af1a7135a8b461ae64e · google-gemini/generative-ai-js · GitHub

I found out that the request needs a multipart Content-Type header with a boundary. After I added that, my curl requests started uploading the images correctly.
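A rough sketch of what the multipart request can look like is below. The two-part body layout (JSON metadata first, then the raw file bytes) and the X-Goog-Upload-Protocol: multipart header are my reading of the JS SDK source, so treat them as assumptions rather than documented behavior. build_multipart_body and upload_file are hypothetical helper names, and the boundary string is arbitrary.

```shell
# Hypothetical helper: write a multipart/related body to OUT.
# Part 1 is the JSON metadata, part 2 is the raw file bytes.
# Note: the file name is inserted as-is, so it must not contain quotes or backslashes.
build_multipart_body() {
  local file="$1" mime="$2" boundary="$3" out="$4"
  {
    printf -- '--%s\r\n' "$boundary"
    printf 'Content-Type: application/json; charset=utf-8\r\n\r\n'
    printf '{"file": {"display_name": "%s"}}\r\n' "$(basename "$file")"
    printf -- '--%s\r\n' "$boundary"
    printf 'Content-Type: %s\r\n\r\n' "$mime"
    cat "$file"
    printf '\r\n--%s--' "$boundary"
  } > "$out"
}

# Hypothetical helper: upload one file and print the JSON response.
# Assumes API_KEY is set in the environment.
upload_file() {
  local file="$1" mime="$2"
  local boundary="gemini-upload-$$-${RANDOM}"
  local body
  body="$(mktemp)"
  build_multipart_body "$file" "$mime" "$boundary" "$body"
  curl -sS "https://generativelanguage.googleapis.com/upload/v1beta/files?key=${API_KEY:?set API_KEY first}" \
    -H "X-Goog-Upload-Protocol: multipart" \
    -H "Content-Type: multipart/related; boundary=${boundary}" \
    --data-binary "@${body}"
  rm -f "$body"
}

# Usage:
#   upload_file image_animal1.jpeg image/jpeg
```

The point is that the image MIME type travels on the second part of the body, while the outer Content-Type only declares the multipart boundary. That is different from the generated example, which sends Content-Type: application/json for the whole request.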

I hope you can update your API documentation and fix this error, so no one else has to spend time scratching their head over why they get a 400 status code from the Gemini HTTP API, like @afirstenberg and I did over at Error using image and a prompt.

Thank you for your time developing AI Studio and the Gemini API. I’m otherwise very pleased with them.

Best regards, Aleksi Lemmetyinen
