models.generateContent fails with 500

Hi, all!

In my tests models.generateContent returns a 500 error code 100% of the time. See the reproduction script below.

The script creates a cachedContent resource containing 9 MB of text (the SQLite3 C source), then asks the model a simple question about it. It uses only basic tools (jq, tar and curl) so that it is easy to run.

#!/usr/bin/env bash

pushd /tmp

curl https://sqlite.org/2024/sqlite-autoconf-3460000.tar.gz > sqlite-autoconf-3460000.tar.gz
tar xvfz sqlite-autoconf-3460000.tar.gz sqlite-autoconf-3460000/sqlite3.c

jq \
  --null-input \
  --rawfile sqlite3_source sqlite-autoconf-3460000/sqlite3.c \
  '{
    "contents": [
      {
        "parts": [
          {
            "text": $sqlite3_source
          }
        ],
        "role": "user"
      }
    ],
    "model": "models/gemini-1.5-flash-001"
  }' > cached_contents_payload.json

cached_content_name=$(
  curl \
    --request POST \
    --header "x-goog-api-key: $GEMINI_API_KEY" \
    --json @cached_contents_payload.json \
    https://generativelanguage.googleapis.com/v1beta/cachedContents \
  | jq '.name' --raw-output
)

jq \
  --null-input \
  --arg cached_content_name "$cached_content_name" \
  '{
    "cachedContent": $cached_content_name,
    "contents": [
      {
        "parts": [
          {
            "text": "Source code for what software is this?"
          }
        ],
        "role": "user"
      }
    ]
  }' > generate_content_payload.json

curl \
  --request POST \
  --header "x-goog-api-key: $GEMINI_API_KEY" \
  --json @generate_content_payload.json \
  https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-001:generateContent

# Clean-up
#
curl \
  --verbose \
  --request DELETE \
  --header "x-goog-api-key: $GEMINI_API_KEY" \
  "https://generativelanguage.googleapis.com/v1beta/$cached_content_name"

rm -r sqlite-autoconf-3460000*

popd

Run it like this:

GEMINI_API_KEY=abcd1234... ./problem.bash

The output is this:

{
  "error": {
    "code": 500,
    "message": "An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting",
    "status": "INTERNAL"
  }
}
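
Note that curl itself exits with 0 here, because without --fail it does not treat an HTTP 500 as a failure. One way to stop at the first failing step is to check each response body for an "error" object. This is only a minimal sketch: check_api_response is a hypothetical helper name, the message in the sample is shortened, and non-JSON bodies are not handled.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints an error summary to stderr and returns 1
# when the response body carries an "error" object, returns 0 otherwise.
#   $1: JSON response body from the API
check_api_response() {
  local response=$1
  if jq --exit-status 'type == "object" and has("error")' \
      <<<"$response" > /dev/null; then
    jq --raw-output \
      '"API error \(.error.code) (\(.error.status)): \(.error.message)"' \
      <<<"$response" >&2
    return 1
  fi
}

# Sample body in the same shape as the output above (message shortened).
response='{"error": {"code": 500, "message": "An internal error has occurred.", "status": "INTERNAL"}}'
check_api_response "$response" || echo "request failed"
```

In the script, each curl call could store its body in a variable and pass it through this check before the next step runs.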

Any idea what I am doing wrong? I also tried a different model, gemini-1.5-pro-001, only to get the same error.

This was a very very odd one.

I broke it down into several parts and noticed this was the output from adding to the cache:

{
  "name": "cachedContents/puc0tdild46n",
  "model": "models/gemini-1.5-flash-001",
  "createTime": "2024-07-21T17:39:29.083777Z",
  "updateTime": "2024-07-21T17:39:29.083777Z",
  "expireTime": "2024-07-21T18:39:28.525834835Z",
  "displayName": "",
  "usageMetadata": {
    "totalTokenCount": 3038202
  }
}

So the total token count is… 3M… if I’m reading it correctly.

It would not surprise me (though it would highly annoy me) if using a cache that is too large for the model produced a 500 error.
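
To make that check explicit, here is a minimal sketch that compares the totalTokenCount from the cache-creation response with a model's input limit. The function name cache_fits_model is hypothetical, and the 1048576 limit is my assumption for gemini-1.5-flash-001; the inputTokenLimit field returned by a GET on https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-001 should give the actual value.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds (exit 0) when the cache's token count is a
# number within the given limit, fails (exit 1) otherwise.
#   $1: JSON response from POST /v1beta/cachedContents
#   $2: input token limit of the target model
cache_fits_model() {
  local response=$1 limit=$2
  jq --exit-status --argjson limit "$limit" \
    '.usageMetadata.totalTokenCount as $n
     | ($n | type) == "number" and $n <= $limit' \
    <<<"$response" > /dev/null
}

# Response trimmed to the fields used here; values from the output above.
response='{"name": "cachedContents/puc0tdild46n", "usageMetadata": {"totalTokenCount": 3038202}}'

# Assumed limit for gemini-1.5-flash-001; confirm via the model's inputTokenLimit.
if cache_fits_model "$response" 1048576; then
  echo "cache fits"
else
  echo "cache exceeds the input token limit"
fi
```

Running the check right after the cachedContents call would flag an oversized cache before generateContent is ever called.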

If that’s the issue - Good Catch!

Oh my, thanks for noticing the token number, that would certainly be a plausible explanation for what’s happening!

I certainly saw the large token count show up, but for some reason did not connect the dots.
