Very specific issue, but I do not get this error when:
- N=1
- There is no JSON content in the prompt.
Is it just me? Anyone able to reproduce?
```python
import json

from vertexai.preview.generative_models import GenerationConfig, GenerativeModel

# Create a JSON structure large enough to trigger the error (~100K chars)
items = []
for i in range(1600):
    items.append({
        "id": i,
        "name": f"Item {i}",
    })
test_json = json.dumps({"items": items}, indent=4)
print(f"JSON size: {len(test_json):,} characters")

# Initialize model
model = GenerativeModel("gemini-2.0-flash")

# Test: JSON with candidate_count=2 (will fail)
prompt = f"Analyze this JSON data:\n\n{test_json}"

# The key setting that triggers the error is candidate_count > 1
response = model.generate_content(
    prompt,
    generation_config=GenerationConfig(candidate_count=2),
)
```
Hi @Mika_Myrseth
As you’re using Vertex AI, I was curious whether it’s related to the platform. Here’s the same example for Google AI using the newer google-genai package.
```python
import json
import os

from dotenv import load_dotenv
from google import genai
from google.genai import types

load_dotenv()  # take environment variables from .env

client = genai.Client(
    api_key=os.environ["GOOGLE_API_KEY"],
)

# Create a JSON structure large enough to trigger the error (~100K chars)
items = []
for i in range(1600):
    items.append({
        "id": i,
        "name": f"Item {i}",
    })
test_json = json.dumps({"items": items}, indent=4)
print(f"JSON size: {len(test_json):,} characters")

# Test: JSON with candidate_count=2 (will fail)
prompt = f"Analyze this JSON data:\n\n{test_json}"

# The key setting that triggers the error is candidate_count > 1
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=prompt,
    config=types.GenerateContentConfig(
        candidate_count=2,
    ),
)
print(response.text)
```
It’s giving me the same HTTP 400 error when candidate_count is > 1:
```
google.genai.errors.ClientError: 400 INVALID_ARGUMENT. {'error': {'code': 400, 'message': 'Request contains an invalid argument.', 'status': 'INVALID_ARGUMENT'}}
```
The token count (prompt ~ 46k) looks OK though.
However, using a different model works.
- gemini-2.0-flash Error
- gemini-2.0-flash-exp Error
- gemini-2.0-flash-thinking-exp-01-21 OK
- gemini-2.0-flash-thinking-exp-1219 OK
- gemini-exp-1206 Timeout and OK
- gemini-2.0-pro-exp-02-05 OK
- gemini-2.0-flash-lite Error
OK results come with the following note:

```
there are 2 candidates, returning text from the first candidate.Access response.candidates directly to get text from other candidates.
```
Seems that it is model-related.
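Until that’s sorted out, one possible workaround (just a sketch, the helper and its names are mine) is to ask for multiple candidates and fall back to a single candidate if the request is rejected:

```python
def generate_with_fallback(call, primary_count=2, fallback_count=1,
                           retryable=(Exception,)):
    """Try call(candidate_count) with the primary count; if the request
    raises a retryable error, retry once with a single candidate."""
    try:
        return call(primary_count)
    except retryable:
        return call(fallback_count)


# Usage against the google-genai client would look roughly like:
# result = generate_with_fallback(
#     lambda n: client.models.generate_content(
#         model="gemini-2.0-flash",
#         contents=prompt,
#         config=types.GenerateContentConfig(candidate_count=n),
#     ),
#     retryable=(genai.errors.ClientError,),
# )
```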
Cheers
Very interesting, thanks for testing!