I’m using the Gemini 1.5 Pro EXP model through the API with the image shown below, and I’m encountering the same issue. The call does not raise an exception, but the response comes back with a finish reason of RECITATION and no content:
Response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=protos.GenerateContentResponse({
      "candidates": [
        {
          "finish_reason": "RECITATION",
          "index": 0
        }
      ],
      "usage_metadata": {
        "prompt_token_count": 407,
        "total_token_count": 407
      }
    }),
)
Here’s the code I’m using:
import google.generativeai as genai
import os
import PIL.Image
from dotenv import load_dotenv
from google.generativeai.types import HarmCategory, HarmBlockThreshold
load_dotenv()
# Configure the API key
genai.configure(api_key=os.environ["GEMINI_API_KEY4"])
# Define generation configuration
generation_config = {
    "temperature": 0,
    "top_p": 0.95,
    "top_k": 64,
    "max_output_tokens": 8192,
    "response_mime_type": "text/plain",
}
# Create a GenerativeModel instance
model = genai.GenerativeModel(
    model_name="gemini-1.5-pro-exp-0827",
    generation_config=generation_config,
    safety_settings={
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
        HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
    },
    system_instruction="""
Extract questions and their corresponding options (if applicable) from an image of a question paper, including any directions, instructions, and question numbers. The extracted content should be formatted in LaTeX code, preserving the original structure and layout of the question paper.
- **Questions**: Format each question with its number.
- **Options**: Use the enumerate environment for multiple-choice options, labeling each option as (a), (b), (c), etc.
- **Diagrams**: If a question includes a diagram, use the TikZ package to recreate the diagram in LaTeX. The TikZ code should be appropriately placed within the LaTeX document, ensuring the diagram aligns with the relevant question.
"""
)
# Path to the images folder
images_folder = "images"
image_files = [f for f in os.listdir(images_folder) if f.endswith('.png')]
# Process each image file
for image_file in image_files:
    image_path = os.path.join(images_folder, image_file)
    sample_image = PIL.Image.open(image_path)
    # Generate content based on the image
    response = model.generate_content([sample_image])
    # Print the generated LaTeX code or handle the response as needed
    print(response)
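For debugging, here is a minimal sketch of how the finish reason could be checked for each candidate before trying to read any text. It assumes only that the response exposes `candidates`, each with `index`, `finish_reason`, and (when content was generated) `content.parts`, as in the response printed above. The helper name `summarize_candidates` and the `fake_response` stand-in are hypothetical, introduced for illustration, and are not part of the SDK.

```python
from types import SimpleNamespace


def summarize_candidates(response):
    """Return (index, finish_reason, has_parts) for each candidate."""
    rows = []
    for candidate in response.candidates:
        reason = candidate.finish_reason
        # Use the enum name when there is one; otherwise fall back to str().
        reason_name = getattr(reason, "name", str(reason))
        # A candidate stopped early may carry no content at all.
        content = getattr(candidate, "content", None)
        parts = getattr(content, "parts", None) or []
        rows.append((candidate.index, reason_name, len(parts) > 0))
    return rows


# Stand-in object mirroring the response printed above (not a real API call).
fake_response = SimpleNamespace(
    candidates=[SimpleNamespace(finish_reason="RECITATION", index=0)]
)

for index, reason_name, has_parts in summarize_candidates(fake_response):
    if has_parts:
        print(f"candidate {index}: finished with {reason_name}, text available")
    else:
        print(f"candidate {index}: finished with {reason_name}, no content returned")
```

In the loop above, `print(response)` could be swapped for a call to this helper, so that images stopped with RECITATION are logged per file instead of being treated as if they had text.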
When I use the same image and system instruction in the Gemini AI Studio Playground, I receive the correct response. You can see the following images:
1. The first image shows the API response.
2. The second image shows the Gemini AI Studio Playground response.
3. The third image shows the safety settings.
4. The fourth image is the sample image to process.