Batch Mode with Gemini API

Hello!
I am trying to process a large amount of data using Batch Mode in the Gemini API, but I am running into an issue when I have to analyse images. I see that in order to do that I first need to upload the images to the Files API and then reference them in the JSONL. Is there a way to skip the upload step and instead pass the base64 data or the image URL directly in the request?
Below is what Google has provided. I tried with base64 and a URL directly but had no luck on my side; maybe I am missing something.

# Setup (assumes the google-genai SDK and a GEMINI_API_KEY in the environment)
import json

from google import genai
from IPython.display import Image

client = genai.Client()
MODEL_ID = "gemini-2.5-flash"  # assumption: any Batch Mode-capable model works here

# Download sample image
image_path = "jetpack.jpg"
!wget https://storage.googleapis.com/generativeai-downloads/images/jetpack.jpg -O {image_path} -q

print(f"Uploading image file: {image_path}")
image_file = client.files.upload(
    file=image_path,
)
print(f"Uploaded image file: {image_file.name} with MIME type: {image_file.mime_type}")
Image(filename=image_path)

requests_data = [
    # First request: simple text prompt
    {"key": "request_1", "request": {"contents": [{"parts": [{"text": "Explain how AI works in a few words"}]}]}},
    # Second request: multi-modal prompt with text and an image reference
    {
        "key": "request_2_image",
        "request": {
            "contents": [{
                "parts": [
                    {"text": "What is in this image? Describe it in detail."},
                    {"file_data": {"file_uri": image_file.uri, "mime_type": image_file.mime_type}}
                ]
            }]
        }
    }
]


json_file_path = 'batch_requests_with_image.json'

print(f"\nCreating JSONL file: {json_file_path}")
with open(json_file_path, 'w') as f:
    for req in requests_data:
        f.write(json.dumps(req) + '\n')

print(f"Uploading JSONL file: {json_file_path}")
batch_input_file = client.files.upload(
    file=json_file_path,
)
print(f"Uploaded JSONL file: {batch_input_file.name}")

print("\nCreating batch job...")
batch_job_from_file = client.batches.create(
    model=MODEL_ID,
    src=batch_input_file.name,
    config={
        'display_name': 'my-batch-job-with-image',
    }
)
print(f"Created batch job from file: {batch_job_from_file.name}")
print("You can now monitor the job status using its name.")

@Nuni_Telo,
You can convert the images into base64 strings and add those strings directly to the text (prompt).

Check the following code:

import base64

def encode_image(image_path):
    # Read the image bytes and return them as a base64-encoded string
    with open(image_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
    return encoded_string

encoded_string = encode_image(image_path)
prompt = f"What is in the following image? Describe it in detail. {encoded_string}"

That should fix the issue.
Thank you
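
For completeness, the resulting prompt can then go into the JSONL entries exactly like the text-only request in your example. This is just a sketch reusing the requests_data list from your post, and request_3_base64 is an illustrative key:

requests_data.append({
    "key": "request_3_base64",
    "request": {
        "contents": [{
            "parts": [
                # base64 string embedded in the prompt text, as suggested above
                {"text": prompt}
            ]
        }]
    }
})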