Why can't I get Gemini to recognize "strikethrough" text in an image

Ron_Parker · July 13, 2024, 5:24am

Thanks @Diet !!

I finally got something working. Yes, using Claude through the Anthropic Vertex SDK (AnthropicVertex) appears to recognize the strikethrough text.

I am using code similar to this:

import base64
import httpx
from anthropic import AnthropicVertex

LOCATION = "europe-west1"  # or "us-east5"

client = AnthropicVertex(region=LOCATION, project_id="PROJECT_ID")

image1_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
image1_media_type = "image/jpeg"
image1_data = base64.b64encode(httpx.get(image1_url).content).decode("utf-8")

message = client.messages.create(
  max_tokens=1024,
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "image",
          "source": {
            "type": "base64",
            "media_type": image1_media_type,
            "data": image1_data,
          },
        },
        {
          "type": "text",
          "text": "Describe this image."
        }
      ],
    }
  ],
  model="claude-3-5-sonnet@20240620",
)
print(message.model_dump_json(indent=2))

The sample comes from this Google Cloud documentation: Google Cloud console

Last question: I need to extract text from multi-page PDFs. So, task 1 is to convert each page to an image. However, I need to know how to send multiple images to Claude in one API call.

Suggestions?
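
For what it's worth, here is a minimal sketch of one way this might work, not a confirmed solution. As far as I can tell, the "content" list in a user message can hold more than one image block, so each page image could go in as its own block, followed by the text prompt. The helper name build_multi_image_content, the "Page N:" labels, the PNG media type, and the placeholder bytes are all my own assumptions for illustration. Converting the PDF pages to images is not shown, and the API's limits on image count and size still apply.

```python
import base64


def build_multi_image_content(pages, prompt, media_type="image/png"):
    """Build a content list with one image block per page, then the prompt.

    `pages` is a list of raw image bytes, assumed to be one rendered
    PDF page each (hypothetical input; rendering is not shown here).
    """
    content = []
    for index, page_bytes in enumerate(pages, start=1):
        # Label each page so the reply can refer to pages by number.
        content.append({"type": "text", "text": f"Page {index}:"})
        content.append({
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": media_type,
                "data": base64.b64encode(page_bytes).decode("utf-8"),
            },
        })
    # The instruction goes last, after all of the images.
    content.append({"type": "text", "text": prompt})
    return content


# Placeholder bytes stand in for real page images in this sketch.
content = build_multi_image_content(
    [b"page-one", b"page-two"],
    "Extract the text from each page, marking any strikethrough text.",
)
```

The resulting list would then replace the single-image "content" in the call above, i.e. messages=[{"role": "user", "content": content}].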
