Gemini has been touted as a powerful multimodal model, yet in real-world use cases, it frequently fails at executing structured, rule-based tasks. A prime example is its inability to consistently process bank statements following strict formatting and validation rules.
Test Scenario: Extracting and Structuring Bank Transactions
The task given to Gemini involved extracting transaction data from bank statements (PDFs or images) and formatting them into JSON according to a rigid set of rules. Every transaction needed dual JSON outputs (Version 1 and Version 2) with precise date formatting, amount processing, and transaction categorization.
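To make the expected output concrete, here is a minimal sketch of what "dual JSON outputs" per transaction might look like. The actual Version 1 / Version 2 field layouts were defined in the original prompt and are not reproduced here, so the field names below are assumptions for illustration only.

```python
import json

def to_versions(txn: dict) -> tuple[str, str]:
    # Hypothetical layouts: Version 1 is a minimal record, Version 2
    # adds the description. The real rule set may differ.
    v1 = {"date": txn["date"], "amount": txn["amount"], "type": txn["type"]}
    v2 = {**v1, "description": txn.get("description", "")}
    return json.dumps(v1), json.dumps(v2)
```

The point of the dual output is that every transaction must yield both strings; skipping one is exactly the failure described below.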
Key Failures Observed
Forgetting Instructions Midway
Despite explicit, detailed instructions, Gemini often ignored key steps, leading to incomplete or incorrect outputs.
It failed to consistently generate both required JSON versions, sometimes omitting one entirely.
Inconsistent Data Processing
Certain transactions were misclassified (e.g., a debit was marked as a credit).
It occasionally misinterpreted dates, failing to apply the correct YYYY-MM-DD format.
Amounts ending in “.000” (e.g., “40.000”) were sometimes left unchanged, despite clear rules to remove the suffix.
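Both of these rules are mechanically simple, which is what makes the failures frustrating. A sketch of the date and amount normalization described above (the accepted input date formats are assumptions, since the statements' exact formats were not given):

```python
from datetime import datetime

def normalize_date(raw: str) -> str:
    # Try a few common statement date formats; the actual formats
    # on the source statements are an assumption here.
    for fmt in ("%d/%m/%Y", "%d-%m-%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(raw, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date: {raw!r}")

def normalize_amount(raw: str) -> str:
    # Per the rule above: drop a trailing ".000" suffix, leave
    # everything else untouched.
    return raw[:-4] if raw.endswith(".000") else raw
```

For example, "15/03/2024" should become "2024-03-15", and "40.000" should become "40", every time.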
Logical Errors in Transaction Handling
For specific amounts requiring TVA (VAT) splitting (e.g., 0.595), it sometimes created incorrect JSON structures.
The validation rules were ignored in some cases, leading to missing or misplaced fields in the final JSON.
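Checks like these are easy to express in code, which is why their omission stands out. Below is a hedged sketch of a post-hoc validator; the required field set and the debit/credit vocabulary are assumptions, since the original rule set was not published with the post.

```python
import re

# Assumed schema: the real prompt's required fields may differ.
REQUIRED_FIELDS = {"date", "amount", "type", "description"}

def validate_transaction(txn: dict) -> list:
    """Return a list of validation errors (empty means valid)."""
    errors = []
    missing = REQUIRED_FIELDS - txn.keys()
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    if txn.get("type") not in ("debit", "credit"):
        errors.append(f"invalid type: {txn.get('type')!r}")
    if "date" in txn and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", txn["date"]):
        errors.append(f"bad date format: {txn['date']!r}")
    return errors
```

Running a validator like this over Gemini's output is one workaround: instead of trusting the model to follow every rule, reject and retry any transaction that fails the checks.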
Inability to Correct Its Own Mistakes
Even after these errors were pointed out, Gemini often failed to fix them in subsequent attempts.
Has anyone else experienced these kinds of issues with Gemini, or is it just me?