Gemini 2.0 Flash: Structured Output performs poorly, ignores 'description' in schema

I used “Gemini 1.5 Flash” via API to extract data from PDF files. I also used Structured Output, as I expected to receive specific fields in the response.

After receiving an email from Google stating “We’re discontinuing certain Gemini 1.5 models starting May 2025,” I started exploring other models.

I tried “Gemini 2.0 Flash”, but after simply replacing the model name, the results were cut down: many of the schema fields are missing from the response. This is strange, considering Gemini 2.0 Flash was announced as an improved version of 1.5.

At the same time, “Gemini 2.0 Flash-Lite” still returns good data.

Example PDF file: https://publicity.businessportal.gr/api/download/YMSdata/100784?companyId=154490904000 (a public company registration file, in Greek)

Prompt:

Extract data from the document following the descriptions in json schema

Structured Output Schema:

{
  "type": "object",
  "description": "This is a json schema that defines how to extract company details from a document that includes data from a Company Registration Service or a similar governmental or state agency Service.",
  "properties": {
    "company_name": {
      "type": "string",
      "description": "Identify and provide the Name of the Company that this registry document is referring to."
    },
    "company_trade_name": {
      "type": "string",
      "description": "Identify and provide the Company Trade Name of the Company."
    },
    "company_address": {
      "type": "string",
      "description": "Identify and provide the registered address of the Company."
    },
    "company_registration_number": {
      "type": "string",
      "description": "Identify and provide the Registration Number of the Company."
    },
    "company_registration_country": {
      "type": "string",
      "description": "Identify and provide the registration Country of the Company."
    },
    "company_file_clarifications": {
      "type": "string",
      "description": "If the document is unclear or ambiguous regarding any of the above, please state the specific ambiguity and where it occurs in the document. Explain any related information that might be helpful."
    },
    "company_file_summarized_text": {
      "type": "string",
      "description": "Make a summary of the document in English."
    }
  }
}
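
A side note on the schema above: in JSON Schema, properties that are not listed under `required` are optional, so a response that omits them is still valid against the schema. Below is a minimal sketch of marking every property as required before sending the schema. `mark_all_required` is a hypothetical helper, the example schema is shortened for illustration, and whether this changes the behavior of gemini-2.0-flash has not been verified in this thread.

```python
import copy


def mark_all_required(schema: dict) -> dict:
    """Return a copy of an object schema with every property listed under 'required'.

    Hypothetical helper: it only handles the top-level object, which is
    enough for a flat schema like the one in this post.
    """
    result = copy.deepcopy(schema)  # leave the caller's schema untouched
    if result.get("type") == "object" and "properties" in result:
        result["required"] = list(result["properties"].keys())
    return result


# Shortened version of the schema, for illustration only.
schema = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "company_file_summarized_text": {"type": "string"},
    },
}

strict_schema = mark_all_required(schema)
print(strict_schema["required"])
```

The copy keeps the original schema unchanged, so both variants can be sent to the same model and compared.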

Result for Gemini 2.0 Flash:

{
	"company_name": "ΕΤΑΙΡΙΑ ΜΕΛΕΤΩΝ ΥΠΗΡΕΣΙΩΝ ΚΑΙ ΛΟΓΙΣΜΙΚΟΥ ΓΕΩΧΩΡΙΚΗΣ ΠΛΗΡΟΦΟΡΙΑΣ Ε.Ε.",
	"company_trade_name": "KIKLO "
}

Result for Gemini 2.0 Flash-Lite:

{
	"company_address": "Εγνατίας 154, ΔΕΘ Περίπτερο 1, Θεσσαλονίκη, 54636",
	"company_file_clarifications": "The document refers to an \"ΕΤΑΙΡΙΑ ΜΕΛΕΤΩΝ ΥΠΗΡΕΣΙΩΝ ΚΑΙ ΛΟΓΙΣΜΙΚΟΥ ΓΕΩΧΩΡΙΚΗΣ ΠΛΗΡΟΦΟΡΙΑΣ Ε.Ε.\" (Company of Studies, Services, and Software of Geospatial Information E.E.) and its trade name is \"KIKLO\".",
	"company_file_summarized_text": "This document is a registration of the company  \"ΕΤΑΙΡΙΑ ΜΕΛΕΤΩΝ ΥΠΗΡΕΣΙΩΝ ΚΑΙ ΛΟΓΙΣΜΙΚΟΥ ΓΕΩΧΩΡΙΚΗΣ ΠΛΗΡΟΦΟΡΙΑΣ Ε.Ε.\" (Company of Studies, Services, and Software of Geospatial Information E.E.)  !!!! CUT manually by me !!!! ",
	"company_name": "ΕΤΑΙΡΙΑ ΜΕΛΕΤΩΝ ΥΠΗΡΕΣΙΩΝ ΚΑΙ ΛΟΓΙΣΜΙΚΟΥ ΓΕΩΧΩΡΙΚΗΣ ΠΛΗΡΟΦΟΡΙΑΣ Ε.Ε.",
	"company_registration_country": "Greece",
	"company_registration_number": "154490904000",
	"company_trade_name": "KIKLO"
}

I got a similar result with Gemini 1.5 Flash.

For other document types with different response schemas, the problem remains the same. It looks like Gemini 2.0 Flash skips the “description” field.
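
To compare models across many documents, the missing fields can be counted automatically. This is a minimal sketch using only the standard library; `missing_fields` is a hypothetical helper, and the sample string reproduces the Gemini 2.0 Flash result shown earlier in this post.

```python
import json

# Property names from the Structured Output schema in this post.
SCHEMA_FIELDS = [
    "company_name",
    "company_trade_name",
    "company_address",
    "company_registration_number",
    "company_registration_country",
    "company_file_clarifications",
    "company_file_summarized_text",
]


def missing_fields(response_text: str, expected=SCHEMA_FIELDS) -> list:
    """Return the expected property names that are absent from a JSON response."""
    data = json.loads(response_text)
    return [name for name in expected if name not in data]


# The Gemini 2.0 Flash response from this post.
flash_response = (
    '{"company_name": "ΕΤΑΙΡΙΑ ΜΕΛΕΤΩΝ ΥΠΗΡΕΣΙΩΝ ΚΑΙ ΛΟΓΙΣΜΙΚΟΥ ΓΕΩΧΩΡΙΚΗΣ ΠΛΗΡΟΦΟΡΙΑΣ Ε.Ε.", '
    '"company_trade_name": "KIKLO "}'
)

missing = missing_fields(flash_response)
print(f"{len(missing)} of {len(SCHEMA_FIELDS)} fields missing: {missing}")
```

For the response above this reports five of the seven schema fields as missing.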

Hi @Andrii_Vovk, welcome to the forum.

You are right, gemini-2.0-flash seems to perform poorly compared to 2.0-flash-lite in terms of getting structured output. I checked with the 2.0-pro-exp model, and it worked as expected.
By the way, do you have the English version of the PDF? I just wanted to test if it is a language-related issue.

Hi @GUNAND_MAYANGLAMBAM. Thanks for your attention. You are right about the 2.0-pro-exp model — it worked as expected. However, there are several reasons why I do not want to use it:

  • it is experimental;
  • even though it is free for now, it is limited to 2 RPM and 50 requests/day;
  • latency is high (execution time of 15 to 33 seconds in my tests).

These factors make it unsuitable for use in a SaaS environment.

I found an English version of the PDF, but it is for a different company: https://www.tbgs.co.uk/wp-content/uploads/2018/04/certificate-of-incorporation.pdf

Again, the gemini-2.0-flash result is poor compared to 2.0-flash-lite or 2.0-pro-exp.

gemini-2.0-flash result:

{
	"company_address": "Shiphay Manor\nTorquay\nDevon\nTQ2 7EL",
	"company_name": "TORQUAY BOYS' GRAMMAR SCHOOL",
	"company_registration_country": "England and Wales",
	"company_registration_number": "7394671"
}

2.0-flash-lite result:

{
	"company_address": "SHIPHAY MANOR\nTORQUAY\nDEVON\nTQ2 7EL",
	"company_file_clarifications": "This document is a Certificate of Incorporation for a Private Limited Company.\n\nThe document also includes:\n- Memorandum of Association of TORQUAY BOYS' GRAMMAR SCHOOL\n- Articles of Association of TORQUAY BOYS' GRAMMAR SCHOOL COMPANY\n- IN01 form",
	"company_file_summarized_text": "This document is a Certificate of Incorporation for TORQUAY BOYS' GRAMMAR SCHOOL, a Private Limited Company. The registration date is 1st October 2010, and the company number is 7394671. The registered address is SHIDAY MANOR, TORQUAY, DEVON, TQ2 7EL. The document also includes a Memorandum and Articles of Association, along with an IN01 form.",
	"company_name": "TORQUAY BOYS' GRAMMAR SCHOOL",
	"company_registration_country": "England/Wales",
	"company_registration_number": "7394671",
	"company_trade_name": "TORQUAY BOYS' GRAMMAR SCHOOL"
}

2.0-pro-exp result:

{
	"company_address": "Shiphay Manor Drive, Torquay, Devon, TQ2 7EL",
	"company_file_clarifications": "The company file is composed by several parts, each one of them containing information that overlaps with each other, but at the same time there are pieces of information that are only present in one part and not the others. So, in order to get all the information correctly, all the parts of the document have to be reviewed.",
	"company_file_summarized_text": "The document is the Certificate of Incorporation, Application to Register a Company, and Memorandum of Association of TORQUAY BOYS' GRAMMAR SCHOOL (Company Number: 7394671). Registered on 1st October 2010 in England and Wales, is a private company limited by guarantee.\n\nThe application includes proposed officers as follow: \nSecretary: Mr. Andrew Medhurst.\nDirectors: Brian William Wills-Pope, Christine Weston, and Michael Edward Penfold.\n\nThe memorandum lists Brian William Wills-Pope, Christine Weston and Michael Edward Penfold as subscribers. The Articles of Association outline the company's operational framework, objectives which focus on advancing public benefit education in the UK, and powers. The company's governance structure, member liabilities, and operational guidelines, emphasizing compliance with UK legal and regulatory standards are defined.",
	"company_name": "TORQUAY BOYS' GRAMMAR SCHOOL",
	"company_registration_country": "England and Wales",
	"company_registration_number": "7394671",
	"company_trade_name": "N/A"
}

Thanks for flagging the issue, and the comparison between the models is really helpful.
I will escalate the issue to the engineering team.

Had the exact same issue. Decided I’m sticking with Gemini 1.5 Flash till the end of life. Hopefully this will be resolved soon.
