Response Schema from Pydantic?

Hey! I’m a bit confused by the Controlled Generation docs. Is there an easy way to populate a response_schema?

For example, I have a Pydantic model like:

from pydantic import BaseModel


class LabeledText(BaseModel):
    text: str
    categories: list[str]

In the OpenAI API I can pass the Pydantic model directly as the structured output format. How can I do the same in the Gemini API?


Hi @Amir_Bakarov , We have a similar example with typing_extensions. Please check out this link.

You can also use pydantic. For example:

import google.generativeai as genai
from pydantic import BaseModel


class Recipe(BaseModel):
    recipe_name: str
    recipe_description: str
    recipe_ingredients: list[str]


model = genai.GenerativeModel(model_name="models/gemini-1.5-flash-latest")

result = model.generate_content(
    "List a few imaginative cookie recipes along with a one-sentence description as if you were a gourmet restaurant and their main ingredients",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema=list[Recipe],  # ask for a JSON array of Recipe objects
    ),
    request_options={"timeout": 600},
)

Hope it helps!


Hi @GUNAND_MAYANGLAMBAM

I am also having some trouble with the controlled generation documentation. I initially tried to convert my pydantic structure into a response_schema but ran into an issue where “additionalProperties” isn’t supported on VertexAI.

response_schema = {
    "type": "OBJECT",
    "properties": {
        "Event Name": {"type": "STRING"},
        "Attributes": {
            "type": "OBJECT",
            "additionalProperties": {"type": "STRING"},
            "description": "Dynamic key-value pairs for event attributes",
        },
    },
}

Continuing with the example above, I would like to force recipe_ingredients to be not just a list of ingredients, but a dictionary of ingredient and ingredient_amount pairs.

from typing import Dict, Optional

from pydantic import BaseModel
from vertexai.generative_models import GenerationConfig


class Recipe(BaseModel):
    recipe_name: str
    recipe_description: str
    recipe_ingredients: Dict[str, Optional[str]]  # ingredient -> amount


generation_config = GenerationConfig(
    temperature=float(config["temperature"]),
    response_mime_type="application/json",
    response_schema=list[Recipe],
)

but I am receiving an AttributeError: type object 'list' has no attribute 'get' error. Does this have anything to do with using the GenerationConfig from vertexai as opposed to genai? Is it possible to make the original response_schema version work?

Also, enclosing the class in a list:
response_schema=list[Recipe],
causes a vertexai error:

  File "/Users/jsnowdon/Downloads/Conviva/deepinsights/activation-llm/src/activation_llm/inference.py", line 172, in run_gemini_inference
    generation_config = GenerationConfig(
                        ^^^^^^^^^^^^^^^^^
  File "/Users/jsnowdon/.pyenv/versions/activation-llm/lib/python3.12/site-packages/vertexai/generative_models/_generative_models.py", line 1747, in __init__
    raw_schema = FunctionDeclaration(
                 ^^^^^^^^^^^^^^^^^^^^
  File "/Users/jsnowdon/.pyenv/versions/activation-llm/lib/python3.12/site-packages/vertexai/generative_models/_generative_models.py", line 2176, in __init__
    _fix_schema_dict_for_gapic_in_place(parameters)
  File "/Users/jsnowdon/.pyenv/versions/activation-llm/lib/python3.12/site-packages/vertexai/generative_models/_generative_models.py", line 2217, in _fix_schema_dict_for_gapic_in_place
    if items_schema := schema_dict.get("items"):
                       ^^^^^^^^^^^^^^^
AttributeError: type object 'list' has no attribute 'get'
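
The last frame of the traceback calls schema_dict.get("items") on whatever was passed as response_schema, which suggests that this version of the vertexai SDK expects a plain dict there rather than a Python type such as list[Recipe]. A minimal sketch under that assumption wraps an object schema in an array schema by hand. array_of is a hypothetical helper, not part of vertexai, and the recipe schema is abbreviated.

```python
def array_of(item_schema: dict) -> dict:
    """Wrap an object schema in an array schema (the dict counterpart of list[Model])."""
    return {"type": "ARRAY", "items": item_schema}


# Abbreviated, hand-written schema for one recipe.
recipe_schema = {
    "type": "OBJECT",
    "properties": {
        "recipe_name": {"type": "STRING"},
        "recipe_description": {"type": "STRING"},
    },
    "required": ["recipe_name", "recipe_description"],
}

response_schema = array_of(recipe_schema)

# A dict supports .get(), which is the call that failed on list[Recipe].
print(response_schema.get("items") is recipe_schema)  # prints True
```

The resulting dict could then be passed as response_schema in place of list[Recipe]. Whether the service accepts it still depends on the fields used inside the item schema.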

Hi @Jack_Snowdon , Welcome to the forum.

It seems that Vertex AI does not allow additionalProperties when defining a response schema. Instead, you can represent the dictionary as an array of key-value pairs. For example, you can try the following schema:

{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "recipe_name": {
        "type": "string"
      },
      "recipe_description": {
        "type": "string"
      },
      "recipe_ingredients": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "ingredients_name": {
              "type": "string"
            },
            "ingredient_amount": {
              "type": "integer"
            }
          },
          "required": ["ingredients_name", "ingredient_amount"]
        }
      }
    },
    "required": ["recipe_name", "recipe_description", "recipe_ingredients"]
  }
}
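
Once the model returns JSON in this shape, the key-value pairs can be folded back into an ordinary dictionary on the client side. The following is a minimal sketch using only the standard library. pairs_to_dict is a hypothetical helper, and the raw string is a made-up response that matches the schema, not real model output.

```python
import json


def pairs_to_dict(pairs, key_field="ingredients_name", value_field="ingredient_amount"):
    """Turn [{"ingredients_name": k, "ingredient_amount": v}, ...] into {k: v, ...}."""
    return {pair[key_field]: pair[value_field] for pair in pairs}


# Made-up response text in the shape the schema above describes.
raw = """
[
  {
    "recipe_name": "Lavender Shortbread",
    "recipe_description": "Buttery and floral.",
    "recipe_ingredients": [
      {"ingredients_name": "butter", "ingredient_amount": 200},
      {"ingredients_name": "flour", "ingredient_amount": 300}
    ]
  }
]
"""

recipes = json.loads(raw)
for recipe in recipes:
    # Replace the array of pairs with the dictionary the caller actually wants.
    recipe["recipe_ingredients"] = pairs_to_dict(recipe["recipe_ingredients"])

print(recipes[0]["recipe_ingredients"])  # prints {'butter': 200, 'flour': 300}
```

If the same ingredient name appears twice, the later pair wins, so deduplication would need to be handled separately if it matters.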

Thanks for the reply @GUNAND_MAYANGLAMBAM !

So I have pursued this route and am enforcing a dictionary output with a list of keys and a list of values. I am now running into issues with the formatting of the response_schema arg in GenerationConfig.

If I pass it as:

generation_config = GenerationConfig(
            temperature=float(config["temperature"]),
            response_mime_type="application/json",
            response_schema=ResponseSchema.schema()
        )

I get an error:

google.protobuf.json_format.ParseError: Message type "google.cloud.aiplatform.v1beta1.Schema" has no field named "$defs" at "Schema".
 Available Fields(except extensions): "['type', 'format', 'title', 'description', 'nullable', 'default', 'items', 'minItems', 'maxItems', 'enum', 'properties', 'propertyOrdering', 'required', 'minProperties', 'maxProperties', 'minimum', 'maximum', 'minLength', 'maxLength', 'pattern', 'example', 'anyOf']"

If I define it as:

response_schema=ResponseSchema,

then I get another error:

TypeError: argument of type 'ModelMetaclass' is not iterable

And if I define it as:

response_schema=list[ResponseSchema]

then I get this error:

    if items_schema := schema_dict.get("items"):
                       ^^^^^^^^^^^^^^^
AttributeError: type object 'list' has no attribute 'get'

Based on this post (Is it possible to have a schema with no $ref references ? · Issue #889 · pydantic/pydantic · GitHub), my nested pydantic classes pose an issue because they introduce these $ref keys. Is there a proper way to pass in a nested pydantic schema?
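
One possible workaround is to expand the references before handing the schema to GenerationConfig. The sketch below is not an official API: inline_refs is a hypothetical helper that works on the plain dict a Pydantic v2 model produces (local "#/$defs/Name" references collected under a top-level "$defs" key). It assumes the models are not recursive, and the example schema is written by hand in that shape.

```python
import copy


def inline_refs(schema: dict) -> dict:
    """Return a copy of a JSON Schema dict with local "#/$defs/..." references expanded."""
    defs = schema.get("$defs", {})
    prefix = "#/$defs/"

    def resolve(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith(prefix):
                target = copy.deepcopy(defs[ref[len(prefix):]])
                # Keep sibling keys such as "description" next to the expanded target.
                siblings = {k: v for k, v in node.items() if k != "$ref"}
                return resolve({**target, **siblings})
            # Drop the "$defs" table itself; it is no longer needed once inlined.
            return {k: resolve(v) for k, v in node.items() if k != "$defs"}
        if isinstance(node, list):
            return [resolve(item) for item in node]
        return node

    return resolve(schema)


# Hand-written schema in the shape produced for a model with one nested model.
nested = {
    "$defs": {
        "Ingredient": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "amount": {"type": "string"}},
            "required": ["name", "amount"],
        }
    },
    "type": "object",
    "properties": {
        "recipe_name": {"type": "string"},
        "recipe_ingredients": {"type": "array", "items": {"$ref": "#/$defs/Ingredient"}},
    },
    "required": ["recipe_name", "recipe_ingredients"],
}

flat = inline_refs(nested)
print(flat["properties"]["recipe_ingredients"]["items"]["required"])  # prints ['name', 'amount']
```

This only removes "$defs" and "$ref". Other keywords that the service rejects would still need to be handled separately.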

First, judging from the linked URL, I think the question is about the Vertex AI API and not the Google AI API.

Second, there is no official source on Vertex AI API that says we can use TypedDict or BaseModel classes to define a schema. It looks like only clumsy user-unfriendly response schema structs are supported (as indicated by the OP’s link), and nothing else, which is quite a shame.

Third, it doesn’t seem that Pydantic models are officially supported on any API (Google AI or Vertex AI). I have a fairly complex BaseModel and I am facing errors like ValueError: Unknown field for Schema: default when using it in requests.

Finally, there is a pre-release library (google-genai) that is trying to unify the experience but it is still not production-ready and using it causes many Pydantic validation errors upon receiving legit responses.

Is there at least an automatic converter somewhere or a function to convert a nested BaseModel / TypedDict to JSON schema structs for Vertex AI API?

EDIT: I found a visual editor in Google AI Studio (aistudio.google) that helps construct a JSON struct schema. It might be useful.
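
One possible starting point for such a converter is a filter that drops JSON Schema keywords the API rejects. The sketch below assumes the allow-list is the "Available Fields" list from the ParseError quoted earlier in the thread. keep_supported is a hypothetical helper, not part of any Google SDK, and it does not resolve $ref entries.

```python
# Field names copied from the "Available Fields" list in the ParseError quoted in this thread.
ALLOWED_KEYS = frozenset({
    "type", "format", "title", "description", "nullable", "default", "items",
    "minItems", "maxItems", "enum", "properties", "propertyOrdering", "required",
    "minProperties", "maxProperties", "minimum", "maximum", "minLength",
    "maxLength", "pattern", "example", "anyOf",
})


def keep_supported(schema: dict, allowed=ALLOWED_KEYS) -> dict:
    """Return a copy of a schema dict that keeps only keywords named in `allowed`."""
    cleaned = {}
    for key, value in schema.items():
        if key not in allowed:
            continue
        if key == "properties":
            # Keys here are user-defined field names, so keep them all and clean each sub-schema.
            cleaned[key] = {name: keep_supported(sub, allowed) for name, sub in value.items()}
        elif key == "items":
            cleaned[key] = keep_supported(value, allowed)
        elif key == "anyOf":
            cleaned[key] = [keep_supported(option, allowed) for option in value]
        else:
            cleaned[key] = value
    return cleaned


# Hand-written example; the field is deliberately named "title" to show that field names survive.
schema = {
    "type": "object",
    "title": "Recipe",
    "additionalProperties": False,
    "properties": {
        "title": {"type": "string", "default": "untitled", "examples": ["Lavender Shortbread"]},
    },
}

# Also drop "default", which raised "Unknown field for Schema: default" in the case above.
cleaned = keep_supported(schema, ALLOWED_KEYS - {"default"})
print(cleaned)
```

Silently dropping a keyword changes what the schema promises, so it may be worth logging the removed keys before relying on the result.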