Configuring the Gemini SQL Talk app

Hello,

I’m trying to configure the sql-talk-app from the generative-ai github: generative-ai/gemini/function-calling/sql-talk-app/app.py at main · GoogleCloudPlatform/generative-ai · GitHub

I want to require the app to only look in one dataset, but when I try and remove the list_datasets_func part of the app and set the project and dataset IDs for listing tables, the app always tries to get the listed table from the project and dataset: bigquery-public-data:google_analytics_sample

Does anyone know how to stop it from adding those to the table? Thanks

Welcome to the forums!

You don’t say how you’re trying to remove the function, but it sounds like you’re leaving other things in place where it needs to provide a dataset ID and, having no way to know which one to use in it’s request, Gemini is hallucinating one.

Two thoughts:

  • Leave it in, but only provide the dataset ID that you want it to use
  • Change list_tables so it doesn’t require the dataset ID.

Thanks for the answer.

I had removed the list_datasets and had started the sql_query_tool at list_tables.

I then set the params for the project_id and dataset_id to be the ones that I wanted to look in. The function would return the right table name and dataset / project IDs but in the get_table function when I gave it the table_id (even if I added the dataset and project IDs to it) it would still hallucinate the source. Should I also add project and dataset IDs as parameters to the get_table function?

Currently I made a fix by adding: Always use the project_id: {PROJECT_ID} and dataset_id: {BIGQUERY_DATASET_ID} to the prompt.

The definition for list_tables includes a parameter for the dataset ID. So it saw that it needed a dataset ID parameter, had no way to request one, so hallucinated one.

As I said above, you may need to change the definition of list_tables so it doesn’t require the dataset ID.

Hi,

I’m still having trouble with this solution. I first had removed the list_datasets_func and I removed the required dataset_id from list tables, and it lists the tables in the correct dataset. However, the then runs the sql_query_func without attempting the get_table_func which would provide the correct table_id and hallucinates the table to query.

Any idea why this would happen? Thanks

I think I’ve fixed it, in the list tables function I just changed the line which only returns table names:
api_response = str([table.table_id for table in api_response])
to:
api_response = str(["{}.{}.{}".format(table.project, table.dataset_id, table.table_id) for table in api_response])
in order to get the full qualified table_id

1 Like