Configuring the Gemini SQL Talk app

Hello,

I’m trying to configure the sql-talk-app from the generative-ai github: generative-ai/gemini/function-calling/sql-talk-app/app.py at main · GoogleCloudPlatform/generative-ai · GitHub

I want to require the app to only look in one dataset, but when I try and remove the list_datasets_func part of the app and set the project and dataset IDs for listing tables, the app always tries to get the listed table from the project and dataset: bigquery-public-data:google_analytics_sample

Does anyone know how to stop it from adding those to the table? Thanks

1 Like

Welcome to the forums!

You don’t say how you’re trying to remove the function, but it sounds like you’re leaving other things in place where it needs to provide a dataset ID and, having no way to know which one to use in it’s request, Gemini is hallucinating one.

Two thoughts:

  • Leave it in, but only provide the dataset ID that you want it to use
  • Change list_tables so it doesn’t require the dataset ID.

Thanks for the answer.

I had removed the list_datasets and had started the sql_query_tool at list_tables.

I then set the params for the project_id and dataset_id to be the ones that I wanted to look in. The function would return the right table name and dataset / project IDs but in the get_table function when I gave it the table_id (even if I added the dataset and project IDs to it) it would still hallucinate the source. Should I also add project and dataset IDs as parameters to the get_table function?

Currently I made a fix by adding: Always use the project_id: {PROJECT_ID} and dataset_id: {BIGQUERY_DATASET_ID} to the prompt.

The definition for list_tables includes a parameter for the dataset ID. So it saw that it needed a dataset ID parameter, had no way to request one, so hallucinated one.

As I said above, you may need to change the definition of list_tables so it doesn’t require the dataset ID.

Hi,

I’m still having trouble with this solution. I first had removed the list_datasets_func and I removed the required dataset_id from list tables, and it lists the tables in the correct dataset. However, the then runs the sql_query_func without attempting the get_table_func which would provide the correct table_id and hallucinates the table to query.

Any idea why this would happen? Thanks

I think I’ve fixed it, in the list tables function I just changed the line which only returns table names:
api_response = str([table.table_id for table in api_response])
to:
api_response = str(["{}.{}.{}".format(table.project, table.dataset_id, table.table_id) for table in api_response])
in order to get the full qualified table_id

1 Like