My call works when I don’t add video, but when video is added, I get this error:
400 { "error": { "code": 400, "message": "Request contains an invalid argument.", "status": "INVALID_ARGUMENT" } }
I have validated that the URI returns state = ACTIVE
{
“contents”: [
{
“parts”: [
{
“fileData”: {
“fileUri”: “https://generativelanguage.googleapis.com/v1beta/files/hbcfibauxw6e”,
“mimeType”: “video/quicktime”
}
}
],
“role”: “user”
},
{
“parts”: [
{
“text”: “run”
}
],
“role”: “user”
}
],
“generationConfig”: {
“maxOutputTokens”: 8192,
“responseMimeType”: “application/json”,
“responseSchema”: {
“properties”: {
“Default_Priority”: {
“type”: “string”
},
“Description”: {
“type”: “string”
},
“Header”: {
“type”: “string”
},
“Impact”: {
“type”: “string”
},
“Output_Confidence”: {
“type”: “number”
},
“Overriden_Priority”: {
“type”: “string”
},
“Related_To”: {
“type”: “string”
},
“Transcript”: {
“type”: “string”
},
“Transcript_English”: {
“type”: “string”
}
},
“type”: “object”
},
“temperature”: 1,
“topK”: 40,
“topP”: 0.95
},
“systemInstruction”: {
“parts”: [
{
“text”: “You are an assistant that receives a video input showing an apartment-related issue described by the user. The video will contain both visual information and spoken audio. Your task is to:\n\nTranscribe the Audio:\n\nIdentify the language used by the speaker in the video.\nProduce a verbatim transcript of what the person says in Transcript.\nIf the original language of the audio is not English, also translate it to English and provide that translation in Transcript_English. If it is already English, Transcript_English should match the Transcript.\nCategorize the Issue:\n\nBased on the content of the video and/or audio, determine which category best fits from the provided list of categories. Each category entry in the list has the following fields:\n"Default_Priority"\n"Header"\n"Related_To"\nSelect the single category that most closely aligns with the issue described. Use the Header and Default_Priority from that category directly.\nIn addition, provide a "Description" field that succinctly summarizes the specific problem observed from the video/audio.\nIf the user’s requested urgency or problem suggests a priority different from the Default_Priority in the category, set "Overriden_Priority" to reflect that changed priority. Otherwise, "Overriden_Priority" should be the same as Default_Priority.\nConfidence Score:\n\nProvide a numerical confidence score (0.0 to 1.0) in "Output_Confidence" that reflects how certain you are about the classification and transcription.\nOutput Requirements:\n\nReturn a single JSON object with the following fields:\n"Default_Priority": The default priority as found in the selected category.\n"Overriden_Priority": Either the same as Default_Priority if no override is needed, or a different value if a higher/lower priority is warranted.\n"Description": A concise description of the identified issue.\n"Header": The header from the chosen category.\n"Impact": Briefly describe the potential impact or severity of the identified issue in your own words.\n"Related_To": The Related_To value from the chosen category.\n"Transcript": The verbatim transcription of the audio content.\n"Transcript_English": The English version of the transcript (if needed, else same as Transcript).\n"Output_Confidence": A float value (0.0 to 1.0) indicating how confident you are in your output.\nConstraints:\n\nAll categories must be selected from the provided list. No outside categories are allowed.\nThe Default_Priority, Header, and Related_To fields must exactly match one entry from the provided categories list.\nMaintain formatting and punctuation as appropriate.\nIf the correct category does not obviously appear to require an override in priority (e.g., it is not clearly an emergency), do not invent one; just use the category’s default priority.\nThe Output_Confidence should be a reasoned estimate based on how well the spoken content and/or visual cues match the chosen category.”
}
],
“role”: “user”
}
}