Hi there, I’m struggling to get Gemini to summarise multiple documents in one payload.
When I submit the following:
{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "fileData": {
            "fileUri": "https://generativelanguage.googleapis.com/v1beta/files/6yp1nlyt523d",
            "mimeType": "application/pdf"
          }
        },
        {
          "fileData": {
            "fileUri": "https://generativelanguage.googleapis.com/v1beta/files/r9tx3opn6dz",
            "mimeType": "application/pdf"
          }
        }
      ]
    },
    {
      "role": "user",
      "parts": [
        {
          "text": "what can you tell me about all of these files?"
        }
      ]
    }
  ],
  "systemInstruction": {
    "role": "user",
    "parts": [
      {
        "text": "always end a response with 'END MESSAGE'"
      }
    ]
  },
  "generationConfig": {
    "temperature": 1,
    "topK": 40,
    "topP": 0.95,
    "maxOutputTokens": 8192,
    "responseMimeType": "text/plain"
  }
}
Most of the time (roughly 70%), Gemini summarises only one of the supplied files and states that it has only been given one. The remaining 30% or so of requests work as intended.
An equivalent prompt using PNGs works fine every time.
Could someone point me in the right direction here? Since it works some of the time, it might be a problem with my prompt, or maybe I’m malforming the JSON payload (I am building it dynamically in my .NET app). I’m pulling my hair out here.
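To help rule out a malformed payload, here is a minimal sketch of how a request body of this shape could be assembled and sanity-checked before sending. It is written in Python for illustration rather than taken from the .NET app, and `build_payload` and `count_file_parts` are hypothetical helper names, not part of any Gemini SDK. It only mirrors the structure shown above; it does not call the API.

```python
import json


def build_payload(files, prompt, system_text):
    """Build a request body shaped like the one in this post.

    files: list of (file_uri, mime_type) tuples (hypothetical input format).
    """
    file_parts = [
        {"fileData": {"fileUri": uri, "mimeType": mime}}
        for uri, mime in files
    ]
    return {
        "contents": [
            # Same layout as the post: files in one user turn, text in the next.
            {"role": "user", "parts": file_parts},
            {"role": "user", "parts": [{"text": prompt}]},
        ],
        "systemInstruction": {
            "role": "user",
            "parts": [{"text": system_text}],
        },
        "generationConfig": {
            "temperature": 1,
            "topK": 40,
            "topP": 0.95,
            "maxOutputTokens": 8192,
            "responseMimeType": "text/plain",
        },
    }


def count_file_parts(payload):
    """Count how many parts across all turns carry a fileData entry."""
    return sum(
        1
        for content in payload["contents"]
        for part in content["parts"]
        if "fileData" in part
    )


payload = build_payload(
    files=[
        ("https://generativelanguage.googleapis.com/v1beta/files/6yp1nlyt523d", "application/pdf"),
        ("https://generativelanguage.googleapis.com/v1beta/files/r9tx3opn6dz", "application/pdf"),
    ],
    prompt="what can you tell me about all of these files?",
    system_text="always end a response with 'END MESSAGE'",
)

# Confirm both files made it into the serialised body.
print(count_file_parts(payload))
print(json.dumps(payload, indent=2))
```

A check like this only confirms that both file references are present in the JSON that gets sent; it says nothing about how the model treats them once received.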
Edit: with the following prompt, which submits a PDF and an HTML file, only the PDF file is ever recognised. This happens 100% of the time.
{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "fileData": {
            "fileUri": "https://generativelanguage.googleapis.com/v1beta/files/wcn44yqwsqqf",
            "mimeType": "application/pdf"
          }
        },
        {
          "fileData": {
            "fileUri": "https://generativelanguage.googleapis.com/v1beta/files/3w0j07r636cn",
            "mimeType": "text/html"
          }
        }
      ]
    },
    {
      "role": "user",
      "parts": [
        {
          "text": "summarise each of these files and provide a count of all files I have sent you"
        }
      ]
    }
  ],
  "systemInstruction": {
    "role": "user",
    "parts": [
      {
        "text": "always end a response with 'END MESSAGE'"
      }
    ]
  },
  "generationConfig": {
    "temperature": 0.4,
    "topK": 40,
    "topP": 0.95,
    "maxOutputTokens": 8192,
    "responseMimeType": "text/plain"
  }
}
Thanks,
Phil.