How to upload PDFs using the File API to the Gemini API or Google AI file manager

Bhargav · January 18, 2025, 3:16pm

Hey there, I’m currently facing an issue: I want to take a locally stored file from the user via the File API, obtain the File object, and upload it directly to the Google AI file manager, where it can be stored for further content generation. I can’t get this working in Node.js, as its methods and API don’t accept a File object.
One workaround for now is to encode the file in base64 and send that string along with the user prompt to generate content, but this is not recommended for large files above 20MB or so.
What’s the right way to do this, efficiently and with the recommended approach?
[IMPORTANT: I particularly want to upload and use scanned PDFs. Does the Gemini API not support scanned PDFs?]
Help would be really appreciated.
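
For reference, the base64 workaround mentioned above might look roughly like the sketch below. It only builds the request body. The helper name and the size constant are hypothetical, and the 20 MB threshold is simply the figure quoted in this post, not a verified limit.

```javascript
// Sketch only: inline a small PDF as base64 in a generateContent request body.
// MAX_INLINE_BYTES is an assumption taken from the "20MB or so" figure above.
const MAX_INLINE_BYTES = 20 * 1024 * 1024;

// pdfBytes: a Uint8Array or Buffer holding the raw PDF bytes.
function buildInlinePdfRequest(pdfBytes, prompt) {
  if (pdfBytes.length > MAX_INLINE_BYTES) {
    throw new Error("PDF too large to inline; upload it through the File API instead");
  }
  return {
    contents: [
      {
        role: "user",
        parts: [
          { text: prompt },
          {
            inlineData: {
              mimeType: "application/pdf",
              data: Buffer.from(pdfBytes).toString("base64"),
            },
          },
        ],
      },
    ],
  };
}
```

In the browser, the bytes of a File object can be read with `new Uint8Array(await file.arrayBuffer())` before being sent to the Node.js side.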

Bhargav · January 18, 2025, 3:49pm

I tried uploading via the REST API (using https://generativelanguage.googleapis.com/v1beta/files/).

It does seem to upload successfully, but when I provide the uploaded file URI and ask questions about it, it throws an error saying there are no pages in the document.

Again, is this because the API can’t recognize scanned PDFs, or is it something else?

jkirstaetter · January 19, 2025, 8:04am

Hi,
You need to check the state of the uploaded file and see whether it is ACTIVE or not. Only then can you use it as a file resource in your generative requests.

See "Using files" and "Explore vision capabilities with the Gemini API" on Google AI for Developers for more details.
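
The state check described above could be sketched in Node.js (18+, for the built-in `fetch`) as follows. This is a minimal sketch, not the SDK's implementation: the function name, the polling interval, and the attempt limit are assumptions, and `fileName` is expected to be the `name` field returned by the upload (for example `files/abc123`).

```javascript
const FILES_BASE = "https://generativelanguage.googleapis.com/v1beta/";

// Polls the file resource until its state is ACTIVE.
// intervalMs and maxAttempts are arbitrary values chosen for illustration.
async function waitForActiveFile(fileName, apiKey, options = {}) {
  const { fetchImpl = fetch, intervalMs = 2000, maxAttempts = 30 } = options;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetchImpl(FILES_BASE + fileName, {
      headers: { "x-goog-api-key": apiKey },
    });
    if (!res.ok) throw new Error(`Fetching ${fileName} failed with HTTP ${res.status}`);
    const file = await res.json();
    if (file.state === "ACTIVE") return file; // ready to be referenced in requests
    if (file.state === "FAILED") throw new Error(`Processing failed for ${fileName}`);
    // Any other state (e.g. PROCESSING): wait and try again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`${fileName} did not become ACTIVE in time`);
}
```

The returned object carries the `uri` to pass as the file URI in the generation request.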

Cheers

Bhargav · January 19, 2025, 9:49am

Hey, thanks for the reply, but it still doesn’t work.

Right now I’m trying to upload files with this API: https://generativelanguage.googleapis.com/upload/v1beta/files

The upload succeeds, but when I use the REST API for content generation with, for example, this body:

{
  "contents": [
    {
      "parts": [
        {
          "text": "What is this doc about?"
        },
        {
          "file_data": {
            "mime_type": "application/pdf",
            "file_uri": "https://generativelanguage.googleapis.com/v1beta/files/olixgp5qm1zz"
          }
        }
      ]
    }
  ]
}

I get an error response:

{
  "error": {
    "code": 400,
    "message": "Request contains an invalid argument.",
    "status": "INVALID_ARGUMENT"
  }
}

Why would this be? The file is uploaded successfully with status ACTIVE, but the generate content API throws an invalid argument error.

jkirstaetter · January 19, 2025, 11:15am

Hi,

Here’s the REST payload I produce in my test cases for PDF handling.

{
  "model" : "models/gemini-1.5-pro-latest",
  "contents" : [ {
    "role" : "user",
    "parts" : [ {
      "text" : "Your are a very professional document summarization specialist. Please summarize the given document."
    }, {
      "fileData" : {
        "fileUri" : "https://generativelanguage.googleapis.com/v1beta/files/tflhg5hf78m0",
        "mimeType" : "application/pdf"
      }
    } ]
  } ]
}

This gives me an HTTP 200 OK with the information about the PDF document as requested.
The main difference might be the missing role key in your request.
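
Since the question is about JavaScript, sending a payload like the one above without the SDK might look roughly like this in Node.js 18+. It is a sketch under assumptions: the helper names are hypothetical, the model goes into the URL rather than the body, and the API key is passed in the `x-goog-api-key` header.

```javascript
const API_ROOT = "https://generativelanguage.googleapis.com/v1beta";

// Builds a body with the same shape as the payload shown above.
function buildFileRequest(fileUri, prompt, mimeType = "application/pdf") {
  return {
    contents: [
      { role: "user", parts: [{ text: prompt }, { fileData: { fileUri, mimeType } }] },
    ],
  };
}

// model is e.g. "models/gemini-1.5-pro-latest"; fileUri is the uri of an ACTIVE file.
async function generateFromFile(model, apiKey, fileUri, prompt, fetchImpl = fetch) {
  const res = await fetchImpl(`${API_ROOT}/${model}:generateContent`, {
    method: "POST",
    headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
    body: JSON.stringify(buildFileRequest(fileUri, prompt)),
  });
  const data = await res.json();
  if (!res.ok) {
    // Include the whole error object, since the message alone can be generic.
    throw new Error(`HTTP ${res.status}: ${JSON.stringify(data.error)}`);
  }
  // Concatenate the text parts of the first candidate.
  return data.candidates[0].content.parts.map((part) => part.text ?? "").join("");
}
```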

BTW, which programming language are you using? It seems that you’re not using an SDK to solve your tasks…

Cheers

Bhargav · January 24, 2025, 3:47pm

Hey, sorry for the late reply, and thanks for your response, but the above snippet doesn’t work for me either; I still get the 400 error. Can you please show me how you first upload PDF documents using the REST API? Maybe I’m doing something wrong there.

Also, I’m using JavaScript, but the SDK doesn’t quite work for me when using streaming content generation in my app.

Thanks!

Bhargav · January 29, 2025, 2:32pm

Hey @jkirstaetter, I’m waiting for your response. I can’t get this working and don’t know what’s going wrong; maybe your REST API request for uploading PDFs can help me.
It would be really appreciated.

GUNAND_MAYANGLAMBAM · January 29, 2025, 3:21pm

Hey @Bhargav, this colab gist might help.

jkirstaetter · January 30, 2025, 5:32am

Hi,

To upload to the File API I’m using a multipart request which contains the JSON and the (stream of) binary data of the file. Here’s what my code in .NET looks like.

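// Open the local file as a stream (uri holds the local file path).
using var fs = new FileStream(uri, FileMode.Open);
// A multipart/related request: the first part is the JSON metadata, the second the file bytes.
var multipartContent = new MultipartContent("related");
multipartContent.Add(new StringContent(json, Encoding.UTF8, Constants.MediaType));
// Stream the file content in chunks, with its MIME type and total length set explicitly.
multipartContent.Add(new StreamContent(fs, (int)Constants.ChunkSize)
{
    Headers = { 
        ContentType = new MediaTypeHeaderValue(mimeType), 
        ContentLength = totalBytes 
    }
});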

This is sent to the following endpoint: https://generativelanguage.googleapis.com/upload/v1beta/files?alt=json&uploadType=multipart

The JSON part is quite simple, containing only the display name.

{
  "file" : {
    "displayName" : "Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context"
  }
}

However, that’s probably not something you can use outside of .NET.
The Gemini Cookbook repository has samples for Python, JavaScript, and shell scripting that upload a file to the File API in chunks, which might be more interesting for you.
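
As a rough Node.js (18+) counterpart to the .NET snippet above, the multipart/related body can also be assembled by hand. This is a sketch under assumptions rather than the Cookbook's approach: the function names and the boundary format are hypothetical, the JSON part is assumed to be `application/json`, and the API key is passed in the `x-goog-api-key` header.

```javascript
const UPLOAD_URL =
  "https://generativelanguage.googleapis.com/upload/v1beta/files?alt=json&uploadType=multipart";

// Assembles a two-part multipart/related body: JSON metadata, then the raw file bytes.
function buildMultipartRelated(metadata, fileBytes, mimeType, boundary) {
  const head = Buffer.from(
    `--${boundary}\r\n` +
      "Content-Type: application/json; charset=utf-8\r\n\r\n" +
      JSON.stringify(metadata) +
      "\r\n" +
      `--${boundary}\r\n` +
      `Content-Type: ${mimeType}\r\n\r\n`,
    "utf8"
  );
  const tail = Buffer.from(`\r\n--${boundary}--\r\n`, "utf8");
  return Buffer.concat([head, Buffer.from(fileBytes), tail]);
}

// fileBytes: Buffer or Uint8Array with the PDF content, e.g. from fs.promises.readFile.
async function uploadPdf(fileBytes, displayName, apiKey, fetchImpl = fetch) {
  // The boundary only needs to be a string that does not occur in the payload.
  const boundary = "upload-" + Date.now().toString(16) + Math.random().toString(16).slice(2);
  const body = buildMultipartRelated({ file: { displayName } }, fileBytes, "application/pdf", boundary);
  const res = await fetchImpl(UPLOAD_URL, {
    method: "POST",
    headers: {
      "Content-Type": `multipart/related; boundary=${boundary}`,
      "x-goog-api-key": apiKey,
    },
    body,
  });
  const data = await res.json();
  if (!res.ok) throw new Error(`Upload failed with HTTP ${res.status}: ${JSON.stringify(data.error)}`);
  return data.file; // expected to contain name, uri and state
}
```

This version holds the whole file in memory, which is fine for modest PDFs; for very large files the chunked samples are the better fit.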

Cheers

Bhargav · January 30, 2025, 3:11pm

Hey, thanks for the response. I’ll try it and update you.

Bhargav · January 31, 2025, 3:21pm

Hey, I finally got it working. Your responses were really helpful, thanks @jkirstaetter and @GUNAND_MAYANGLAMBAM!
