Deep Research Agent API Questions for Rate Limits and Costs

I just got the email about Deep Research API access and started experimenting with it. The documentation is helpful for basic usage, but I’m trying to understand a few things for production use:

  1. Concurrent interactions - Can I run multiple background=True research tasks simultaneously? If so, is there a limit per API key?

  2. Interaction storage/retention - The docs mention store=True is required for background execution. How long are completed interactions retained? Can I retrieve results hours/days later?

  3. Cost estimation - For a typical research task that takes 5-10 minutes, roughly how many tokens are consumed? The pricing page shows per-token costs, but it’s hard to estimate when the agent is autonomously searching and reading.

Has anyone built production workflows with this yet?
Curious about real-world usage patterns and costs.

Man I can’t even get it to give me access… I’m on Paid Tier 1 – but get 404 when using deep-research-pro-preview-12-2025

I’m using it but running into a few issues:

  • When streaming, thinking responses arrive in batches of 2-3 roughly every 15s instead of being returned individually
  • The responses (including the completion event) never contain usage metadata

I haven’t used preview APIs from Google before, so I don’t know if these are typical early issues or not. My code is very much the code from the Google examples, so I don’t think the problem is in my code, but I have written enough code in my time to know that it could always have issues.

Thanks for the feedback, folks. I’m the Product Manager for this feature; sorry about the delay.

Concurrent interactions - Can I run multiple background=True research tasks simultaneously? If so, is there a limit per API key?

You can run multiple background=True requests simultaneously.
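Since no per-key limit is stated here, a client-side cap on in-flight research tasks may be a prudent default. The following is a minimal sketch, not an official pattern: `runWithLimit` is a hypothetical helper, and each task is assumed to be a function that starts one interaction and resolves when it finishes (for example by consuming the stream or by polling interactions.get()).

```typescript
// Hypothetical helper (not part of any SDK): runs async tasks with a
// client-side cap on how many are in flight at once.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unstarted task until none remain.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  // Never start more workers than tasks, and always start at least one.
  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // results keep the order of the input tasks
}
```

Results come back in the same order as the input tasks, and a rejected task rejects the whole batch, so wrap individual tasks in try/catch if partial results are acceptable.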

The docs mention store=True is required for background execution. How long are completed interactions retained? Can I retrieve results hours/days later?

On the free tier, interactions are retained for 1 day; on paid tiers, for 55 days. This is documented in the Interactions API guide: https://ai.google.dev/gemini-api/docs/interactions
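For anyone planning to fetch results later, the retention windows above can be turned into a simple deadline check. This is only a sketch: `retentionDeadline` and `isLikelyRetained` are hypothetical helpers, and it assumes the window is counted from a timestamp you record yourself (the thread does not say exactly when the clock starts).

```typescript
type Tier = "free" | "paid";

// Retention windows as quoted in this thread: 1 day (free), 55 days (paid).
const RETENTION_DAYS: Record<Tier, number> = { free: 1, paid: 55 };

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Assumption: the retention clock starts at `recordedAt`, a timestamp the
// caller stores when the interaction is created or completes.
function retentionDeadline(recordedAt: Date, tier: Tier): Date {
  return new Date(recordedAt.getTime() + RETENTION_DAYS[tier] * MS_PER_DAY);
}

// True while the stored interaction should still be retrievable.
function isLikelyRetained(
  recordedAt: Date,
  tier: Tier,
  now: Date = new Date()
): boolean {
  return now.getTime() < retentionDeadline(recordedAt, tier).getTime();
}
```

If results matter beyond the window, copying them into your own storage as soon as the interaction completes avoids depending on the deadline at all.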

For a typical research task that takes 5-10 minutes, roughly how many tokens are consumed? The pricing page shows per-token costs, but it’s hard to estimate when the agent is autonomously searching and reading.

It depends, but I’ll try to get some estimate and document it to give more guidance.

Man I can’t even get it to give me access… I’m on Paid Tier 1 – but get 404 when using deep-research-pro-preview-12-2025

This is only available on the Interactions API. Are you sure you’re using the right API?

If so, can you share code for reproduction?

when streaming responses it is batching thinking responses in groups of 2-3 every 15s instead of returning each individually

I will investigate this further. Meanwhile, do you have sample code + output we can look at?
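One way to capture that output is to record the gap between consecutive chunks, so batching shows up as several near-zero gaps followed by a long pause. A minimal sketch, assuming only that the stream is an async iterable; `withArrivalGaps` is a hypothetical wrapper, not an SDK function.

```typescript
// Hypothetical wrapper: yields each item from an async iterable together
// with the milliseconds elapsed since the previous item arrived.
// The clock is injectable so the timing logic can be tested deterministically.
async function* withArrivalGaps<T>(
  source: AsyncIterable<T>,
  now: () => number = Date.now
): AsyncGenerator<{ item: T; gapMs: number }> {
  let previous = now(); // the first gap is measured from when iteration begins
  for await (const item of source) {
    const current = now();
    yield { item, gapMs: current - previous };
    previous = current;
  }
}
```

Wrapping the stream in `for await (const { item, gapMs } of withArrivalGaps(stream))` and logging `gapMs` next to the event type should give a timeline that is easy to share.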

The responses (including the completion event) never contain usage metadata

Is this during streaming or do you run into this problem when running without streaming as well?

(Regardless, this seems like a bug and we will investigate and fix it, but it would be helpful to understand more.)

I’ll keep monitoring this thread, but also feel free to reach out to me directly at alicevik@google.com


Thanks for the reply and thanks to your team for a really exciting new product.

My code is basically the code from the sample on the documentation page:

To start the interaction:

const stream = await client.interactions.create({
  input: query,
  agent: 'deep-research-pro-preview-12-2025',
  background: true,
  stream: true,
  agent_config: {
    type: 'deep-research',
    thinking_summaries: 'auto'
  }
});

Then looping through the events:

for await (const chunk of stream) {
  console.log('[Stream] Received chunk at ' + new Date().toISOString());
  const event = chunk as any;
  eventCount++;

  console.log(`[Event ${eventCount}] ${event.event_type}`);

  // Capture interaction ID
  if (event.event_type === 'interaction.start') {
  ...

The usage metadata is missing both from the streaming chunks and from the result of interactions.get() after the stream is complete. I have tried both without success.

Thank you. We’ll treat it as a high-priority bug and aim to roll out a fix as soon as the holidays are over.

Thanks, I appreciate the feedback and focus.

This is now fixed. The usage field should be populated on both the Interaction object and the interaction.complete event.
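With usage populated, the cost question from the top of the thread can be answered empirically per run. A rough sketch under stated assumptions: the `TokenUsage` shape and its field names are hypothetical (map them from whatever the usage field actually contains), and the prices passed in are placeholders to be replaced with the current values from the pricing page.

```typescript
// Hypothetical shape: map these from the fields the usage object exposes.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

// Prices in USD per one million tokens. Values are supplied by the caller;
// nothing here reflects real pricing.
interface PricePerMillion {
  input: number;
  output: number;
}

// Estimated cost of one research run, in USD.
function estimateCostUsd(usage: TokenUsage, price: PricePerMillion): number {
  const inputCost = (usage.inputTokens / 1e6) * price.input;
  const outputCost = (usage.outputTokens / 1e6) * price.output;
  return inputCost + outputCost;
}
```

Logging this per interaction for a handful of representative queries should give a usable cost range for 5-10 minute tasks until official guidance is documented.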