gemini-2.5-pro, despite being a superb coder, cannot help with this, since the new TS/JS SDK came out after its training cutoff.
This is possible with the OpenAI/Anthropic APIs:
#!/usr/bin/env -S npm run tsn -T
import Anthropic from '@anthropic-ai/sdk';

// gets the API key from the ANTHROPIC_API_KEY environment variable
const client = new Anthropic();

async function main() {
  const stream = client.messages
    .stream({
      messages: [
        {
          role: 'user',
          content: `Hey Claude! How can I recursively list all files in a directory in Rust?`,
        },
      ],
      model: 'claude-3-5-sonnet-latest',
      max_tokens: 1024,
    })
    // Once a content block is fully streamed, this event will fire
    .on('contentBlock', (content) => console.log('contentBlock', content));

  // wait for the fully aggregated final message
  const message = await stream.finalMessage();
  console.log('finalMessage', message);
}

main();
#!/usr/bin/env -S npm run tsn -T
import OpenAI from 'openai';

const openai = new OpenAI();

async function main() {
  const runner = openai.beta.chat.completions
    .stream({
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: 'Say this is a test' }],
    })
    .on('message', (msg) => console.log(msg))
    .on('content', (diff) => process.stdout.write(diff));

  for await (const chunk of runner) {
    console.log('chunk', chunk);
  }

  // wait for the fully aggregated final completion
  const result = await runner.finalChatCompletion();
  console.log('finalChatCompletion', result);
}

main();
And it's even possible with Gemini models via the OpenAI-compatible API.
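A minimal sketch of that route, reusing the same openai streaming helper as above (the endpoint URL and model name here are my reading of the compatibility docs, so treat them as assumptions):

#!/usr/bin/env -S npm run tsn -T
import OpenAI from 'openai';

// point the OpenAI SDK at Gemini's OpenAI-compatible endpoint
const openai = new OpenAI({
  apiKey: process.env.GEMINI_API_KEY,
  baseURL: 'https://generativelanguage.googleapis.com/v1beta/openai/',
});

async function main() {
  const runner = openai.beta.chat.completions
    .stream({
      model: 'gemini-2.0-flash', // assumed model name
      messages: [{ role: 'user', content: 'Say this is a test' }],
    })
    .on('content', (diff) => process.stdout.write(diff));

  // same waitForFinalResponseObject-style call as with the OpenAI models
  const result = await runner.finalChatCompletion();
  console.log('finalChatCompletion', result);
}

main();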
Is it even possible with the generativeai API? Judging from the SDK's own tests, yes:
// excerpt from the @google/generative-ai streaming tests; the opening lines
// and the test name are reconstructed from context (assumes the mocha/chai
// harness those tests use)
it("stream true, aggregated response", async () => {
  const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY || "");
  const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash-latest" });
  const result = await model.generateContentStream({
    contents: [
      {
        role: "user",
        parts: [
          {
            text: "Count from 1 to 10, put each number into square brackets and on a separate line",
          },
        ],
      },
    ],
  });
  // result.response resolves to the aggregated final response once the stream ends
  const finalResponse = await result.response;
  expect(finalResponse.candidates.length).to.be.equal(1);
  const text = finalResponse.text();
  expect(text).to.include("[1]");
  expect(text).to.include("[10]");
});
(with full respect, and recognizing that the new API is still being built in public)
Hi @Sirui_Lu,
Apologies for the late response. If you want to generate a non-streaming response, you can use the ai.models.generateContent API. You can check the following example. Thank you!
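A minimal sketch of that call (assuming the new @google/genai SDK; the model name is an assumption):

import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

async function main() {
  // non-streaming: resolves directly with the complete response
  const response = await ai.models.generateContent({
    model: 'gemini-2.0-flash', // assumed model name
    contents: 'Say this is a test',
  });
  console.log(response.text);
}

main();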
Yes, but somehow that is less stable and times out more often than the streaming version…
For now I've aggregated the response object myself, which is messy.
Both the OpenAI and Anthropic SDKs have this waitForFinalResponseObject pattern, as shown below.
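For reference, my manual aggregation looks roughly like this (a sketch against the new @google/genai SDK; the model name and the chunk.text accessor are assumptions on my part):

import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

async function main() {
  const stream = await ai.models.generateContentStream({
    model: 'gemini-2.0-flash', // assumed model name
    contents: 'Count from 1 to 10',
  });

  // no finalMessage()/finalChatCompletion() equivalent, so stitch the chunks together by hand
  let text = '';
  for await (const chunk of stream) {
    text += chunk.text ?? '';
  }
  console.log(text);
}

main();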