How to use chat streaming?

I am using the following code for a chatbot to refine user prompts using the Gemini API. I want to use the chat streaming feature to enhance the user experience.

import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Configuration and Model setup (same as before)
generation_config = {
    "temperature": 1,
    "top_p": 0.95,
    "top_k": 40,
    "max_output_tokens": 8192,
    "response_mime_type": "text/plain",
}

model = genai.GenerativeModel(
    model_name="gemini-1.5-pro",
    generation_config=generation_config,
)

chat_session = model.start_chat(history=[])

# Initial template (only used for the first prompt)
initial_template = """I am providing you with a prompt idea and need your assistance in refining it into a more detailed and effective version. Please analyze the prompt following these steps: Evaluate, Rewrite, Refine. Do not make assumptions; ask clarifying questions if needed.

Prompt Idea:

{user_prompt}

Instructions:

Evaluate: Assess the prompt for clarity, purpose, and effectiveness. Identify key weaknesses or areas that need improvement.

Rewrite: Improve clarity and effectiveness, ensuring the prompt aligns with its intended goals.

Refine: Make additional tweaks based on the identified weaknesses and areas for improvement.

Final Output:

Optimized Prompt: Present the final optimized prompt using the format: Refined: [Improved Prompt]

Summary of Improvements: Provide a brief bullet point summary of the enhancements made, including the rationale behind each change.
"""

first_prompt = True

while True:
    user_input = input("Enter your prompt idea (or type 'exit' to quit): ")
    if user_input.lower() == "exit":
        break

    if first_prompt:
        formatted_string = initial_template.format(user_prompt=user_input)
        first_prompt = False  # Ensure the template is only used once
    else:
        formatted_string = user_input  # Send subsequent inputs directly

    response = chat_session.send_message(formatted_string, stream=True)
    for chunk in response:
        print(chunk.text)

Hi @Devansh_Awatramani

It seems you’re already using chat streaming by setting stream = True. Could you describe the issue you’re encountering? Take a look at this documentation for reference: link.

Thanks


I am using the code provided above, but all the text is printed at once instead of being streamed.

You can add the print statement below to visually separate each chunk's text from the next:

for chunk in response:
    print(chunk.text)
    print("_" * 80)
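If the whole reply still appears in one burst, it may be because `print()` adds a newline after every chunk and terminal output is buffered, or because a short answer simply arrives in a single chunk. Here is a minimal sketch of an incremental printing pattern; the `print_stream` helper and the simulated chunk list are hypothetical stand-ins, not part of the Gemini API:

```python
def print_stream(chunks):
    """Print streamed text pieces incrementally on one line.

    end="" avoids a newline after every chunk, and flush=True pushes
    each piece to the terminal immediately instead of buffering it.
    """
    parts = []
    for text in chunks:
        print(text, end="", flush=True)
        parts.append(text)
    print()  # single newline once the stream is done
    return "".join(parts)

# Simulated chunks; with the real API you would pass the streamed
# response and read chunk.text for each chunk, e.g.:
#   print_stream(c.text for c in chat_session.send_message(msg, stream=True))
print_stream(["Streaming ", "output ", "appears ", "incrementally."])
```

With this pattern the visible effect of streaming depends on how many chunks the model actually returns, not on how the loop prints them.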

Hi @Devansh_Awatramani

The code provided in the documentation is working fine. You can find the gist here

Thanks