Experimenting with Gemini 1.5 Pro for Internal Chatbot – Insights & Questions

Hi everyone,

I’m a software engineer working in internal automation at a mid-sized tech company in Vietnam. Recently, I’ve been experimenting with the Gemini 1.5 Pro API to build an internal chatbot that helps employees with document lookup, company policy navigation, task automation (like drafting emails or generating brief reports), and more.

Here are a few things I’ve been impressed with so far:

  • The long context window is a game-changer – I can feed in long company SOPs (hundreds of pages) without needing to chunk them.
  • Function calling works reliably – I’ve integrated it with Jira, Google Calendar, and an internal CRM system.
  • The model’s performance in Vietnamese is surprisingly good, especially when prompts are well-crafted.
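To make the function-calling point concrete, here is a minimal sketch of the dispatch side. It assumes the model's function call has already been converted to a plain dict with `name` and `args` keys. The handler names (`create_jira_ticket`, `find_free_slot`) and their return values are hypothetical stand-ins, not real Jira or Calendar client calls.

```python
from typing import Any, Callable, Dict


# Hypothetical handlers. Real ones would call Jira, Google Calendar, or the CRM.
def create_jira_ticket(summary: str, project: str = "OPS") -> Dict[str, Any]:
    return {"status": "created", "project": project, "summary": summary}


def find_free_slot(attendee: str, duration_minutes: int = 30) -> Dict[str, Any]:
    return {"status": "ok", "attendee": attendee, "duration_minutes": duration_minutes}


# Map the function names declared to the model onto local Python callables.
HANDLERS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "create_jira_ticket": create_jira_ticket,
    "find_free_slot": find_free_slot,
}


def dispatch(call: Dict[str, Any]) -> Dict[str, Any]:
    """Run the handler named in a function call and return a JSON-friendly result."""
    name = call.get("name")
    args = call.get("args") or {}
    handler = HANDLERS.get(name)
    if handler is None:
        # Report the problem back to the model instead of raising.
        return {"error": f"unknown function: {name}"}
    try:
        return handler(**args)
    except TypeError as exc:
        # Raised when the model supplies missing or unexpected arguments.
        return {"error": f"bad arguments for {name}: {exc}"}
```

Returning errors as data rather than raising lets the error be sent back to the model as the function result, so it can retry with corrected arguments.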

But I’ve also run into a few challenges:

  • When processing PDF documents with irregular formatting or lots of images, the model sometimes misses key information.
  • Managing conversation flow is tricky when users jump between unrelated topics.
  • Some responses can feel too generic or lack domain-specific depth, especially in specialized internal workflows.
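For the topic-jumping problem, the kind of handling in question could be sketched as follows. This is only an illustration under stated assumptions: word overlap (Jaccard similarity) is a crude stand-in for an embedding-based signal, and the `TopicTracker` class, the 0.1 threshold, and the length filter are all hypothetical choices.

```python
import re
from typing import List, Set


def tokens(text: str) -> Set[str]:
    # Lowercased words longer than 3 characters; a rough filter for filler words.
    return {w for w in re.findall(r"\w+", text.lower()) if len(w) > 3}


def jaccard(a: Set[str], b: Set[str]) -> float:
    # Share of distinct words that the two sets have in common.
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


class TopicTracker:
    """Keeps the turns of the current topic and drops them when the topic shifts."""

    def __init__(self, threshold: float = 0.1) -> None:
        self.threshold = threshold
        self.history: List[str] = []

    def add_turn(self, user_message: str) -> bool:
        """Store the message; return True if the earlier history was discarded."""
        context = tokens(" ".join(self.history))
        shifted = bool(self.history) and jaccard(tokens(user_message), context) < self.threshold
        if shifted:
            # Start a fresh context so unrelated turns are not sent to the model.
            self.history = []
        self.history.append(user_message)
        return shifted
```

The length filter is tuned for English. Vietnamese words are often short syllables, so an embedding similarity would likely be a better signal there.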

I’d love to get input from the community:

  1. Any tips for extracting more accurate information from unstructured or messy PDFs? What’s the current best practice for RAG or embedding strategies in this use case?
  2. For internal-facing assistants, how are you handling conversational flow control? Any frameworks or design patterns you recommend?
  3. Has anyone done a comparative evaluation of Gemini Pro against Claude or GPT-4o specifically in the enterprise chatbot context?
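To make question 1 more concrete, one possible triage step is sketched below. It assumes the text of each page has already been extracted by some PDF library (not shown) and flags pages with very little text, which are likely scans or image-heavy, so they can go through a different path such as OCR or being sent to the model as images. The function name and the 200-character threshold are hypothetical.

```python
from typing import Dict, List


def flag_sparse_pages(page_texts: List[str], min_chars: int = 200) -> Dict[str, List[int]]:
    """Split 1-based page numbers into text-rich pages and pages needing a fallback."""
    text_pages: List[int] = []
    fallback_pages: List[int] = []
    for page_number, text in enumerate(page_texts, start=1):
        # Count only non-whitespace characters so blank layout padding is ignored.
        visible = "".join(text.split())
        if len(visible) >= min_chars:
            text_pages.append(page_number)
        else:
            fallback_pages.append(page_number)
    return {"text": text_pages, "fallback": fallback_pages}
```

For example, `flag_sparse_pages(["a" * 300, "  \n ", "short caption"])` puts page 1 under `"text"` and pages 2 and 3 under `"fallback"`.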

Happy to share a sanitized version of the codebase if there’s interest.

Looking forward to learning from your experiences!

Thanks in advance!

Hello,

Welcome to the Forum!

You may want to try our new models, such as Gemini 2.5 Pro, which offer improved performance in document understanding and deliver more accurate responses overall.