For those of you that have already looked at and voted for the video I’ve done about Vodo Drive (with a bit of time traveling insight), thank you. For those that haven’t - there’s still time to watch and vote here.
For those of you that want a bit of insight about how I built Vodo Drive, and haven’t seen these yet, my cohost and I have talked about it in two recent episodes of Two Voice Devs:
Episode 206 - Building Powerful AI Agents with LangGraph goes into a little more depth about the agentic loop that is at the core of the system. Built with LangChain.js/LangGraph.js, I show a bit of code about how the tools get called and how decisions are made inside Vodo Drive.
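For anyone who hasn't watched the episode, the general shape of an agentic loop can be sketched without any library. This is not the Vodo Drive code and does not use the LangGraph.js API. Every name below (`runAgent`, `Decide`, the `addRow` tool, the scripted "model") is a hypothetical stand-in, just to show how a loop alternates between deciding and calling tools:

```typescript
// Library-free sketch of an agentic loop. All names are hypothetical;
// this is not LangGraph.js and not the actual Vodo Drive implementation.
type ToolCall = { name: string; args: Record<string, string> };
type Decision =
  | { kind: "tool"; call: ToolCall }
  | { kind: "final"; answer: string };
type Tool = (args: Record<string, string>) => string;

// Stand-in for the model: picks the next step from the transcript so far.
type Decide = (transcript: string[]) => Decision;

function runAgent(
  question: string,
  decide: Decide,
  tools: Record<string, Tool>,
  maxSteps = 5,
): string {
  const transcript: string[] = [`user: ${question}`];
  for (let step = 0; step < maxSteps; step++) {
    const decision = decide(transcript);
    // The loop ends as soon as the "model" produces a final answer.
    if (decision.kind === "final") return decision.answer;
    // Otherwise run the requested tool and feed the result back in.
    const tool = tools[decision.call.name];
    const result = tool
      ? tool(decision.call.args)
      : `unknown tool: ${decision.call.name}`;
    transcript.push(`tool ${decision.call.name}: ${result}`);
  }
  return "stopped: step limit reached";
}

// Toy in-memory "spreadsheet" tool and a scripted decision function.
const rows: string[] = [];
const tools: Record<string, Tool> = {
  addRow: (args) => {
    rows.push(args["value"] ?? "");
    return `rows=${rows.length}`;
  },
};
const scripted: Decide = (transcript) =>
  transcript.length === 1
    ? { kind: "tool", call: { name: "addRow", args: { value: "milk" } } }
    : { kind: "final", answer: `Done (${transcript[transcript.length - 1]})` };

const answer = runAgent("Add milk to my list", scripted, tools);
```

In a real system the `decide` step would be a model call, and the step limit guards against a loop that never reaches a final answer.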
Happy to chat about this with folks and would love to hear your feedback.
I didn’t use Breadboard. In fact, I hadn’t ever heard of it before. But this looks interesting - thanks for pointing it out!
It sounds like it has many of the same goals that LangChain.js does, and I’ve contributed a lot to the LangChain.js library, so I had gone that route.
A few asides, from a quick read of it, and how it would have impacted some of the decisions I made for Vodo Drive:
It doesn’t look like Breadboard supports Vertex AI, but rather the AI Studio API for Gemini.
My choice of Vertex AI for Vodo Drive was deliberate:
API authorization with Firebase Cloud Functions is trivial. (No API Keys! Woo!)
Using Firebase Cloud Storage / Google Cloud Storage makes image processing with Gemini amazingly easy. (No File API! Woo!)
Dovetails well with Google Cloud TTS/STT.
Firebase makes tons of things easier.
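To illustrate the Cloud Storage point above: with Vertex AI, an image that already lives in a bucket can be referenced by its `gs://` URI in the request instead of being uploaded separately. The sketch below only builds a request body in the shape I understand the Vertex AI Gemini `generateContent` API to accept (`fileData` with `mimeType` and `fileUri`). The helper name, bucket, and object path are made up for illustration, and this is not taken from the Vodo Drive source:

```typescript
// Sketch of a Gemini request body that points at an image in Cloud Storage.
// Field names are meant to follow the Vertex AI generateContent request shape;
// imageQuestion, the bucket, and the path are hypothetical.
type Part =
  | { text: string }
  | { fileData: { mimeType: string; fileUri: string } };

interface GenerateContentRequest {
  contents: { role: "user"; parts: Part[] }[];
}

function imageQuestion(
  bucket: string,
  objectPath: string,
  mimeType: string,
  question: string,
): GenerateContentRequest {
  return {
    contents: [
      {
        role: "user",
        parts: [
          // The image is referenced by URI, not embedded or uploaded again.
          { fileData: { mimeType, fileUri: `gs://${bucket}/${objectPath}` } },
          { text: question },
        ],
      },
    ],
  };
}

const request = imageQuestion(
  "my-bucket",
  "uploads/receipt.png",
  "image/png",
  "What items are on this receipt?",
);
```

The calling service account still needs read access to the object, which is typically straightforward when the function and the bucket live in the same project.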
The “Google Drive” support is pretty bare-bones, and seems limited to listing and fetching the contents of a file.
What I hope is one of the notable features about Vodo Drive is that you’re editing a cloud-based spreadsheet live. So you can both add data and then ask questions about your spreadsheet dynamically.
Not dealing with API keys would be great; however, as an open source project I’m rolling with BYO API Key. If I switched over to Firebase I still would not be able to cover the costs, so this would mean BYO Firebase Project. That’s what I faced with the DIY GPS Tracker project. It’s possible to handle a secondary Firebase project (a different one than the primary), but that opens up another can of worms: figuring out which calls won’t work with that rarely used approach. See the Stack Overflow question “android - FirebaseMessaging.getInstance(firebaseApp) for secondary app supposed to be public but it's private?”.
Breadboard features definitely intersect with LangGraph and agentic systems. I’m glad that you at least know about it now. I also only heard of it from a Googler at Cloud Next, but I currently don’t work as much in the JavaScript space.
I totally get why you opted for Firebase and Vertex AI.
Pretty good job with the app and pushing some JavaScript library features ahead of the Python variant.
I also needed a multilingual embedding in my update to support the Urdu language. However, you cannot use it, but Gemini on Vertex AI already supports multiple languages, so you do not need any embedding. If you write text in any language, it will reply in that language. Simple.
Yes, for RAG you need embeddings. I have just started understanding these concepts. So far, I have checked the OpenAI libraries and Llama 3, where you need them if you are creating your own chat LLM for anything, but for Gemini I do not know. I used Vertex AI in my submission and did not need them.
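For context on why RAG needs embeddings: the retrieval step compares a vector for the question against vectors for the stored documents and hands the closest ones to the model. Here is a minimal sketch using cosine similarity. The three-dimensional vectors are toy values made up for illustration (a real multilingual embedding model would produce them), and `cosine` and `topK` are hypothetical helper names:

```typescript
// Toy sketch of the retrieval step in RAG. The embeddings are invented
// numbers, not the output of any real embedding model.
type Doc = { text: string; embedding: number[] };

// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the k documents whose embeddings are closest to the query vector.
function topK(query: number[], docs: Doc[], k: number): Doc[] {
  return [...docs]
    .sort((d1, d2) => cosine(query, d2.embedding) - cosine(query, d1.embedding))
    .slice(0, k);
}

const docs: Doc[] = [
  { text: "A", embedding: [1, 0, 0] },
  { text: "B", embedding: [0, 1, 0] },
  { text: "C", embedding: [0.7, 0.7, 0] },
];
const best = topK([1, 0.1, 0], docs, 2);
```

If the model only has to answer in the user's language, without looking anything up in your own documents, this step is not needed, which matches the experience described above.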