Optimizing Gemini Agent Loops with Deterministic Token Compression (Skillware)

Hi everyone,

I’m working on Skillware, an open-source framework designed to standardize how we package and deploy agentic capabilities across models.

One of the challenges with long-running Gemini agent loops is managing context-window growth while keeping token costs down. We just merged a new skill: Prompt Token Rewriter (optimization/prompt_rewriter).

This skill acts as a deterministic heuristic middleware that compresses bloated prompts and history by 50-80% before they ever hit the Gemini API.

Because it’s deterministic (uses regex-based heuristics), it doesn’t add another stochastic LLM call or extra billing to your loop—it just strips the conversational “slop” and redundant structures so Gemini can focus on the signal.
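To make the idea concrete, here's a minimal sketch of what deterministic, regex-based prompt compression can look like. This is illustrative only, not the actual prompt_rewriter implementation — the filler patterns and the `compress_prompt` function are hypothetical:

```python
import re

# Hypothetical filler patterns; the real skill's heuristics live in the repo.
FILLER_PATTERNS = [
    r"\b(?:please|kindly|basically|actually|just|really|very)\b",
    r"\b(?:I think|I believe|it seems that|as you can see)\b",
]

def compress_prompt(text: str) -> str:
    """Strip conversational filler, then collapse the whitespace left behind."""
    for pattern in FILLER_PATTERNS:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    text = re.sub(r"[ \t]{2,}", " ", text)   # collapse runs of spaces/tabs
    text = re.sub(r"\n{3,}", "\n\n", text)   # collapse runs of blank lines
    return text.strip()

original = "Could you please just summarize this very long report?"
compressed = compress_prompt(original)
# compressed == "Could you summarize this long report?"
```

Because it's pure regex, the same input always produces the same output — no extra model call, no extra latency beyond a few string passes.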

We’ve built native support for the Gemini SDK with a to_gemini_tool adapter, making it easy to plug these modular skills directly into your GenerativeModel tools.
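As a rough sketch of what an adapter like to_gemini_tool does (this is a hypothetical reconstruction, not the actual Skillware code), it can introspect a skill's signature and docstring and emit a function-declaration dict in the shape Gemini's function-calling API accepts:

```python
import inspect
from typing import Callable

# Minimal mapping from Python annotations to JSON-schema type names.
_TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def to_gemini_tool(skill: Callable) -> dict:
    """Build a function declaration from a skill's signature and docstring.

    Hypothetical sketch; see the Skillware repo for the real adapter.
    """
    sig = inspect.signature(skill)
    properties = {
        name: {"type": _TYPE_MAP.get(param.annotation, "string")}
        for name, param in sig.parameters.items()
    }
    return {
        "name": skill.__name__,
        "description": (skill.__doc__ or "").strip(),
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": [n for n, p in sig.parameters.items()
                         if p.default is inspect.Parameter.empty],
        },
    }

def rewrite_prompt(prompt: str, max_tokens: int = 2048) -> str:
    """Deterministically compress a prompt before it reaches the model."""
    return prompt  # placeholder skill body

declaration = to_gemini_tool(rewrite_prompt)
```

The resulting declaration can then be passed via the `tools` argument when constructing a GenerativeModel, so the model can call the skill like any other function tool.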

Registry & Integration:
We’re building a community-driven “App Store” for Agentic Skills (Logic + Cognition + Governance). If you’ve been building custom tools for Gemini agents or want to see how we standardize skill delivery, we’d love your feedback or a PR!

Check out the repo and the Gemini implementation guide:

[Repo Link] GitHub - ARPAHLS/skillware: A Python framework for modular, self-contained skill management for machines.