Here is a follow-up in the same format as the Gemma-3 post, detailing the journey of setting up and mastering the Gemini CLI.
Title: Gemini CLI Showcase: A Journey Through API Interaction on Android (Termux)
Posted by: Clintin (as documented by Gemini)
Tags: gemini, cli, android, termux, api, showcase, python
Introduction & Goal
Hello everyone,
Following up on my previous project showcasing Gemma-3 running locally with gemma.cpp, I wanted to explore the other side of the coin: interacting with the Gemini family of models via their API, but with the same goal of maintaining a pure command-line interface on an Android device using Termux.
This journey was about understanding the API-based workflow, its specific challenges, and its powerful capabilities, contrasting them with the local inference method. The goal was to build a stable, fast, and flexible Gemini CLI, capable of both single-shot commands and continuous, contextual chat.
The Chosen Toolkit
- Platform: Android running Termux (emulating a Linux environment)
- Primary Tool: The llm CLI by Simon Willison, an incredibly versatile tool for interacting with large language models.
- Language: Python, with pip for package management.
- Authentication: A Google AI Studio Gemini API key.
The Step-by-Step Journey & Hurdles
This wasn’t a straightforward path; we hit several common but crucial roadblocks that are important learning experiences.
Phase 1: Initial Environment Setup
The first step was to prepare the Termux environment with the foundational packages.
pkg update && pkg upgrade
pkg install python git
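Before moving on, a quick optional sanity check (not part of the original steps, but harmless) confirms the base tools actually landed on the PATH:
# Verify the interpreter, package manager, and git are ready
python --version
pip --version
git --version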
Phase 2: The First Hurdle - Compilation Failure
We attempted the initial installation of the llm tool and its Gemini plugin.
# This command failed at first!
pip install llm llm-gemini
The installation failed with an error: legacy-install-failure on a package named greenlet. This is a critical takeaway for any Termux user. It happens because pip needs to compile the package from source but lacks the necessary tools.
Solution: Install the required compilers and development headers with pkg.
pkg install clang rust
After this, the pip install command succeeded.
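To confirm everything is wired up after a successful install, something like the following works (llm plugins lists whatever plugins are registered):
# Confirm the CLI installed and the Gemini plugin is registered
llm --version
llm plugins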
Phase 3: The Second Hurdle - API and Model Errors
With the tool installed, we tried our first prompt.
# This command also failed!
llm -m gemini-pro "Hello Gemini"
This produced a key error: Error: models/gemini-pro is not found for API version v1beta. We also ran into Unknown model errors when trying different aliases.
Analysis: This indicated a version mismatch. The llm-gemini plugin we installed was likely an older version that was trying to communicate with an outdated Google API endpoint (v1beta).
Solution: Force an uninstall and a fresh reinstall to ensure we fetched the absolute latest version of the plugin.
pip uninstall llm-gemini
pip install llm-gemini
This is a robust way to overcome package cache and versioning issues.
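As an aside, the same effect can usually be achieved in a single step with pip's own flags (an alternative sketch, not the commands we actually ran):
# Bypass the local wheel cache and force the newest release in one step
pip install --upgrade --no-cache-dir llm-gemini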
Phase 4: The Breakthrough & Success!
After the clean re-installation and correctly setting the API key, we re-ran our prompts. The key was discovering the correct model alias available to us. By running llm models, we saw that while gemini-pro had issues, the gemma models (served via the same Gemini API) were available.
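For completeness, here is roughly how the key setup and model discovery looked. With a current llm-gemini, the key is stored under the name gemini (the plugin can also read it from an environment variable):
# Store the Google AI Studio key (llm-gemini expects a key named "gemini")
llm keys set gemini
# List every model alias the installed plugins expose
llm models
# Or narrow the list to the Gemma family
llm models | grep -i gemma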
The final, successful command for interactive chat was:
llm -m gemma-3e4b-it -c
This provided a stable, fast, and responsive CLI chat experience, achieving the project’s primary goal.
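Worth noting: llm also ships a dedicated interactive mode, llm chat, which keeps a single session open instead of continuing the most recent logged conversation with -c on each invocation. Either approach gives a contextual back-and-forth:
# Start a persistent interactive session (type "exit" or "quit" to leave)
llm chat -m gemma-3e4b-it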
Advanced Concepts Explored
Beyond the technical setup, the project evolved into a deep dive into the philosophy and architecture of API vs. Local interaction.
- Offline vs. Online: We confirmed this CLI setup is strictly online. The llm tool is a client that sends a request to Google’s servers. True offline use requires a different setup (like the llama.cpp one).
- Model Personality (Prompt Engineering): We discovered the -s flag to use a system prompt. This allows for dynamic personality and behavior changes without needing a new model.
  # Example: Talk to a Pirate
  llm -s "You are a salty pirate." -m gemma-3e4b-it "How do I make coffee?"
- The Nature of “Memory”: We confirmed the model itself is stateless. “Memory” in a chat is created by the llm tool, which re-sends the entire conversation history (the context window) with every new turn; a small sketch demonstrating this follows the list.
- API Key vs. AI Studio: We clarified that the API key is purely for authentication. Conversations in the CLI are completely separate from chats happening in the AI Studio web interface; they do not share history or context.
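As promised above, a sketch that makes the statelessness visible. llm records every exchange in a local SQLite database, and -c works by replaying that logged history into the next request:
# Inspect the most recent logged exchanges (llm keeps them in SQLite)
llm logs list -n 2
# Continue the latest conversation; the prior turns are re-sent as context
llm -c "Summarise what we have discussed so far."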
Conclusion
This project successfully replicates the command-line interactivity of my local Gemma-3 setup, this time using the Gemini API. The key learning is understanding the significant differences in the workflow. While the API method is dependent on an internet connection, it grants access to far more powerful models and offloads the heavy processing from the local device.
This CLI now serves as a powerful, flexible tool for rapid interaction with the Gemini family of models, directly from the comfort of a Termux terminal.