Help converting tflite models with mediapipe

Hi,

I am totally new to AI and ML SDK frameworks, so I hope I can get some guidance here.

The purpose of converting different models to tflite is to run a speech-to-text app on-device on Android (and later also iOS).

I have followed the official MediaPipe documentation and some other websites with step-by-step instructions, but I can't get the model to convert to tflite with the recommended settings.

https://ai.google.dev/edge/mediapipe/solutions/guide

https://medium.com/@areebbashir13/running-a-llm-on-device-using-googles-mediapipe-c48c5ad816c6

The conversion script is taken directly from the tutorials:

from mediapipe.tasks.python.genai import converter

def gemma_convert_config(backend):
    input_ckpt = '/home/me/gemma-2b-it/'
    vocab_model_file = '/home/me/gemma-2b-it/'
    output_dir = '/home/me/gemma-2b-it/intermediate/'
    output_tflite_file = f'/home/me/gemma-2b-it-{backend}.tflite'
    return converter.ConversionConfig(
        input_ckpt=input_ckpt,
        ckpt_format='safetensors',
        model_type='GEMMA_2B',
        backend=backend,
        output_dir=output_dir,
        combine_file_only=False,
        vocab_model_file=vocab_model_file,
        output_tflite_file=output_tflite_file,
    )

config = gemma_convert_config('cpu')
converter.convert_checkpoint(config)
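To rule out a missing file, something like the following can verify the checkpoint directory before converting. This is only a rough sketch; the file names are my assumption based on the Hugging Face gemma-2b-it layout, so adjust them to whatever your download actually contains:

import os

ckpt_dir = '/home/me/gemma-2b-it/'

# File names assumed from the Hugging Face gemma-2b-it layout;
# adjust to match your actual download.
expected = [
    'model-00001-of-00002.safetensors',
    'model-00002-of-00002.safetensors',
    'tokenizer.model',
]

for name in expected:
    path = os.path.join(ckpt_dir, name)
    print(f"{name}: {'found' if os.path.exists(path) else 'MISSING'}")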

The conversion with the cpu backend has always failed, throwing a runtime error:

python3.12/site-packages/mediapipe/tasks/python/genai/converter/llm_converter.py", line 220, in combined_weight_bins_to_tflite
    model_ckpt_util.GenerateCpuTfLite(
RuntimeError: INTERNAL: ; RET_CHECK failure (external/odml/odml/infra/genai/inference/utils/xnn_utils/model_ckpt_util.cc:116) tensor

Every option I have tried on different flavors of Ubuntu (WSL2 on Windows 10, a VM with Ubuntu 24) caused the same runtime error. I was able to convert with the gpu backend, and that model actually loads in the Android app.
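Since the RET_CHECK message mentions a tensor, one idea is to dump the tensor names and shapes in the safetensors shards to confirm the checkpoint has the layout the converter expects. A rough sketch using the safetensors library (the path is from my script above; I am only guessing this is where the converter trips):

import glob
from safetensors import safe_open

# List every tensor name and shape in each safetensors shard.
for shard in sorted(glob.glob('/home/me/gemma-2b-it/*.safetensors')):
    with safe_open(shard, framework='np') as f:
        for name in f.keys():
            print(shard, name, f.get_slice(name).get_shape())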

Is there anything I am missing to get the cpu backend to work? Will the cpu backend conversion really show obvious performance advantages (as the documentation suggests)?

Any help is welcome.

Welcome, @The_Byte_Cave! I’ve moved your post to the Google AI Edge forum. Hopefully folks here can weigh in.

Hi @The_Byte_Cave, where is that code snippet coming from? I'm asking because I don't see it in our official code, but I may have missed it. Please ensure you are following these specific instructions: LLM Inference guide for Android | Google AI Edge | Google AI for Developers. The Medium resource you linked is not an official guide, so we cannot support code provided by that link. If you are following the link here and are still running into issues, please let us know and we will investigate. Thanks.

Edit: For use with the MediaPipe Tasks interface, please use/download the .bin files found in the link above (from Kaggle). Perhaps this one: Google | Gemma | Kaggle. This is already pre-converted, so you can just skip the conversion step.
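If it helps, the Kaggle files can also be fetched programmatically with kagglehub. A quick sketch only; the variant handle below is an example, so check the Kaggle model page for the exact variant names currently available:

import kagglehub

# Download a pre-converted Gemma 2B model from Kaggle.
# The handle is an example variant name (assumption); check the
# Kaggle page for the variants that actually exist.
path = kagglehub.model_download('google/gemma/tfLite/gemma-2b-it-cpu-int4')
print('Downloaded to:', path)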