AI Edge Torch API-converted Gemma 2B inference via MediaPipe on Android

Hi,

We have followed the steps to convert Gemma 2B into TFLite format. The conversion succeeds (including incorporating tokenizer.model), and we can also run inference through MediaPipe on an Android phone, but it outputs garbled characters.
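For reference, this is roughly the conversion path we followed. The module paths and helper names below come from the ai-edge-torch generative examples and may differ between releases, so treat this as a sketch rather than our exact script:

```python
import torch
import ai_edge_torch
# Gemma example re-authoring helper from the ai-edge-torch generative
# examples; the exact module path can vary between releases.
from ai_edge_torch.generative.examples.gemma import gemma

# Build the example PyTorch Gemma 2B model from a local checkpoint
# (the checkpoint path is illustrative).
pytorch_model = gemma.build_2b_model("/path/to/gemma-2b").eval()

# Sample prefill inputs (token IDs plus positions), following the shapes
# used in the example convert scripts; dtypes are assumptions.
prefill_seq_len = 512
tokens = torch.full((1, prefill_seq_len), 0, dtype=torch.long)
input_pos = torch.arange(0, prefill_seq_len)

# Convert to TFLite and export the flatbuffer.
edge_model = ai_edge_torch.convert(pytorch_model, (tokens, input_pos))
edge_model.export("gemma_2b.tflite")
```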

This seems to be an issue in the MediaPipe SDK's handling of such AI Edge Torch-converted models, since the same TFLite model produces normal output on our server. We get the same garbled result with Phi-3-mini.

(screenshot of the garbled output)
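In case it helps narrow things down, this is roughly how we incorporated tokenizer.model into the bundle for the MediaPipe LLM Inference API. The API is from mediapipe.tasks.python.genai; the token strings and the bytes-to-unicode flag below are the values we believe apply to Gemma, so treat them as assumptions:

```python
from mediapipe.tasks.python.genai import bundler

# Package the converted TFLite model and tokenizer.model into a .task
# bundle consumable by the MediaPipe LLM Inference API on Android.
config = bundler.BundleConfig(
    tflite_model="gemma_2b.tflite",
    tokenizer_model="/path/to/tokenizer.model",
    start_token="<bos>",          # assumed Gemma start token
    stop_tokens=["<eos>"],        # assumed Gemma stop token
    output_filename="gemma_2b.task",
    enable_bytes_to_unicode_mapping=False,  # assumed False for Gemma
)
bundler.create_bundle(config)
```

If the garbling happens at detokenization, the enable_bytes_to_unicode_mapping setting seems like a plausible place to look, though we have not confirmed this.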

Additionally, the AI Edge Torch API supports quantization to int8; are there any plans to extend this to support int4 quantization in the near future?
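For context, this is how we apply int8 quantization today; the recipe name follows the ai-edge-torch generative quantization docs and may vary by version:

```python
import ai_edge_torch
from ai_edge_torch.generative.quantize import quant_recipes

# pytorch_model, tokens, and input_pos are built as in the conversion
# sketch earlier in this post.
quant_config = quant_recipes.full_int8_dynamic_recipe()
edge_model = ai_edge_torch.convert(
    pytorch_model, (tokens, input_pos), quant_config=quant_config
)
edge_model.export("gemma_2b_int8.tflite")
```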

Thanks
