Hi,
Here are the results of a preliminary comparison of the different models our org uses, based on a productionized AI data service.
Data set used: 100 high-quality articles on different topics.
Prompt used: the same prompt for every model. Each AI model is asked to identify the popular personas among the readers of an article, with an estimate of the proportion of each persona group.
Evaluated by: GPT o1, with scores from 1 to 10.
Observations:
- The text-only Gemma3 1B is very fast on a MacBook M1 (32 GB of memory).
- On semantic reasoning, GPT-4o and o3-mini are very close to each other, and their results are better than those of the other models.
- Gemma3 1B scores very close to DeepSeek R1 but is much faster when running on the MacBook M1.
- Numerical computation (estimating the proportions) seems to be a weakness of Gemma3 1B.
If we exclude proportion accuracy, Gemma3 1B is better than DS R1.
This is a bit of a surprise to us. I wonder whether it would be easy for Gemma3 1B to quickly improve this capability.
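To look at the proportion weakness separately from persona quality, a simple sanity check on the estimated shares might help. The sketch below is only an illustration: the dict format, the function name `proportions_are_consistent`, the tolerance, and the example personas are all assumptions made up here, not part of our service or of any model's actual output.

```python
import math


def proportions_are_consistent(personas, tolerance=0.02):
    """Return True if every share is in [0, 1] and the shares sum to about 1.

    `personas` is assumed to map a persona name to its estimated share
    of readers, expressed as a fraction (0.25 means 25%).
    """
    shares = list(personas.values())
    if not shares:
        return False
    if any(share < 0 or share > 1 for share in shares):
        return False
    return math.isclose(sum(shares), 1.0, abs_tol=tolerance)


# Hypothetical model output, invented for illustration only.
example = {"students": 0.5, "engineers": 0.3, "managers": 0.2}
print(proportions_are_consistent(example))
```

A check like this only tells whether the numbers are internally consistent; it says nothing about whether the personas themselves are right, which is what the o1 scoring covers.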
Gemma3 1B is the first model with good performance that I can run on my laptop.
The 508 MB size leaves room for fine-tuning.
I'd like to thank the team for your outstanding work. I am looking forward to a more robust, but not bigger, text-only model.
Thank you very much!