Hello,
I am using EmbeddingGemma for knowledge indexing in Dify, and the indexing process is taking an extremely long time.
I would appreciate any advice on possible causes and how to improve performance.
Here is my setup and situation:
Dify is deployed using Docker (self-hosted)
Embedding model: EmbeddingGemma
Use case: Knowledge indexing (document ingestion / vectorization)
The system works correctly, but indexing is much slower than expected
Even relatively small documents take a long time to finish indexing
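To help isolate whether the bottleneck is Dify itself or the raw embedding calls, here is a minimal timing harness I sketched; `fake_embed` is just a placeholder, and a real test would call whatever endpoint Dify uses for EmbeddingGemma:

```python
import time

def benchmark(embed_fn, texts, batch_size):
    """Embed texts in batches through embed_fn and return docs/sec."""
    start = time.perf_counter()
    for i in range(0, len(texts), batch_size):
        embed_fn(texts[i:i + batch_size])
    elapsed = time.perf_counter() - start
    return len(texts) / elapsed

def fake_embed(batch):
    # Placeholder standing in for the real embedding call;
    # 768 matches EmbeddingGemma's default output dimension.
    return [[0.0] * 768 for _ in batch]

docs = [f"sample chunk {i}" for i in range(100)]
rate = benchmark(fake_embed, docs, batch_size=32)
print(f"{rate:.0f} docs/sec")
```

Running this against the actual endpoint at a few different batch sizes should show whether throughput is the problem or whether the time is going somewhere else in the pipeline.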
Things I am wondering about:
Is EmbeddingGemma known to be slower for batch or large-scale embedding tasks?
Are there recommended Docker resource settings (CPU, memory, threads) for EmbeddingGemma?
Does EmbeddingGemma run fully locally, or could there be hidden bottlenecks (e.g., model loading, single-thread execution)?
Are there best practices for chunk size, batch size, or parallelism when using EmbeddingGemma with Dify?
Is this behavior expected compared to other embedding models?
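For context on the Docker question, this is roughly the kind of resource override I could apply in a compose file, though I don't know what values make sense for EmbeddingGemma; the service name, limits, and environment variable below are placeholders, not my actual config:

```yaml
# Hypothetical docker-compose.override.yml excerpt (values are guesses)
services:
  embedding:
    deploy:
      resources:
        limits:
          cpus: "8"
          memory: 16g
    environment:
      OMP_NUM_THREADS: "8"  # assumption: CPU inference may be thread-limited
```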
Any guidance, documentation references, or tuning tips would be greatly appreciated.
Thank you in advance.