I believe there is a bug in gemini-embedding-2-preview where task_type has no effect on the returned embedding.
I tested these task types:
- RETRIEVAL_DOCUMENT
- RETRIEVAL_QUERY
- CLASSIFICATION
- CLUSTERING
For the same input, the returned vectors are bit-for-bit identical across all of them.
I verified this in two ways:
- In a local app storing separate vectors per task type in a vector database
- By directly calling the Gemini embeddings API with the same input and changing only task_type
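As a sanity check that the direct API calls differ only in the task type, the request bodies can be compared before sending. A sketch using the REST field names `taskType` and `outputDimensionality` (no request is actually issued here; the input text is the one from this report):

```python
def request_body(task_type: str) -> dict:
    # Body for a REST embedContent call; field names follow the public
    # Gemini REST API (taskType, outputDimensionality).
    return {
        "content": {"parts": [{"text": "silver lever handle on a round rose"}]},
        "taskType": task_type,
        "outputDimensionality": 3072,
    }

doc_body = request_body("RETRIEVAL_DOCUMENT")
query_body = request_body("RETRIEVAL_QUERY")

# The only top-level key that differs between the two requests is taskType.
differing = {k for k in doc_body if doc_body[k] != query_body[k]}
print(differing)  # → {'taskType'}
```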
Observed behavior
For both text input and image input, the embeddings are numerically identical across all tested task types.
Pairwise comparison results were:
- max_abs_diff = 0.0
- cosine similarity = 1.0
Because of that, retrieval rankings and scores are also identical across all task types.
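The knock-on effect on retrieval is mechanical: if the query embedding is bit-identical across task types, every score, and therefore every ranking, must match. A synthetic numpy sketch (random stand-in vectors, not API output):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: 5 unit-norm "document" embeddings of dimension 8.
docs = rng.normal(size=(5, 8))
docs /= np.linalg.norm(docs, axis=1, keepdims=True)

# Two "query" embeddings that are bit-identical, as observed across task types.
query_a = rng.normal(size=8)
query_b = query_a.copy()

scores_a = docs @ query_a
scores_b = docs @ query_b

# Identical query vectors necessarily give identical scores and rankings.
assert np.array_equal(scores_a, scores_b)
assert np.array_equal(np.argsort(-scores_a), np.argsort(-scores_b))
```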
Expected behavior
I expected task_type to affect the embedding output, especially for comparisons like:
- RETRIEVAL_DOCUMENT vs RETRIEVAL_QUERY
- RETRIEVAL_QUERY vs CLASSIFICATION
- RETRIEVAL_QUERY vs CLUSTERING
If task_type is intentionally ignored for gemini-embedding-2-preview, that should be documented clearly. Otherwise this appears to be a backend bug.
Environment
- Model: gemini-embedding-2-preview
- API: Gemini API
- SDK: python-genai
- Output dimensionality: 3072
- Date observed: 2026-03-18 UTC
Example text input
silver lever handle on a round rose
Example image input
A generic JPEG product image loaded from a public URL and embedded with the same bytes for each task_type test.
Minimal reproduction
```python
import asyncio
import os

import httpx
import numpy as np
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

TASKS = [
    "RETRIEVAL_DOCUMENT",
    "RETRIEVAL_QUERY",
    "CLASSIFICATION",
    "CLUSTERING",
]


def compare_vectors(label, vectors):
    print(label)
    for i, a in enumerate(TASKS):
        for b in TASKS[i + 1:]:
            va = vectors[a]
            vb = vectors[b]
            max_abs_diff = float(np.max(np.abs(va - vb)))
            cosine = float(np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb)))
            print(a, "vs", b, "max_abs_diff=", max_abs_diff, "cosine=", cosine)


async def main():
    text = "silver lever handle on a round rose"
    text_vectors = {}
    for task in TASKS:
        result = await client.aio.models.embed_content(
            model="gemini-embedding-2-preview",
            contents=text,
            config=types.EmbedContentConfig(
                task_type=task,
                output_dimensionality=3072,
            ),
        )
        text_vectors[task] = np.array(result.embeddings[0].values, dtype=np.float64)
    compare_vectors("TEXT", text_vectors)

    image_url = "https://example.com/example.jpg"
    async with httpx.AsyncClient(timeout=30.0) as http_client:
        response = await http_client.get(image_url)
        response.raise_for_status()
        image_bytes = response.content

    image_vectors = {}
    for task in TASKS:
        result = await client.aio.models.embed_content(
            model="gemini-embedding-2-preview",
            contents=[
                types.Part.from_bytes(
                    data=image_bytes,
                    mime_type="image/jpeg",
                )
            ],
            config=types.EmbedContentConfig(
                task_type=task,
                output_dimensionality=3072,
            ),
        )
        image_vectors[task] = np.array(result.embeddings[0].values, dtype=np.float64)
    compare_vectors("IMAGE", image_vectors)


asyncio.run(main())
```
Actual result
For both text and image, every pair of task types produced identical vectors:
- RETRIEVAL_DOCUMENT vs RETRIEVAL_QUERY
- RETRIEVAL_DOCUMENT vs CLASSIFICATION
- RETRIEVAL_DOCUMENT vs CLUSTERING
- RETRIEVAL_QUERY vs CLASSIFICATION
- RETRIEVAL_QUERY vs CLUSTERING
- CLASSIFICATION vs CLUSTERING
Each pair returned:
- max_abs_diff = 0.0
- cosine similarity = 1.0
Question
Can you confirm whether task_type is currently expected to change the output for gemini-embedding-2-preview? If yes, this appears to be a bug. If no, the documentation should clarify that task_type is currently ignored for this model.