Issue with text-embedding-004 Returning Identical Vectors for Specific Languages

I have encountered an issue where the text-embedding-004 model returns identical vectors for different input texts in certain languages.

Please refer to the attached image for more details.

I am accessing text-embedding-004 with an API key issued through Google AI Studio, following the cookbook notebook:
google-gemini/cookbook/blob/main/quickstarts/Embeddings.ipynb

It appears that for text in certain languages (presumably those written without spaces between words, such as Chinese, Thai, and Japanese), the model now returns the same vector for every input.

I am certain that this issue did not exist as recently as two days ago.
Has anyone else experienced a similar problem, or does anyone have any insight into this behavior?

I would greatly appreciate any advice or information on this matter. Thank you in advance for your help.

Can you post text that illustrates this problem?
The images are good, but having the exact text so that we (and Google) can copy and paste it to test will go a long way toward figuring out what might be happening.

Thank you for your reply and advice.

I apologize for not including the code earlier.

I used the cookbook from the following URL, and my execution environment is Google Colab:

cookbook/quickstarts/Embeddings.ipynb at main · google-gemini/cookbook · GitHub

The only modification I made was to the list of input strings (content). Below is the code necessary to reproduce the issue:

!pip install -q -U "google-generativeai>=0.7.2"

import google.generativeai as genai

from google.colab import userdata
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)

# Different embeddings, English
result = genai.embed_content(
    model="models/text-embedding-004",
    content=[
        'Hello!',
        'Good evening!',
        'Good morning!'
    ]
)

for embedding in result['embedding']:
    print(str(embedding)[:50], '... TRIMMED]')

# Same embeddings, Japanese
result = genai.embed_content(
    model="models/text-embedding-004",
    content=[
        'こんにちは!',
        'こんばんは!',
        'おはようございます!'
    ]
)

for embedding in result['embedding']:
    print(str(embedding)[:50], '... TRIMMED]')

# Different embeddings, Spanish
result = genai.embed_content(
    model="models/text-embedding-004",
    content=[
        '¡Hola!',
        '¡Buenas noches!',
        '¡buen día!'
    ]
)

for embedding in result['embedding']:
    print(str(embedding)[:50], '... TRIMMED]')

# Same embeddings, Chinese
result = genai.embed_content(
    model="models/text-embedding-004",
    content=[
        '你好!',
        '晚安!',
        '早安!'
    ]
)

for embedding in result['embedding']:
    print(str(embedding)[:50], '... TRIMMED]')

# Same embeddings, Thai
result = genai.embed_content(
    model="models/text-embedding-004",
    content=[
        'สวัสดี!',
        'สวัสดีตอนเย็น!',
        'สวัสดีตอนเช้า!'
    ]
)

for embedding in result['embedding']:
    print(str(embedding)[:50], '... TRIMMED]')

# Different embeddings, Vietnamese
result = genai.embed_content(
    model="models/text-embedding-004",
    content=[
        'Xin chào!',
        'Buổi tối vui vẻ!',
        'Chào buổi sáng!'
    ]
)

for embedding in result['embedding']:
    print(str(embedding)[:50], '... TRIMMED]')
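
Because the loops above print only the first 50 characters of each vector, a pairwise comparison may make the problem easier to confirm. The following is a minimal sketch, not part of the cookbook: `cosine_similarity` and `report_pairs` are hypothetical helper names, and the synthetic vectors stand in for `result['embedding']` so that it runs without an API key.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def report_pairs(texts, embeddings, tol=1e-9):
    """Return (text_i, text_j, similarity, identical) for every pair of inputs.

    `identical` is True only when the two vectors have the same length and
    match element by element within `tol`.
    """
    rows = []
    for i, j in combinations(range(len(texts)), 2):
        a, b = embeddings[i], embeddings[j]
        identical = len(a) == len(b) and all(
            abs(x - y) <= tol for x, y in zip(a, b)
        )
        rows.append((texts[i], texts[j], cosine_similarity(a, b), identical))
    return rows


# Synthetic stand-in data (assumption): the first two vectors are deliberately equal.
texts = ['first', 'second', 'third']
fake_embeddings = [[0.1, 0.2, 0.3], [0.1, 0.2, 0.3], [0.3, -0.2, 0.1]]

for text_a, text_b, similarity, identical in report_pairs(texts, fake_embeddings):
    print(f'{text_a!r} vs {text_b!r}: cosine={similarity:.6f} identical={identical}')
```

In a real run, the same call would presumably take the `content` list and `result['embedding']` from each `genai.embed_content` call; any pair of different texts flagged as identical would reproduce the behavior described here.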

The embedding model I am using is text-embedding-004, as documented at the following URL:

Gemini models  |  Gemini API  |  Google AI for Developers

Please let me know if you need any clarification.
Thank you in advance for your time and assistance.
