Payload Size Limit Error with embed_content API

I am currently using the Google AI Python SDK to generate embeddings for some markdown with the following code:

import google.generativeai as genai

model = "models/embedding-001"
embedding = genai.embed_content(
    model=model,
    content=text,
    task_type="retrieval_document",
)

However, I am encountering the following error:

InvalidArgument: 400 Request payload size exceeds the limit: 10000 bytes.

I estimated the size of the text to be 4945 tokens using the following code:

model = genai.GenerativeModel("models/gemini-1.5-flash")
print(model.count_tokens(text))

Is that right? The token count doesn't seem large enough that I should need to chunk my markdown. How can I resolve this?

I would appreciate any insights or guidance on how to handle this issue or whether the limit can be adjusted in some way.

Thank you in advance for your support!

As the error message says, the limit is 10000 bytes, not tokens.
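The distinction matters because a markdown document's UTF-8 byte count is usually much larger than its token count. A quick sanity check might look like this (the sample text here is hypothetical, not the original document):

```python
# The 10000-byte limit applies to the UTF-8-encoded request payload,
# not to the token count. Measure the byte size of the text directly:
text = "# My Markdown Document\n" * 500  # hypothetical sample text
size_bytes = len(text.encode("utf-8"))
print(f"{size_bytes} bytes")
if size_bytes > 10_000:
    print("too large for a single embed_content request")
```

If the byte size is over the limit, the text has to be shortened or split before embedding, regardless of its token count.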

Where is this code from? I'm asking because I thought there was no such model as embedding-001, or if there is, it's very old.
See Get text embeddings  |  Generative AI on Vertex AI  |  Google Cloud

I see textembedding-gecko@001. I'd try a newer model, like the suggested 004 or multilingual-002. Newer models have lower dimensionality as well.

Conceptual thinking: for embeddings, you want each vector to capture one well-rounded concept rather than a jumble of concepts when indexing into that high-dimensional latent embedding space. This is why RAG frameworks do chunking and embed each chunk separately, hoping each chunk is a whole concept rather than a mix of several. 5k tokens seems way too big for an ideal chunk size; it's usually around 150 characters or 80-100 tokens, possibly even less. So consider that when architecting your generative AI pipeline.
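As a minimal sketch of that idea, here is a fixed-size chunker. It uses whitespace splitting as a rough stand-in for real tokenization; a production pipeline would use the model's tokenizer and typically overlap adjacent chunks:

```python
def chunk_text(text: str, max_tokens: int = 100) -> list[str]:
    """Split text into chunks of at most max_tokens whitespace-separated words.

    Whitespace words are only a rough proxy for model tokens; this is a
    sketch, not a production chunker.
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]

# Each chunk would then be embedded with a separate API call.
chunks = chunk_text("word " * 250)
print(len(chunks))  # 250 words -> chunks of 100, 100, 50
```

Smarter strategies split on paragraph or heading boundaries first, so chunks line up with the "whole concepts" mentioned above.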

Welcome to the forum @diego_mattozo

The embedding-001 model has an input token limit of 2048 tokens.
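Given that limit, a simple pre-flight guard can be sketched like this (the 2048-token limit is taken from this thread, and the token count is the value measured above; the constant name is illustrative):

```python
# Illustrative guard: compare the measured token count against the
# embedding model's input limit before calling the API.
EMBEDDING_INPUT_TOKEN_LIMIT = 2048  # limit for embedding-001, per this thread
token_count = 4945  # value reported by count_tokens above

if token_count > EMBEDDING_INPUT_TOKEN_LIMIT:
    print("text exceeds the embedding model's input limit; chunk or truncate first")
```

In this case the text is over both the 2048-token model limit and the 10000-byte request limit, so chunking (or truncation) is needed either way.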


Hi, thanks for your response. I was following this tutorial. I just thought the API limitation of 10k bytes (~1500 words) was a little odd. Shouldn't it just truncate my text, or at least give me the option to? For my PoC I think it's OK; I didn't want to deal with chunking yet. After I changed my code to use another SDK, it worked (with truncation):

from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel

texts = [...]
# task_type is assumed to be defined earlier, e.g. "RETRIEVAL_DOCUMENT"
model = TextEmbeddingModel.from_pretrained("textembedding-gecko@001")
inputs = [TextEmbeddingInput(text, task_type=task_type) for text in texts]
embeddings = model.get_embeddings(inputs)