What are the input/output token limits for Claude Sonnet via the Vertex Model Garden?

In the Vertex Model Garden, Anthropic Claude Sonnet 3.5 is now available: Google Cloud console

What are the input/output token limits? Where do I even find that information?

Hi @Ron_Parker,

The official documentation for Claude Sonnet 3.5 is here: Models - Anthropic, and it lists all the rate limits.

Also, see the Google documentation for using partner models: Partner-Models/Claude

Thanks

From Welcome to Claude - Anthropic:

Max output: 8192 tokens (see note 1)

  1. 8192 output tokens is in beta and requires the header anthropic-beta: max-tokens-3-5-sonnet-2024-07-15. If the header is not specified, the limit is 4096 tokens.
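
To make the quoted note concrete, here is a minimal sketch of the rule it describes. The constants come straight from the note; the function name `documented_output_limit` is a hypothetical helper of mine, not part of any SDK, and it only models what the Anthropic documentation says, not what Vertex actually enforces.

```python
# Values copied from the quoted documentation note.
BETA_HEADER_NAME = "anthropic-beta"
BETA_HEADER_VALUE = "max-tokens-3-5-sonnet-2024-07-15"
DEFAULT_MAX_OUTPUT = 4096
BETA_MAX_OUTPUT = 8192


def documented_output_limit(headers):
    """Return the output-token cap the quoted note describes for these headers.

    Hypothetical helper: it restates the documentation, it does not query any API.
    """
    # HTTP header names are case-insensitive, so normalise them before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    if normalised.get(BETA_HEADER_NAME, "").strip() == BETA_HEADER_VALUE:
        return BETA_MAX_OUTPUT
    return DEFAULT_MAX_OUTPUT


if __name__ == "__main__":
    print(documented_output_limit({}))
    print(documented_output_limit({"anthropic-beta": BETA_HEADER_VALUE}))
```

So, going by the documentation alone, a request without the header should be capped at 4096 output tokens and a request with it at 8192.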

I read somewhere there was a feature request to modify headers in the AnthropicVertex request.

Can I get some sort of follow-up from Google?

Actually, I realized that I needed to post the issue in the Google Cloud Platform - Vertex AI GitHub.

But it doesn't look like anyone responds there.

After extensive online searches, I did find two ways advertised to modify the headers to increase the maximum output tokens:

    from anthropic import AnthropicVertex

    # LOCATION and project_id are defined elsewhere.
    client = AnthropicVertex(
        region=LOCATION,
        project_id=project_id,
        # Default headers are sent with every request:
        # https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#default-headers
        default_headers={"anthropic-beta": "max-tokens-3-5-sonnet-2024-07-15"},
    )

And also:

    message = client.messages.create(
        max_tokens=max_tokens,
        messages=[
            {
                "role": "user",
                "content": content,
            }
        ],
        model="claude-3-5-sonnet@20240620",
        # Per-request headers: https://x.com/alexalbert__/status/1812921642143900036
        extra_headers={"anthropic-beta": "max-tokens-3-5-sonnet-2024-07-15"},
    )

But neither way appears to work in the AnthropicVertex SDK:

    Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'max_tokens: 8192 > 4096, which is the maximum allowed number of output tokens for claude-3-5-sonnet-20240620'}}
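
Until the header is honoured, the only fallback I can think of is to read the allowed limit out of the error message and retry with a smaller max_tokens. Below is a rough sketch of that parsing step. The helper names `allowed_max_tokens` and `clamp_max_tokens` are hypothetical, and the regex assumes the message keeps the "max_tokens: X > Y" wording shown in the error above, which is not a documented contract. Note that this does not unlock 8192 tokens; it only avoids the 400.

```python
import re

# Assumption: the error text keeps the "max_tokens: <requested> > <allowed>" shape.
_LIMIT_PATTERN = re.compile(r"max_tokens: (\d+) > (\d+)")


def allowed_max_tokens(error_message):
    """Extract the allowed output-token limit from the error text, or None."""
    match = _LIMIT_PATTERN.search(error_message)
    if match is None:
        return None
    return int(match.group(2))


def clamp_max_tokens(requested, error_message):
    """Return a max_tokens value that respects the limit named in the error."""
    limit = allowed_max_tokens(error_message)
    if limit is None:
        # Unknown message format: leave the requested value untouched.
        return requested
    return min(requested, limit)


if __name__ == "__main__":
    sample = (
        "max_tokens: 8192 > 4096, which is the maximum allowed number of "
        "output tokens for claude-3-5-sonnet-20240620"
    )
    print(clamp_max_tokens(8192, sample))
```

In practice you would catch the 400 from client.messages.create, pass the message text to clamp_max_tokens, and retry once with the returned value.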