In the Vertex Model Garden, Anthropic Claude Sonnet 3.5 is now available: Google Cloud console
What are the input/output token limits? Where do I even find that information?
Hi @Ron_Parker,
The official documentation for Claude Sonnet 3.5 is here: Models - Anthropic. It lists all the rate limits.
Google's documentation for using partner models is here: Partner-Models/Claude
Thanks
From Welcome to Claude - Anthropic:
Max output: 8192 tokens [1]
[1] 8192 output tokens is in beta and requires the header anthropic-beta: max-tokens-3-5-sonnet-2024-07-15. If the header is not specified, the limit is 4096 tokens.
I read somewhere there was a feature request to modify headers in the AnthropicVertex request.
Can I get some sort of follow up from Google?
Actually, I realized that I needed to post the issue in the Google Cloud Platform - Vertex AI GitHub.
But it doesn't look like anyone responds there.
After extensive online searches, I did find two ways advertised to modify the headers to increase the maximum output tokens:
from anthropic import AnthropicVertex

# LOCATION and project_id are assumed to be defined elsewhere
client = AnthropicVertex(
    region=LOCATION,
    project_id=project_id,
    # https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#default-headers
    default_headers={"anthropic-beta": "max-tokens-3-5-sonnet-2024-07-15"},  # Custom headers here
)
And also:
message = client.messages.create(
    max_tokens=max_tokens,
    messages=[
        {
            "role": "user",
            "content": content,
        }
    ],
    model="claude-3-5-sonnet@20240620",
    # https://x.com/alexalbert__/status/1812921642143900036
    extra_headers={"anthropic-beta": "max-tokens-3-5-sonnet-2024-07-15"},  # Custom headers here
)
But neither way appears to work in the AnthropicVertex SDK:
Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'max_tokens: 8192 > 4096, which is the maximum allowed number of output tokens for claude-3-5-sonnet-20240620'}}
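Until the header is honored, one possible stopgap is to read the cap out of the error text and retry with a smaller max_tokens. This is only a minimal sketch: it assumes the message keeps the "max_tokens: X > Y" wording shown in the error above, and the helper names (parse_output_token_limit, clamp_max_tokens) are hypothetical, not part of the AnthropicVertex SDK.

```python
import re
from typing import Optional

# Assumption: the server message looks like
# "max_tokens: 8192 > 4096, which is the maximum allowed number of output tokens ..."
_LIMIT_RE = re.compile(r"max_tokens:\s*(\d+)\s*>\s*(\d+)")


def parse_output_token_limit(error_message: str) -> Optional[int]:
    """Return the output token cap reported in the message, or None if absent."""
    match = _LIMIT_RE.search(error_message)
    return int(match.group(2)) if match else None


def clamp_max_tokens(requested: int, error_message: str) -> int:
    """Return a max_tokens value that fits the cap reported in error_message."""
    limit = parse_output_token_limit(error_message)
    # If no cap can be parsed, leave the requested value unchanged.
    return requested if limit is None else min(requested, limit)


# The message text copied from the 400 error above.
error_message = (
    "max_tokens: 8192 > 4096, which is the maximum allowed number of "
    "output tokens for claude-3-5-sonnet-20240620"
)
retry_max_tokens = clamp_max_tokens(8192, error_message)
print(retry_max_tokens)
```

In practice the message would come from the 400 error raised by client.messages.create, and the request could then be retried with retry_max_tokens. This does not raise the limit; it only avoids the failed call.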