503 "The service is currently unavailable" when using the context caching feature

I’m trying to create a cache from the contents of multiple PDF files, but when the files together exceed approximately 500,000 tokens, I receive a 503 error (Service Unavailable) from Google API Core.

The error is not returned immediately; it arrives after about 40 to 50 seconds. This might indicate that a timeout is occurring in Google API Core.
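To check whether the delay is consistent (which would point toward a timeout rather than an immediate rejection), the call can be wrapped in a small timing helper. This is only a sketch: `timed_call` is a hypothetical helper, and the function you pass to it stands in for whatever code creates the cache in your setup.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, error, elapsed_seconds).

    Hypothetical helper: fn stands in for the cache-creation call.
    Any exception is captured rather than raised so the elapsed time
    can be inspected alongside it.
    """
    start = time.monotonic()
    try:
        result = fn(*args, **kwargs)
        return result, None, time.monotonic() - start
    except Exception as exc:  # in practice, narrow this to the 503 error type
        return None, exc, time.monotonic() - start
```

If the elapsed time sits in the same 40 to 50 second range on every failing attempt, that is consistent with a timeout, although it does not prove where the timeout occurs.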

For more details, please refer to the following GitHub issue.

Yes, I ran into this issue as well. For me personally, Context Caching currently only works reliably with fewer than 500k tokens. I’m using 1.5 Flash.
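Until this is resolved, a pre-flight check on the total token count can at least avoid waiting on a request that is likely to fail. A minimal sketch, assuming you already have a token count per file (for example from the SDK's token-counting call); `check_cache_budget` and `SOFT_LIMIT` are hypothetical names, and 500,000 is only the threshold observed in this thread, not a documented limit.

```python
# Threshold reported in this thread; an assumption, not an official limit.
SOFT_LIMIT = 500_000


def check_cache_budget(token_counts, soft_limit=SOFT_LIMIT):
    """Return (total_tokens, within_budget) for a {filename: tokens} mapping."""
    total = sum(token_counts.values())
    return total, total < soft_limit
```

For example, `check_cache_budget({"a.pdf": 300_000, "b.pdf": 250_000})` reports a total of 550,000 tokens and flags it as over the observed threshold.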