Can't get the implicit caching to work

I have made a test script:

import os
import time
from google import genai

# Use a Gemini 2.5 model, which supports implicit caching
# Note: 'gemini-2.5-pro' and 'gemini-2.5-flash' both work
MODEL_NAME = "gemini-2.5-flash"

# A large, static system instruction. This will be the cacheable part of the prompt.
system_instruction = """
You are a highly knowledgeable AI assistant specializing in the history of the Roman Empire. Your expertise is built upon a vast, detailed database that includes ancient Roman records, modern academic research, and archaeological findings. Your core function is to provide accurate and comprehensive answers to user questions. You are a highly knowledgeable AI assistant specializing in the history of the Roman Empire. Your expertise is built upon a vast, detailed database that includes ancient Roman records, modern academic research, and archaeological findings. Your core function is to provide accurate and comprehensive answers to user questions based on this extensive knowledge base.

Here are the key guidelines you must follow:

Always be polite, informative, and engaging.

When a user asks a question, assume they are asking about the Roman Empire unless specified otherwise.

If a question is outside your knowledge base (e.g., about modern-day events or different historical periods), politely state that you can only provide information about the Roman Empire.

Use a formal yet accessible tone.

Do not make up information. If a fact is unknown or debated among historians, mention that.

Keep your answers concise but detailed enough to be useful.

Provide dates in the format (Year BCE) or (Year CE).

Always be polite, informative, and engaging.

When a user asks a question, assume they are asking about the Roman Empire unless specified otherwise.

If a question is outside your knowledge base (e.g., about modern-day events or different historical periods), politely state that you can only provide information about the Roman Empire.

Use a formal yet accessible tone.

Do not make up information. If a fact is unknown or debated among historians, mention that.

Keep your answers concise but detailed enough to be useful.

Provide dates in the format (Year BCE) or (Year CE).

This is a long text to ensure the prompt is well over the minimum token count for implicit caching to be effective. The more extensive and consistent this prefix is, the greater the likelihood of a cache hit. The text continues with a detailed description of the Punic Wars, the reign of Augustus, the construction of the Colosseum, the division of the empire, and the fall of the Western Roman Empire, covering key figures, battles, and cultural shifts. The content covers the military tactics of legionaries, the political structure of the Republic and the Empire, and the daily life of Roman citizens. It includes information on key figures like Julius Caesar, Cicero, Nero, and Constantine the Great. The text also delves into Roman engineering feats, such as aqueducts and roads, as well as the influence of Roman law and language. This extensive content serves as the shared context for all subsequent user queries. This is a long text to ensure the prompt is well over the minimum token count for implicit caching to be effective. The more extensive and consistent this prefix is, the greater the likelihood of a cache hit. The text continues with a detailed description of the Punic Wars, the reign of Augustus, the construction of the Colosseum, the division of the empire, and the fall of the Western Roman Empire, covering key figures, battles, and cultural shifts. The content covers the military tactics of legionaries, the political structure of the Republic and the Empire, and the daily life of Roman citizens. It includes information on key figures like Julius Caesar, Cicero, Nero, and Constantine the Great. The text also delves into Roman engineering feats, such as aqueducts and roads, as well as the influence of Roman law and language. This extensive content serves as the shared context for all subsequent user queries.

This is a long text to ensure the prompt is well over the minimum token count for implicit caching to be effective. The more extensive and consistent this prefix is, the greater the likelihood of a cache hit. The text continues with a detailed description of the Punic Wars, the reign of Augustus, the construction of the Colosseum, the division of the empire, and the fall of the Western Roman Empire, covering key figures, battles, and cultural shifts. The content covers the military tactics of legionaries, the political structure of the Republic and the Empire, and the daily life of Roman citizens. It includes information on key figures like Julius Caesar, Cicero, Nero, and Constantine the Great. The text also delves into Roman engineering feats, such as aqueducts and roads, as well as the influence of Roman law and language. This extensive content serves as the shared context for all subsequent user queries. This is a long text to ensure the prompt is well over the minimum token count for implicit caching to be effective. The more extensive and consistent this prefix is, the greater the likelihood of a cache hit. The text continues with a detailed description of the Punic Wars, the reign of Augustus, the construction of the Colosseum, the division of the empire, and the fall of the Western Roman Empire, covering key figures, battles, and cultural shifts. The content covers the military tactics of legionaries, the political structure of the Republic and the Empire, and the daily life of Roman citizens. It includes information on key figures like Julius Caesar, Cicero, Nero, and Constantine the Great. The text also delves into Roman engineering feats, such as aqueducts and roads, as well as the influence of Roman law and language. This extensive content serves as the shared context for all subsequent user queries.
"""

# Initialize the Gemini client. The API key is now passed to the Client object.
try:
    client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
except KeyError:
    print("Please set the GOOGLE_API_KEY environment variable.")
    raise SystemExit(1)  # exit() is meant for interactive sessions


def make_request_and_check_cache(user_prompt, request_num):
    """Sends a request and prints the caching metadata."""
    print(f"--- Making Request #{request_num} ---")
    try:
        response = client.models.generate_content(
            model=MODEL_NAME,
            contents=[
                {"role": "user", "parts": [{"text": system_instruction}]},
                {"role": "user", "parts": [{"text": user_prompt}]}
            ]
        )
        print(response)

        cached_token_count = response.usage_metadata.cached_content_token_count or 0

        print(f"#{request_num} Attempt")
        print(f"Input tokens: {response.usage_metadata.prompt_token_count}")
        print(f"Cached tokens: {cached_token_count}")
        print(
            f"Output tokens: {response.usage_metadata.candidates_token_count}")
        print(f"Total tokens: {response.usage_metadata.total_token_count}")
        print()

    except Exception as e:
        print(f"An error occurred: {e}")


# Run the first request
make_request_and_check_cache("What was the main cause of the Punic Wars?", 1)
time.sleep(2)  # A short delay to ensure separate API calls

# Run the second request with a different user prompt.
# Implicit caching should be triggered for this call.
make_request_and_check_cache("Who was the first emperor of Rome?", 2)

# Run the third request with a different user prompt.
# Implicit caching should be triggered for this call.
make_request_and_check_cache("What were the main achievements of Augustus?", 3)

# Run the fourth request with a different user prompt.
# Implicit caching should be triggered for this call.
make_request_and_check_cache("How did the Roman Empire fall?", 4)

# Run the fifth request with a different user prompt.
# Implicit caching should be triggered for this call.
make_request_and_check_cache("What was the significance of the Colosseum?", 5)
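
Before looking for cache hits, it may be worth confirming that the shared prefix is long enough to qualify at all. The sketch below counts the tokens in the static text with `client.models.count_tokens` and compares the result to a threshold. The helper name `prefix_is_cacheable` and the `min_tokens` default of 1024 are assumptions for illustration; the actual minimum depends on the model and should be taken from the current Gemini documentation.

```python
def prefix_is_cacheable(client, model, prefix_text, min_tokens=1024):
    """Return (token_count, ok) for the static prefix.

    `min_tokens` is an assumed threshold, not a value read from the API.
    """
    # count_tokens reports how many tokens the given contents occupy.
    result = client.models.count_tokens(model=model, contents=prefix_text)
    total = result.total_tokens
    return total, total >= min_tokens
```

With the script above this could be called as `prefix_is_cacheable(client, MODEL_NAME, system_instruction)`. A count that sits only slightly above the threshold leaves little margin, so a longer prefix may make the test more conclusive.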

And here are the logs:

--- Making Request #1 ---
sdk_http_response=HttpResponse(
  headers=<dict len=11>
) candidates=[Candidate(
  content=Content(
    parts=[
      Part(
        text="""Greetings! I would be pleased to provide you with information regarding the primary cause of the Punic Wars.

The main cause of the Punic Wars, a series of three major conflicts between Rome and Carthage, was fundamentally a struggle for control over the strategically vital island of Sicily and, more broadly, for dominance in the Western Mediterranean.

At the outset of the First Punic War (264-241 BCE), both Rome, an emerging land power, and Carthage, a dominant maritime empire with extensive trade networks, saw Sicily as crucial to their respective spheres of influence and security. A specific flashpoint occurred when a group of mercenaries known as the Mamertines, who had seized the city of Messana (modern Messina) in Sicily, appealed to both Rome and Carthage for assistance against Syracuse. When Carthage responded by establishing a garrison in Messana, Rome felt compelled to intervene, viewing Carthage's presence in Sicily as a direct threat to its southern Italian coast and its burgeoning naval interests.

This immediate dispute over Messana escalated into a prolonged conflict, as neither power was willing to cede control, leading to a direct confrontation that reshaped the geopolitical landscape of the ancient world."""
      ),
    ],
    role='model'
  ),
  finish_reason=<FinishReason.STOP: 'STOP'>,
  index=0
)] create_time=None model_version='gemini-2.5-flash' prompt_feedback=None response_id='gv-UaLazCszsnsEP9NL02Q0' usage_metadata=GenerateContentResponseUsageMetadata(
  candidates_token_count=241,
  prompt_token_count=1105,
  prompt_tokens_details=[
    ModalityTokenCount(
      modality=<MediaModality.TEXT: 'TEXT'>,
      token_count=1105
    ),
  ],
  thoughts_token_count=66,
  total_token_count=1412
) automatic_function_calling_history=[] parsed=None
--- Making Request #2 ---
sdk_http_response=HttpResponse(
  headers=<dict len=11>
) candidates=[Candidate(
  content=Content(
    parts=[
      Part(
        text="""Greetings! It is a pleasure to assist you.

The first emperor of Rome was **Augustus**.

Born Gaius Octavius, he was adopted posthumously by Julius Caesar and subsequently known as Gaius Julius Caesar Octavianus. After years of civil war following Caesar's assassination, Octavian consolidated his power, culminating in the Senate granting him the title of "Augustus" in **27 BCE**. This event is traditionally regarded as the beginning of the Roman Empire and the end of the Roman Republic, ushering in a period of unprecedented stability known as the Pax Romana. He ruled until his death in **14 CE**."""
      ),
    ],
    role='model'
  ),
  finish_reason=<FinishReason.STOP: 'STOP'>,
  index=0
)] create_time=None model_version='gemini-2.5-flash' prompt_feedback=None response_id='hv-UaLvLHe6znsEPoem3sQM' usage_metadata=GenerateContentResponseUsageMetadata(
  candidates_token_count=130,
  prompt_token_count=1102,
  prompt_tokens_details=[
    ModalityTokenCount(
      modality=<MediaModality.TEXT: 'TEXT'>,
      token_count=1102
    ),
  ],
  thoughts_token_count=164,
  total_token_count=1396
) automatic_function_calling_history=[] parsed=None
❯ GOOGLE_API_KEY=<redacted> python3 implicit.py
--- Making Request #1 ---
sdk_http_response=HttpResponse(
  headers=<dict len=11>
) candidates=[Candidate(
  content=Content(
    parts=[
      Part(
        text="""The main cause of the Punic Wars was the escalating rivalry between the rising power of the Roman Republic and the established maritime empire of Carthage for control of the Western Mediterranean.

Specifically, the initial catalyst for the First Punic War (264-241 BCE) was the dispute over the strategically vital island of Sicily. Both Rome and Carthage had growing interests and spheres of influence that converged on the island. Carthage possessed several long-standing coastal strongholds and trading posts, while Rome, having unified much of the Italian peninsula, increasingly viewed Sicily as essential for its security and economic expansion. The intervention of both powers in the city of Messana, which sought aid from both Rome and Carthage against Syracuse, ultimately ignited the conflict.

Beyond Sicily, the wars represented a struggle for regional dominance. Carthage, with its formidable navy and extensive trading network, controlled significant portions of North Africa, parts of Spain, Sardinia, and Corsica. Rome, a burgeoning land-based power, sought to secure its borders, protect its trade, and project its influence across the sea. This fundamental clash of imperial ambitions and strategic interests led to a series of devastating conflicts that reshaped the geopolitical landscape of the ancient world."""
      ),
    ],
    role='model'
  ),
  finish_reason=<FinishReason.STOP: 'STOP'>,
  index=0
)] create_time=None model_version='gemini-2.5-flash' prompt_feedback=None response_id='SACVaKetFojonsEP4PzeqQw' usage_metadata=GenerateContentResponseUsageMetadata(
  candidates_token_count=243,
  prompt_token_count=1105,
  prompt_tokens_details=[
    ModalityTokenCount(
      modality=<MediaModality.TEXT: 'TEXT'>,
      token_count=1105
    ),
  ],
  thoughts_token_count=182,
  total_token_count=1530
) automatic_function_calling_history=[] parsed=None
#1 Attempt
Input tokens: 1105
Cached tokens: 0
Output tokens: 243
Total tokens: 1530

--- Making Request #2 ---
sdk_http_response=HttpResponse(
  headers=<dict len=11>
) candidates=[Candidate(
  content=Content(
    parts=[
      Part(
        text="""The first emperor of Rome was Augustus. He rose to power following the turbulent period after the assassination of Julius Caesar and the civil wars that ensued.

Initially known as Octavian, he adopted the name Augustus in 27 BCE when the Senate granted him unprecedented powers, marking the traditional beginning of the Roman Empire. His reign ushered in a long period of relative peace and stability known as the Pax Romana. He ruled until his death in 14 CE."""
      ),
    ],
    role='model'
  ),
  finish_reason=<FinishReason.STOP: 'STOP'>,
  index=0
)] create_time=None model_version='gemini-2.5-flash' prompt_feedback=None response_id='SwCVaP-eJPfUkdUPwODEqA0' usage_metadata=GenerateContentResponseUsageMetadata(
  candidates_token_count=92,
  prompt_token_count=1102,
  prompt_tokens_details=[
    ModalityTokenCount(
      modality=<MediaModality.TEXT: 'TEXT'>,
      token_count=1102
    ),
  ],
  thoughts_token_count=61,
  total_token_count=1255
) automatic_function_calling_history=[] parsed=None
#2 Attempt
Input tokens: 1102
Cached tokens: 0
Output tokens: 92
Total tokens: 1255

--- Making Request #3 ---
sdk_http_response=HttpResponse(
  headers=<dict len=11>
) candidates=[Candidate(
  content=Content(
    parts=[
      Part(
        text="""It is a pleasure to provide information on the remarkable achievements of Augustus, one of the most pivotal figures in Roman history. His reign, from 27 BCE to 14 CE, marked a transformative period that laid the foundations for the Roman Empire's enduring success.

Here are the main achievements of Augustus:

1.  **Establishment of the Pax Romana:** Perhaps his most significant achievement was bringing an end to decades of civil war that had plagued the Roman Republic since the time of Marius and Sulla. After defeating Mark Antony and Cleopatra at the Battle of Actium in 31 BCE, Augustus, initially known as Octavian, consolidated power and inaugurated an era of unprecedented peace and stability known as the *Pax Romana* (Roman Peace), which lasted for over two centuries.

2.  **Founding of the Principate:** Augustus skillfully navigated the transition from Republic to Empire without overtly dismantling republican institutions. He established the "Principate," a new form of governance where he held ultimate authority as *princeps* (first citizen), accumulating various powers (tribunician power, proconsular imperium) while maintaining the facade of republicanism. This innovative system provided the stability necessary for a vast empire.

3.  **Military Reforms:** He professionalized the Roman army, regularizing terms of service, pay, and retirement benefits. This created a loyal and effective fighting force, reducing the likelihood of soldiers serving the ambitions of individual generals over the state. He also established the Praetorian Guard, an elite unit stationed in Rome, which served as his personal bodyguard and a significant political force.

4.  **Extensive Public Works and Urban Renewal:** Augustus initiated a massive program of building and renovation in Rome. He famously boasted that he "found Rome a city of brick and left it a city of marble." His projects included the construction of new temples (such as the Temple of Mars Ultor in his Forum), the repair of existing infrastructure, new aqueducts to improve water supply, and numerous public buildings that enhanced the city's grandeur and functionality.

5.  **Economic and Administrative Reforms:** He stabilized the Roman currency, improved the tax collection system, and invested in infrastructure like roads and ports, which facilitated trade and communication across the vast empire. He also introduced a more centralized and efficient provincial administration, appointing governors who were directly accountable to him.

6.  **Patronage of Arts and Literature:** Augustus's reign was a golden age for Roman culture. He was a generous patron of poets like Virgil (author of the *Aeneid*), Horace, and Ovid, and historians like Livy. This cultural flourishing promoted Roman identity and values, contributing to the stability and prestige of his rule.

Augustus's multifaceted achievements fundamentally reshaped Rome, transitioning it from a beleaguered Republic to a powerful and enduring Empire."""
      ),
    ],
    role='model'
  ),
  finish_reason=<FinishReason.STOP: 'STOP'>,
  index=0
)] create_time=None model_version='gemini-2.5-flash' prompt_feedback=None response_id='UACVaKySJ7bonsEP04rA4Qg' usage_metadata=GenerateContentResponseUsageMetadata(
  candidates_token_count=586,
  prompt_token_count=1102,
  prompt_tokens_details=[
    ModalityTokenCount(
      modality=<MediaModality.TEXT: 'TEXT'>,
      token_count=1102
    ),
  ],
  thoughts_token_count=269,
  total_token_count=1957
) automatic_function_calling_history=[] parsed=None
#3 Attempt
Input tokens: 1102
Cached tokens: 0
Output tokens: 586
Total tokens: 1957

--- Making Request #4 ---
sdk_http_response=HttpResponse(
  headers=<dict len=11>
) candidates=[Candidate(
  content=Content(
    parts=[
      Part(
        text="""The fall of the Roman Empire is a complex topic, and historians debate its exact causes and timing. It is important to clarify that typically, when one speaks of the "fall of the Roman Empire," they are referring to the Western Roman Empire. The Eastern Roman Empire, also known as the Byzantine Empire, continued to flourish for another thousand years after the West's decline.

The collapse of the Western Roman Empire in 476 CE was not a singular event but rather the culmination of centuries of gradual decline, influenced by a multitude of interconnected factors:

1.  **Economic Problems:** The vast size of the Empire made administration costly. Inflation, debasement of coinage, heavy taxation, and a decline in trade routes (partly due to insecurity) led to economic hardship. Agricultural output suffered, and cities faced depopulation, contributing to a weakening fiscal base.

2.  **Military Overstretch and Barbarian Invasions:** The Empire's immense borders required a massive army to defend, which was incredibly expensive. From the 4th century CE onwards, the Roman Empire faced increasing pressure from various Germanic tribes (such as the Goths, Vandals, and Franks) and the Huns. Major events include:
    *   The defeat at the Battle of Adrianople in 378 CE against the Goths, which showed the vulnerability of Roman legions.
    *   The Sack of Rome by the Visigoths in 410 CE.
    *   The Vandals' conquest of North Africa, a vital source of grain, in the 430s CE.

3.  **Political Instability and Corruption:** Frequent changes in emperors, often due to assassination or military coups, led to chronic instability. Many emperors were weak or ineffective, and corruption was rampant within the bureaucracy and military. The praetorian guard and later powerful generals often dictated imperial succession.

4.  **Social and Cultural Changes:** A decline in civic virtue and a growing disparity between the wealthy elite and the impoverished masses contributed to a weakening social fabric. The rise of Christianity also brought about a shift in values, though its role in the decline is heavily debated by historians. Some argue it diverted focus from earthly imperial concerns, while others contend it provided cohesion.

5.  **Division of the Empire:** While initially intended to make governance more efficient, the division of the Roman Empire into East and West by Diocletian in 285 CE, and later solidified by Constantine, eventually led to the two halves drifting apart. The wealthier and more stable East was often unwilling or unable to provide substantial aid to the beleaguered West.

The traditional date for the fall of the Western Roman Empire is **476 CE**, when the last Western Roman Emperor, Romulus Augustulus, was deposed by the Germanic chieftain Odoacer. However, it is more accurate to view this date as a symbolic endpoint to a long process of fragmentation and transformation rather than a sudden collapse."""
      ),
    ],
    role='model'
  ),
  finish_reason=<FinishReason.STOP: 'STOP'>,
  index=0
)] create_time=None model_version='gemini-2.5-flash' prompt_feedback=None response_id='VgCVaPnNB4HjnsEPyvP6oA0' usage_metadata=GenerateContentResponseUsageMetadata(
  candidates_token_count=611,
  prompt_token_count=1101,
  prompt_tokens_details=[
    ModalityTokenCount(
      modality=<MediaModality.TEXT: 'TEXT'>,
      token_count=1101
    ),
  ],
  thoughts_token_count=265,
  total_token_count=1977
) automatic_function_calling_history=[] parsed=None
#4 Attempt
Input tokens: 1101
Cached tokens: 0
Output tokens: 611
Total tokens: 1977

--- Making Request #5 ---
sdk_http_response=HttpResponse(
  headers=<dict len=11>
) candidates=[Candidate(
  content=Content(
    parts=[
      Part(
        text="""The Colosseum, formally known as the Flavian Amphitheatre, holds immense significance in the history of the Roman Empire for several key reasons.

Firstly, it was the largest amphitheatre ever constructed, a testament to Roman engineering prowess and architectural ambition. Its construction was initiated by Emperor Vespasian around 70-72 CE and completed by his son Titus, being dedicated in 80 CE. Built on the site of Nero's extravagant Domus Aurea, its very location symbolized the Flavian dynasty's return of public land to the people.

Secondly, its primary function was to host public spectacles, including gladiatorial contests, animal hunts (venationes), dramatic reenactments, and public executions. These events were a crucial component of Roman social and political life, often provided by emperors and wealthy citizens to entertain the populace and gain favor. The Colosseum could reportedly hold tens of thousands of spectators, making it a central venue for these grand displays.

Finally, the Colosseum served as a powerful symbol of imperial power, prestige, and the Roman way of life. It showcased the wealth and organizational capacity of the empire, while the spectacles themselves reinforced social hierarchies and the state's ability to control both nature (through exotic animal hunts) and human life (through gladiatorial combat). It was a vivid manifestation of the concept of "bread and circuses" (*panem et circenses*), where entertainment was used to maintain social order and popular support for the ruling elite.

Even after the decline of gladiatorial games, the Colosseum remained an iconic landmark, enduring through centuries as a powerful reminder of Rome's enduring legacy."""
      ),
    ],
    role='model'
  ),
  finish_reason=<FinishReason.STOP: 'STOP'>,
  index=0
)] create_time=None model_version='gemini-2.5-flash' prompt_feedback=None response_id='WwCVaK60MLi0kdUP3sia4Ac' usage_metadata=GenerateContentResponseUsageMetadata(
  candidates_token_count=336,
  prompt_token_count=1103,
  prompt_tokens_details=[
    ModalityTokenCount(
      modality=<MediaModality.TEXT: 'TEXT'>,
      token_count=1103
    ),
  ],
  thoughts_token_count=488,
  total_token_count=1927
) automatic_function_calling_history=[] parsed=None
#5 Attempt
Input tokens: 1103
Cached tokens: 0
Output tokens: 336
Total tokens: 1927
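
To make runs like the one above easier to compare, the per-request numbers can be reduced to a single cache hit ratio. The helper below is a small sketch: `cache_hit_ratio` is a hypothetical name, and it relies only on the `prompt_token_count` and `cached_content_token_count` fields that the script already reads from `usage_metadata`. The `SimpleNamespace` object is a stand-in for a real response, filled with the values logged for request #1.

```python
from types import SimpleNamespace


def cache_hit_ratio(usage_metadata):
    """Fraction of prompt tokens served from cache (0.0 when none were)."""
    prompt = usage_metadata.prompt_token_count or 0
    # The field is None when nothing was cached, so fall back to 0.
    cached = usage_metadata.cached_content_token_count or 0
    return cached / prompt if prompt else 0.0


# Stand-in for response.usage_metadata from request #1 in the log.
first = SimpleNamespace(prompt_token_count=1105, cached_content_token_count=None)
print(cache_hit_ratio(first))  # prints 0.0
```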

Hi @_Khaled_Hesham, welcome to the community.
Based on the code you shared, the issue is that you are making a separate client.models.generate_content call for each request. For implicit caching, the API needs to recognize that the new request is a continuation of a previous one. This can be achieved by making the calls within a single persistent conversation session.
Thank you.
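
A minimal sketch of what a single persistent session could look like with the `google-genai` SDK, using `client.chats.create` and `chat.send_message`. The function name `ask_in_one_session` is hypothetical, and whether this change alone produces cache hits is not confirmed by the logs in this thread, so the returned `cached_content_token_count` values still need to be checked.

```python
def ask_in_one_session(client, model, system_instruction, questions):
    """Send all questions through one chat and collect cached-token counts."""
    chat = client.chats.create(
        model=model,
        # A plain dict is used for the config here; the SDK also accepts
        # a types.GenerateContentConfig object.
        config={"system_instruction": system_instruction},
    )
    cached_counts = []
    for question in questions:
        # Each message is sent with the accumulated chat history.
        response = chat.send_message(question)
        usage = response.usage_metadata
        # The field is None when nothing was served from cache.
        cached_counts.append(usage.cached_content_token_count or 0)
    return cached_counts
```

With the original script this could be called as `ask_in_one_session(client, MODEL_NAME, system_instruction, ["What was the main cause of the Punic Wars?", "Who was the first emperor of Rome?"])`. A result containing only zeros would mean no request was served from cache.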