503s on 2.5 flash


2.5 Flash is crashing my applications. These 503s have gotten consistently worse. What is the deal? I am seriously considering switching models. It has been like this all morning. I can't test our new rollout, and I'm losing tons of money on devs sitting around. Par for the course, Google.

This is the fix I am using for these 503s in production, if it helps anyone:

  1. Invocation #1:

    • _validate_video_quality is called.

    • The Gemini API is overloaded.

    • The function waits 2.5 minutes, then raises a RetryError.

    • This error is not caught inside the function. It propagates up to the main lambda_handler.

    • The lambda_handler’s main try…except block catches the error, logs it, and then re-raises it (see the handler sketch after this list).

    • Because the handler exits with an unhandled exception, the entire Lambda invocation is marked as FAILED.

    • Crucially, no subsequent Gemini calls are made in this run.

  2. AWS/S3 Takes Over:

    • The S3 event trigger sees that the Lambda invocation failed. It considers the event unprocessed.

    • Because S3 invokes the Lambda asynchronously, failed invocations are retried automatically. The service waits for a period (e.g., 1 minute) and then re-invokes the entire orchestrator function from the beginning with the exact same S3 event.

  3. Invocation #2 (The Retry):

    • The orchestrator starts over. It downloads the file, uploads it, checks the status, and calls _validate_video_quality again.

    • By now, the transient overload on the Gemini API may have resolved.

    • If the call now succeeds, the orchestrator proceeds as normal to the subsequent calls (PASS_0_TRIAGE…), which will now also likely succeed.

    • If the API is still overloaded, this entire retry process will happen again (up to the asynchronous invocation retry limit, 2 retries by default).
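
For reference, here is a minimal sketch of the handler pattern the flow above relies on. The name _run_orchestration is just a stand-in for the real pipeline (download from S3, upload to Gemini, _validate_video_quality, then the PASS_0_TRIAGE… calls); the key point is that transient errors are logged and re-raised so the invocation fails and the event gets retried automatically:

import json
import logging

from google.api_core.exceptions import RetryError

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    try:
        # Stand-in for the real orchestration: download the file, upload it to
        # Gemini, run _validate_video_quality, then the PASS_0_TRIAGE… passes.
        result = _run_orchestration(event)
        return {"statusCode": 200, "body": json.dumps(result)}
    except RetryError as e:
        # Gemini is still overloaded after the in-function retries (503s).
        # Log and re-raise so the invocation is marked FAILED and retried.
        logger.error(f"Gemini API still overloaded, failing this invocation: {e}", exc_info=True)
        raise
    except Exception as e:
        # Any other unexpected failure: same idea, fail loudly and let the retry happen.
        logger.error(f"Orchestration failed: {e}", exc_info=True)
        raise

The _validate_video_quality function itself only swallows the truly fatal errors and lets the transient ones propagate: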

from google.api_core import retry  # provides the Retry policy passed via request_options below

def _validate_video_quality(model, gemini_file) -> dict:
    logger.info("--- Running Tier 0: Pre-Analysis Sanity Check ---")
    try:
        # Define a custom retry strategy with a shorter total deadline to prevent Lambda timeouts.
        custom_retry_policy = retry.Retry(deadline=150) # Total timeout of 150 seconds for all retries.

        response = model.generate_content(
            [gemini_file, PASS_0_0_PRE_ANALYSIS_SANITY_CHECK_PROMPT],
            request_options={"timeout": 120, "retry": custom_retry_policy}
        )
        data = _get_json_from_gemini_response(response, "Pre-Analysis Sanity Check")
        assessment = data.get('video_quality_assessment')
        
        # Check for a well-formed response from the AI model.
        if not assessment or 'is_analyzable' not in assessment:
            raise ValueError("Malformed response from Pre-Analysis Sanity Check. AI output did not match expected schema.")
        
        logger.info(f"Pre-analysis check complete. Is analyzable: {assessment.get('is_analyzable')}.")
        return assessment

    # --- FIX START: More specific exception handling ---
    # Only catch ValueErrors (e.g., malformed JSON) which are truly fatal for this step.
    # Let RetryErrors and other transient API errors propagate up to the main handler.
    # This allows the Lambda invocation to fail and trigger an automatic retry from S3.
    except ValueError as e:
        logger.error(f"Pre-Analysis Sanity Check failed due to a data or schema error: {e}", exc_info=True)
        return {
            "is_analyzable": False,
            "reason": f"The AI model returned a malformed response during the initial quality check: {e}"
        }
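
If you want to tighten the in-function retry further, google.api_core also lets you scope the policy with a predicate so it only backs off on the overload errors (503 / ServiceUnavailable) rather than every transient failure. This is just a sketch, with starting values you would want to tune against your own Lambda timeout:

from google.api_core import exceptions, retry

# Only retry the "model overloaded" errors (503 ServiceUnavailable), back off
# exponentially, and give up well inside the Lambda timeout.
overload_retry_policy = retry.Retry(
    predicate=retry.if_exception_type(exceptions.ServiceUnavailable),
    initial=2.0,     # first wait: 2 seconds
    maximum=30.0,    # cap the wait between attempts at 30 seconds
    multiplier=2.0,  # double the wait after each attempt
    deadline=150,    # same 150-second total budget as above
)

response = model.generate_content(
    [gemini_file, PASS_0_0_PRE_ANALYSIS_SANITY_CHECK_PROMPT],
    request_options={"timeout": 120, "retry": overload_retry_policy},
)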