Number of parallel or concurrent requests for Gemini 1.5 Pro

How many parallel requests can I safely make against this API with the code below?

from tqdm import tqdm
from concurrent.futures import ThreadPoolExecutor

# Process one record: fill the prompt template and call the model
def process_item(i):
    try:
        tagged_sentence = gemini_model_generative(prompt.format(i.tagged_source, i.temp_target))
        return tagged_sentence
    except Exception as e:
        return f"Failed ID {i.id}: {e}"

# Main execution
if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=5) as executor:  # Adjust max_workers based on your quota
        # executor.map returns results (not futures) in input order; tqdm shows progress
        results = list(tqdm(executor.map(process_item, with_tagged_seg), total=len(with_tagged_seg)))

This depends on the Google Generative AI API's usage quotas, which vary by platform and billing tier:

- Vertex AI + Gemini models: the per-region concurrency limit is typically in the range of 100 to 300 requests/sec, and short bursts of 50 to 100 concurrent requests are generally safe.
- Gemini Pro via AI Studio / the REST API: roughly 50 to 60 requests/min for text-bison or gemini-pro on the free tier.

So as a rule of thumb: 2 to 5 workers on the free tier, 10 to 20 on a paid plan, and 20 to 80 for high-throughput Vertex AI. Quotas change over time, so check the current limits in your console before scaling up.
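Whatever worker count you pick, it is worth making the worker resilient to quota errors so an occasional 429 slows the batch down instead of producing failed rows. Below is a minimal sketch with exponential backoff; gemini_model_generative, prompt, and with_tagged_seg are your own objects from the question, and matching "429" in the exception text is an assumption, so swap in the concrete exception class your client library actually raises.

import random
import time
from concurrent.futures import ThreadPoolExecutor

from tqdm import tqdm

MAX_RETRIES = 4  # retry attempts when a rate-limit error is hit

def process_item_with_backoff(i):
    for attempt in range(MAX_RETRIES + 1):
        try:
            return gemini_model_generative(prompt.format(i.tagged_source, i.temp_target))
        except Exception as e:
            # Assumption: quota errors mention "429" in their message; replace
            # this check with the specific exception your client raises.
            if "429" in str(e) and attempt < MAX_RETRIES:
                # Exponential backoff with jitter: ~1s, 2s, 4s, 8s
                time.sleep(2 ** attempt + random.random())
            else:
                return f"Failed ID {i.id}: {e}"

if __name__ == "__main__":
    # Pick max_workers to match your tier: 2-5 free, 10-20 paid, more on Vertex AI
    with ThreadPoolExecutor(max_workers=5) as executor:
        results = list(tqdm(executor.map(process_item_with_backoff, with_tagged_seg),
                            total=len(with_tagged_seg)))

With backoff in place you can probe higher worker counts safely: if you overshoot the quota, throughput degrades gracefully instead of the run ending with a pile of failure strings.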