Go SDK takes 10x as long to respond as the Python SDK for some reason

I'm not very good with Go yet (just started learning it). I have a bot in Python and one in Go; my Python bot responds to requests very quickly (in about 0.5s), while the Go one takes at least 4s, and up to 7s. I have no idea what causes it.

using

Client.aio.models.generate_content()  # with gemini 2.5 flash

to get a text response in Python takes less than 1s on average, while

 genai.Client.Models.GenerateContent() // gemini 2.5 flash as well

takes at least 4s (5s+ in most cases).

What could be causing this?

Also, I tried hitting the API with curl from the console, and that too takes around 4s to get a response.

To clarify: I ran both at the same time, with the same model, on the same machine, and with the same network settings (theoretically; all I do is run the scripts from the console, so they shouldn't have different network settings, right?). The only difference was the language.

Python code snippet:

# inside MyClass
# not-so-relevant part in the constructor (Content, Part come from google.genai.types)
            contents = [Content(role="user", parts=[Part.from_text(text=prompt)])]
            response = await self.getResponse(model=model_name, contents=contents, config=config)

# relevant part
    async def getResponse(self, model: str, contents: list, config: GenerateContentConfig):
        try:
            return await self.client.aio.models.generate_content(model=model, contents=contents, config=config)
        except Exception as e:
            await klog.err(f"Failed to get response from Gemini: {e}")
            return None

Go code snippet:

func (ms *MyStruct) GenerateContent(ctx context.Context, text string, model string) (string, error) {
	parts := []*genai.Part{genai.NewPartFromText(text)}
	contents := []*genai.Content{genai.NewContentFromParts(parts, genai.RoleUser)}
	config := genai.GenerateContentConfig{
		SafetySettings: safetySettingsBlockNone,
	}

	start := time.Now()

	response, err := ms.Cli.Models.GenerateContent(ctx, model, contents, &config)
	if err != nil {
		return "Error", err
	}

	// time.Since keeps sub-second precision; dividing elapsed milliseconds
	// by 1000 with integer division would round e.g. 4.9s down to 4s.
	logger.Debugf("Response time: %v", time.Since(start))
	return response.Text(), nil
}

Hi @dgl0 The execution time between the two languages is largely consistent, exhibiting minimal variation. Differences in response time are primarily attributable to factors such as the configuration of the Gemini models, the complexity of the prompts, and network-related latency.

As I said, I ran both at the same time from the same machine, so it (theoretically) shouldn't have been related to network latency. I didn't configure either of them beyond specifying the model name and the system prompt.

The prompts were also largely the same (the system prompt was actually quite a lot longer for the Python version, and it was still a ton faster).

Hi @dgl0, sorry for the late response -
If both runs were on the same machine and network latency wasn't a factor, the difference likely comes down to implementation details in the Go vs. Python client libraries. Even if Go is generally faster for raw computation, Python's AI tooling and SDKs are usually more mature and better optimized, especially for model streaming and token handling. That maturity can make a huge difference in real-world performance.


SOLVED: I'm dumb and was accidentally using gemini-2.5-flash instead of gemini-2.0-flash in Go. 2.0 responds just as quickly in Go as in Python.
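In hindsight the symptom makes sense: the model ID is just a string the SDK passes through, so a mix-up like 2.5 in the Go bot and 2.0 in the Python bot is easy to miss. A hypothetical guard (the constants and `pickModel` helper below are illustrative, not SDK API): log the exact ID before every request so the two bots can be compared at a glance.

```go
package main

import "fmt"

// Hypothetical convention: keep model IDs in named constants instead of
// scattering string literals, so both bots pull from the same place.
const (
	modelFlash20 = "gemini-2.0-flash"
	modelFlash25 = "gemini-2.5-flash"
)

// pickModel logs and returns the model ID that will be sent to the API,
// making a 2.5-vs-2.0 mix-up visible in the logs immediately.
func pickModel(id string) string {
	fmt.Printf("calling Gemini with model=%q\n", id)
	return id
}

func main() {
	fmt.Println(pickModel(modelFlash20))
}
```

A one-line log like this would have surfaced the mismatch on the first run.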