Hey, I checked both the Vertex AI and Gemini AI pricing pages.
They both mention that 1 second of audio is 32 tokens.
In my app I stream audio in and output text.
The Live API price per million tokens is $2.10. At the rate described above, 1 hour of audio should be 115,200 tokens (3,600 s × 32 tokens/s), which is about $0.24. At least that's how I calculated it. My output is really short, around 120-150 tokens, so it's not even worth including.
Basically, I had about 1 h 30 min to 2 h of Live API consumption, and it came out as $5.
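To make my arithmetic explicit, here is a minimal Python sketch of the estimate. It only uses the two figures quoted above (32 tokens per second of audio, $2.10 per million tokens) and assumes that audio input is the only thing being billed, which may well be the wrong assumption. The function name is just something I made up for illustration.

```python
# Rough estimate of audio input cost, using only the numbers from the pricing pages.
AUDIO_TOKENS_PER_SECOND = 32      # "1 second of audio is 32 tokens"
PRICE_PER_MILLION_USD = 2.10      # Live API price per million tokens

def estimate_audio_cost(seconds):
    """Return (tokens, cost in USD) for a given duration of streamed audio."""
    tokens = seconds * AUDIO_TOKENS_PER_SECOND
    cost = tokens / 1_000_000 * PRICE_PER_MILLION_USD
    return tokens, cost

# My usage was somewhere between 1.5 and 2 hours.
for hours in (1, 1.5, 2):
    tokens, cost = estimate_audio_cost(hours * 3600)
    print(f"{hours} h: {tokens:,.0f} tokens, about ${cost:.2f}")
```

Even at the upper bound of 2 hours this gives roughly $0.48, so the $5 bill is about ten times what this simple model predicts.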
One thing I'd love to understand: what is the TEXT modality in the prompt token details?
How does one calculate the actual cost of the Live API?
{
  "serverContent": {
    "turnComplete": true
  },
  "usageMetadata": {
    "promptTokenCount": 541,
    "responseTokenCount": 156,
    "totalTokenCount": 697,
    "promptTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 410
      },
      {
        "modality": "AUDIO",
        "tokenCount": 131
      }
    ],
    "responseTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 156
      }
    ]
  }
}
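For anyone who wants to check the breakdown, here is a small Python sketch that sums the token counts per modality from the message above. It only reads the fields that appear in that payload; the helper name is hypothetical, and I am not sure whether these counts are per turn or cumulative over the session, so the sketch handles a single message only.

```python
import json
from collections import Counter

# The usageMetadata part of the message above, copied verbatim.
raw_message = """
{
  "usageMetadata": {
    "promptTokenCount": 541,
    "responseTokenCount": 156,
    "totalTokenCount": 697,
    "promptTokensDetails": [
      {"modality": "TEXT", "tokenCount": 410},
      {"modality": "AUDIO", "tokenCount": 131}
    ],
    "responseTokensDetails": [
      {"modality": "TEXT", "tokenCount": 156}
    ]
  }
}
"""

def tokens_by_modality(usage):
    """Sum tokenCount per (direction, modality) pair in one usageMetadata object."""
    totals = Counter()
    for direction, key in (("prompt", "promptTokensDetails"),
                           ("response", "responseTokensDetails")):
        for entry in usage.get(key, []):
            totals[(direction, entry["modality"])] += entry["tokenCount"]
    return totals

usage = json.loads(raw_message)["usageMetadata"]
totals = tokens_by_modality(usage)
for (direction, modality), count in sorted(totals.items()):
    print(direction, modality, count)

# At 32 tokens per second, the AUDIO prompt tokens correspond to this much audio.
audio_seconds = totals[("prompt", "AUDIO")] / 32
print(f"audio seconds in this message: {audio_seconds:.1f}")
```

In this message the per-modality counts add up to the reported totals (410 + 131 = 541 prompt tokens, 156 response tokens), and the 131 audio tokens correspond to only about 4 seconds of audio. So most of the prompt tokens here are TEXT, which is exactly the part I don't understand.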