Hi @GUNAND_MAYANGLAMBAM and @Vishal
Found another curiosity in the OpenAI compatibility layer. This time, it’s related to the image generations endpoint. According to the OpenAI API reference, several parameters are optional and have default values.
Request
Default values
Here’s the basic REST call with required values only.
POST https://generativelanguage.googleapis.com/v1beta/openai/images/generations
Authorization: Bearer AIza...
Content-Type: application/json; charset=utf-8
{
"prompt": "Photorealistic shot in the style of DSLR camera of the northern lights dancing across the Arctic sky, stars twinkling, snow-covered landscape",
"model": "imagen-3.0-generate-002"
}
Which gives me an HTTP 400 Bad Request with Unsupported response format: \u003cempty\u003e. Supported formats are: b64_json.
Response Format - missing and case sensitive
However, OpenAI says: "Must be one of url or b64_json. Defaults to url." That is an actual problem, as the Gemini API does not even provide url as a response option.
Setting ResponseFormat in upper case again gives HTTP 400, with Unsupported response format: B64_JSON. Supported formats are: b64_json.
OK, perhaps I’m asking for too much, although the Gemini API is usually quite tolerant, e.g. camelCase vs snake_case. Anyway, that was easy to fix by adding a lower-case conversion to the code.
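For anyone hitting the same 400, here is a minimal Python sketch of that lower-case conversion. The helper name build_image_request is made up for illustration and is not part of any SDK; it only builds the JSON body and assumes b64_json is the sole supported format, as the error message states.

```python
def build_image_request(prompt, model, response_format=None, **extra):
    """Build the JSON body for POST .../openai/images/generations.

    Hypothetical helper: the endpoint rejected both a missing value and
    "B64_JSON", so always send an explicit lower-case value.
    """
    # Fall back to b64_json, the only format the error message lists.
    fmt = (response_format or "b64_json").lower()
    body = {"prompt": prompt, "model": model, "response_format": fmt}
    # Pass through any other optional parameters (n, size, quality, ...).
    body.update(extra)
    return body


if __name__ == "__main__":
    print(build_image_request("northern lights", "imagen-3.0-generate-002",
                              response_format="B64_JSON", n=1))
```

With this in place the caller can keep passing whatever casing it has, and the required-values-only call no longer fails on the missing default.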
Quality
Interestingly, this parameter is not case-sensitive.
Given the high similarity of the generated pairs of images I suspect that this property is not applied in the Gemini API (yet).
Size
Also, this parameter is not case-sensitive.
But the discrete values are not applied. The returned size seems to be fixed at 1024x1024 and does not offer the options of OpenAI: "Must be one of 256x256, 512x512, or 1024x1024 for dall-e-2. Must be one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3 models."
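One way to check the returned size without an imaging library is to read it from the image header. The sketch below assumes the b64_json payload is a PNG (I have not seen that documented for this endpoint) and returns None for anything else; png_size_from_b64 is a name made up for this example.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size_from_b64(b64_json):
    """Return (width, height) of a base64-encoded PNG, or None if not a PNG."""
    raw = base64.b64decode(b64_json)
    # A PNG starts with an 8-byte signature followed by the IHDR chunk:
    # 4-byte length, 4-byte type "IHDR", then big-endian width and height.
    if len(raw) < 24 or raw[:8] != PNG_SIGNATURE or raw[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", raw[16:24])
    return width, height
```

Comparing the result against the requested size value for each call makes it easy to confirm whether the parameter is honoured.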
Style
And this one… is not case-sensitive.
Given the high similarity of the generated pairs of images I suspect that this property is not applied in the Gemini API (yet).
User
Accepted nicely.
Response
Created
It seems that the current response structure does not provide the created property. See the OpenAI API Reference, which returns a data resource. That seems to be an Image resource, see next.
RevisedPrompt property?
Is the Gemini API going to provide all properties of the Image resource? https://platform.openai.com/docs/api-reference/images/object
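To make the gap concrete, here is a small sketch that lists which fields of the OpenAI response shape (created on the response; b64_json or url, and revised_prompt on each Image) are absent from a parsed response. The function name is hypothetical, and the check treats every field as expected even though OpenAI itself only returns some of them conditionally.

```python
def missing_openai_fields(response):
    """List OpenAI images-response fields that are absent from `response`.

    `response` is the parsed JSON body (a dict). This is a rough
    conformance check, not an official validator.
    """
    missing = []
    if "created" not in response:
        missing.append("created")
    for index, image in enumerate(response.get("data", [])):
        # An Image carries its payload as either b64_json or url.
        if "b64_json" not in image and "url" not in image:
            missing.append(f"data[{index}].b64_json|url")
        if "revised_prompt" not in image:
            missing.append(f"data[{index}].revised_prompt")
    return missing
```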
What’s expected?
Well, for a start it would be nice if ResponseFormat were case-insensitive and the missing url option were provided. Not sure, though, how those generated images should be made available. Maybe via the File API, or similar to GeneratedFiles in Vertex AI?
Also, supporting the Size values of OpenAI, or at least those provided by Imagen 3.
Complete REST call
For testing and debugging purposes, here’s the complete call.
POST https://generativelanguage.googleapis.com/v1beta/openai/images/generations
Authorization: Bearer AIza...
Content-Type: application/json; charset=utf-8
{
"prompt": "Photorealistic shot in the style of DSLR camera of the northern lights dancing across the Arctic sky, stars twinkling, snow-covered landscape",
"model": "imagen-3.0-generate-002",
"n": 1,
"quality": "STANDARD",
"responseFormat": "b64_json",
"size": "1024x1024",
"style": "NATURAL",
"user": "Mscc.GenerativeAI"
}
I created a test suite with all 20 permutations (2x2x5), and they are all successful, returning HTTP 200 OK.
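A matrix like that can be generated rather than written by hand. The sketch below is in Python and only builds the request bodies; the three axes are my assumption of what a 2x2x5 grid might look like (two quality values, two style values, and the five size values quoted from the OpenAI reference), not necessarily the exact values used in the suite.

```python
from itertools import product

# Assumed axes for illustration: 2 qualities x 2 styles x 5 sizes = 20.
QUALITIES = ["standard", "hd"]
STYLES = ["vivid", "natural"]
SIZES = ["256x256", "512x512", "1024x1024", "1792x1024", "1024x1792"]


def request_matrix(prompt, model):
    """Return one request body per (quality, style, size) combination."""
    return [
        {
            "prompt": prompt,
            "model": model,
            "response_format": "b64_json",
            "quality": quality,
            "style": style,
            "size": size,
        }
        for quality, style, size in product(QUALITIES, STYLES, SIZES)
    ]


if __name__ == "__main__":
    bodies = request_matrix("northern lights", "imagen-3.0-generate-002")
    print(len(bodies))
```

Each body can then be posted to the images/generations endpoint and the status code and returned image size recorded per combination.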
Cheers