AI should be very good at pattern matching. If I have a question or statement and a set of possible answers or responses, I should be able to get a rating of how good the model thinks each answer is. For example, if I have “What color is the sun?” and the options are {1:"red",2:"green",3:"yellow"}, I would expect the model to return something like “50%, 35%, 80%”, and I would know that the best answer of those is “yellow” with an 80% match. Adding “white” might then register at 87%, or something like that. I don’t want to phrase this as a prompt, just get very basic direct analysis back. Is there an API that would do this?
You can control the output via Structured Output, but there’s no such thing as ‘structured input’. Models work with prompts and context, so you need to provide exactly that and set your guardrails accordingly (via system instructions or prompt pre-fill).
You could ask it to infer a rating from your list. You can provide examples of valid input and output as context. And you could ask it for structured output, which can then be validated and sent back to it should the validation fail.
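To make the validate-and-retry idea concrete, here is a minimal sketch of the validation half only. It assumes you asked the model to reply with a JSON object mapping each option to a 0-100 rating; that reply shape and the names validate_ratings and best_option are made up for illustration and are not part of any API. On a ValueError you would send the error text back to the model and ask again.

```python
import json


def validate_ratings(raw: str, options: list[str]) -> dict[str, float]:
    """Parse a model reply and check it rates every option from 0 to 100.

    Assumed reply shape (hypothetical): {"red": 50, "green": 35, "yellow": 80}
    Raises ValueError if the reply is malformed, so the caller can retry.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    missing = [o for o in options if o not in data]
    unexpected = [k for k in data if k not in options]
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")

    ratings: dict[str, float] = {}
    for option in options:
        value = data[option]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"rating for {option!r} is not a number")
        if not 0 <= value <= 100:
            raise ValueError(f"rating for {option!r} is outside 0-100")
        ratings[option] = float(value)
    return ratings


def best_option(ratings: dict[str, float]) -> str:
    """Return the option with the highest rating."""
    return max(ratings, key=ratings.get)
```

With the reply from the example in the question, validate_ratings returns the three ratings as floats and best_option picks "yellow".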
But at the end of the day, you have to understand that models are looking at your data as text, and they are also pretty terrible at counting, which isn’t a great combo for the requirement.
Pattern matching is a much more basic function of an LLM than prompts. When I run speech recognition, the API tells me how good it thinks each word was matched. That’s the part I’m looking for. No fancy prompt, just “which of X is the closest to Y”.
Good luck finding it in the API.
Is there a different type of LLM that would work better for this? I can get structured output to work OK, but it seems like absurd overkill just to say how well one string matches another with some LLM analysis.
You can enable logprobs in the response, which combines with Structured Output, so you could inspect the logprob values in the output. But Gemini only gives the confidence of the chosen answer, not all options. OpenAI has logprobs + classification tasks, which sounds exactly like what you need. Maybe coming to Gemini soon? Who knows tbh.
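If you do manage to get a logprob for each candidate answer, turning those into comparable scores is just a softmax. A minimal sketch, assuming you already have one log probability per option; the numbers in the usage note are invented, and normalize_logprobs is a hypothetical helper, not something either API provides.

```python
import math


def normalize_logprobs(option_logprobs: dict[str, float]) -> dict[str, float]:
    """Softmax over per-option log probabilities, so the scores sum to 1."""
    # Subtract the largest logprob before exponentiating for numerical stability.
    peak = max(option_logprobs.values())
    weights = {opt: math.exp(lp - peak) for opt, lp in option_logprobs.items()}
    total = sum(weights.values())
    return {opt: w / total for opt, w in weights.items()}


def top_option(scores: dict[str, float]) -> str:
    """Return the option with the highest normalized score."""
    return max(scores, key=scores.get)
```

For example, normalize_logprobs({"red": -2.3, "green": -4.6, "yellow": -0.2}) ranks "yellow" first. Note that these scores are relative to each other and sum to 1, which is not the same as the independent "80% match" ratings in the original question; math.exp(logprob) on its own gives the raw probability of a single option.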