Why can't Gemini draw graphs?

I was wondering why Gemini can’t graph things visually. Like, can’t we make it do things visually?

Welcome to the forum.

Large language models in general aren’t that good at handling non-linear information. Put another way: if you feed a model the actual definition of a concrete graph (a vertex set and an edge set) and then ask it a simple question like “is this graph connected?”, you get random guesses in response. Ask something more complicated, like “is this graph planar?”, and you get a different set of random guesses; the model will claim to have found a K5 or a K3,3 where there isn’t one, for example.

Models are very good at translating from one language to another, however. So, if you have data that can be translated into a graph-drawing language, Gemini does that well enough. I have tested the GraphViz language (DOT), and it works well: Gemini might lose focus occasionally, so that a vertex that was supposed to be a box gets turned into a box with rounded corners, or a color might be wrong here and there, but it’s mostly good enough.
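As a minimal sketch of that workflow, assuming the graphviz Python package and the GraphViz binaries are installed (the DOT text here stands in for whatever Gemini generates):

```python
# Render DOT text (e.g. returned by Gemini) to a PNG file.
import graphviz

dot_text = """
digraph G {
    node [shape=box];
    A -> B;
    B -> C;
    A -> C [color=red];
}
"""

# Writes my_graph.png to the current directory; cleanup=True removes
# the intermediate DOT source file after rendering.
graphviz.Source(dot_text).render("my_graph", format="png", cleanup=True)
```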

The other popular graph-drawing language, Mermaid, I haven’t tested with Gemini yet. Claude 3.5 Sonnet models use it quite effectively for visualization artifacts, so I expect Gemini will handle it too.

You get the effect of “Gemini drew a graph for me” by giving the model a tool function to call once it has generated the text of the graph in a graph-drawing language. The model then calls the tool function, and your client can render the drawing that was passed as a string argument to the tool call. I have tested that this works as well.
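Here is a rough sketch of that tool-function setup, using the google-generativeai Python SDK’s automatic function calling; the function name, model name, and prompt are my own choices, not anything prescribed:

```python
import graphviz
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

def render_graph(dot_source: str) -> str:
    """Tool function: render GraphViz DOT source to graph.png."""
    graphviz.Source(dot_source).render("graph", format="png", cleanup=True)
    return "rendered graph.png"  # the result string goes back to the model

# Passing the Python function as a tool lets the SDK build the schema
# and invoke the function automatically when the model requests it.
model = genai.GenerativeModel("gemini-1.5-pro", tools=[render_graph])
chat = model.start_chat(enable_automatic_function_calling=True)
chat.send_message(
    "Draw a graph of modules A, B, C where A depends on B and C, and "
    "B depends on C. Generate DOT and pass it to the render_graph tool."
)
```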

Hope this helps.

Welcome to the forums!

I want to echo what @OrangiaNebula said - remember, Large Language Models like Gemini are good at language patterns. Not visual patterns. And right now, almost every LLM is text-only out.

And, as noted, they are good at transforming one thing into another, such as transforming “fuzzy human” thinking into more “discrete digital” representations. So it might be good for things like:

  • Turn human-readable coordinates into a JSON data structure of those coordinates
  • Turn human graph descriptions into Google Sheets graphing parameters
  • Turn human instructions into a Python function that draws a graph

But each of these still requires some prompt work to get set up correctly. As a sketch, the first item might look something like this:
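(This is just a rough sketch with the google-generativeai Python SDK; the model name and prompt are my own, and response_mime_type asks the API for raw JSON so the reply can be parsed directly.)

```python
import json
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel(
    "gemini-1.5-flash",
    # Ask the API for raw JSON instead of prose with markdown fences.
    generation_config={"response_mime_type": "application/json"},
)

response = model.generate_content(
    'Convert to a JSON array of {"x": number, "y": number} objects: '
    '"the first point is at two comma three, the next one at five comma one"'
)
points = json.loads(response.text)
print(points)  # e.g. [{"x": 2, "y": 3}, {"x": 5, "y": 1}]
```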

You may be asking, “If LLMs are mostly text out, then how do ChatGPT or Gemini Chat show images and graphs?” Both of these systems actually use multiple models, API calls, and tools in the background to do their tasks. So if you ask for a chart, for example, what actually happens behind the scenes is that the system asks the LLM to write a program to generate the chart, then runs that program and returns the output.

You can employ a similar scheme yourself.
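A bare-bones sketch of that scheme follows; the prompt and model name are assumptions, and exec() on model output is shown only for illustration (use a proper sandbox in anything real):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

response = model.generate_content(
    "Write Python code using matplotlib that plots y = x**2 for x from 0 "
    "to 10 and saves the figure as chart.png. Output only the code, with "
    "no markdown fences."
)

# DANGER: running model-generated code directly; sandbox it in practice.
exec(response.text)
```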