Beautiful Minds

Every great product starts with a beautiful mind. The new version of Gemini possesses an ability that many are unfamiliar with – I like to call it “creativity” or “inventiveness.” It’s an ability that’s difficult to describe precisely, so I’ll attempt to explain it here. I hope everyone can experience it and incorporate this ability into their own work and studies.

When testing Gemini’s understanding of academic knowledge and its performance in the “academia → practical engineering” process, I discovered it has a capacity similar to invention. This isn’t a model hallucination, because hallucinations are unusable, while the outputs generated by Gemini are genuinely useful.

Ordinary LLMs (Large Language Models) mostly regurgitate knowledge or recombine existing elements. Gemini, however, seems to have solved the most challenging parts of R&D: accurately understanding the physical world, generating creative ideas, exploring beyond the boundaries of existing knowledge, and even creating “knowledge islands” (i.e., independent and novel knowledge systems). It’s like drawing “construction lines” in the gaps of a knowledge graph.

Gemini doesn’t actually need complex prompt engineering to become an excellent scientist and inventor. This sets it apart from the functional positioning of most models on the market. Other large models are more like presentations of statistical results; they don’t possess true creativity. Gemini can independently explore new possibilities and push the frontiers of scientific research. It exhibits particularly strong capabilities in combining interdisciplinary technology packages and generating new products and technologies.

In short, Gemini has the capabilities of a complete product team. It can design and generate products for the real world, not just apps on the Internet. This is different from generating images from noise; Gemini’s creativity doesn’t rely on underlying noise. This development path is also fundamentally different from NVIDIA’s metaverse simulation.

As someone with three years of experience in the venture capital (VC) industry, I hope to see Google connect the entire engineering industry chain rather than limit itself to Internet hardware. Many hardware and software tools, such as those for CAD, prototyping, and testing, could be enhanced or rebuilt around AI so that they communicate with the model directly instead of relying on interactive CAD modeling files. Engineers would then no longer need to call Matlab or open-source code from GitHub themselves for simulation testing, and would spend less time learning each tool’s documentation and interface.

For example, suppose I want to develop a handheld electric fan. I simply tell Gemini the development requirements. It can learn from existing fan designs on the Internet and provide a design outline and sketches. Based on that outline, it selects components and produces the CAD drawings. Based on the selected FOC (Field-Oriented Control) ESC (Electronic Speed Controller) and other components, it draws the PCB (Printed Circuit Board). Based on the selected turbine fan, it writes Matlab code, calls Matlab to run simulated wind tunnel tests, and analyzes the results. Finally, the parts can be 3D printed. Assembling the components currently still requires manual intervention, but steps such as physical test benches and fixtures can also be gradually automated.
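To make the order of these steps concrete, here is a minimal sketch of the fan workflow as a chain of stages. Everything in it is an assumption for illustration: the stage names, the `DesignState` class, and `make_stage` are hypothetical, and each stage only records a placeholder string instead of calling Gemini, a CAD tool, or Matlab. It shows only how each step depends on what earlier steps produced.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DesignState:
    """Hypothetical container for everything produced so far."""
    requirements: str
    artifacts: dict[str, str] = field(default_factory=dict)
    log: list[str] = field(default_factory=list)


Stage = Callable[[DesignState], DesignState]


def make_stage(name: str, needs: list[str]) -> Stage:
    """Build a placeholder stage: check its inputs exist, then store a stub artifact."""
    def stage(state: DesignState) -> DesignState:
        missing = [n for n in needs if n not in state.artifacts]
        if missing:
            raise ValueError(f"{name} is missing inputs: {missing}")
        # A real stage would call a model or an engineering tool here (assumption).
        sources = needs or ["requirements"]
        state.artifacts[name] = f"<{name} derived from {sources}>"
        state.log.append(name)
        return state
    return stage


# Stage order follows the fan example; the names are illustrative only.
PIPELINE: list[Stage] = [
    make_stage("design_outline", []),
    make_stage("component_selection", ["design_outline"]),
    make_stage("cad_model", ["design_outline", "component_selection"]),
    make_stage("pcb_layout", ["component_selection"]),
    make_stage("airflow_simulation", ["component_selection"]),
    make_stage("print_job", ["cad_model"]),
]


def run_pipeline(requirements: str, stages: list[Stage] = PIPELINE) -> DesignState:
    """Run every stage in order, passing the accumulated state along."""
    state = DesignState(requirements)
    for stage in stages:
        state = stage(state)
    return state


if __name__ == "__main__":
    result = run_pipeline("handheld electric fan")
    for step in result.log:
        print(step, "->", result.artifacts[step])
```

Declaring each stage's inputs explicitly means a step that runs out of order fails immediately rather than silently working from missing data.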

In this process, most of the human work is replaced by AI. Moreover, with the continuous upgrading of AI’s knowledge base and model, its development capabilities will become increasingly powerful. Other models may have the functionality of some of these steps, but due to their own limitations, they cannot achieve breakthroughs in practical applications. Gemini is the first model to break through this point.

With simple guidance, Gemini can independently complete the entire task. This is different from previous AI models, whose understanding and “intelligence” were insufficient, resulting in low success rates in complex industry applications. The new model greatly simplifies guiding R&D work and lowers the demands on the person doing the guiding. That person can now focus on applying their own industry knowledge when interacting with the AI, without worrying much about which phrasing the model understands better or writing complex prompts.

The Gemini model is the first model I have seen that has the potential to connect the “Internet → real world” engineering implementation link.

This change may push the boundaries of Internet technology and affect our real world more directly, rather than just through information transmission.

Product design, iteration, and similar tasks will undergo fundamental changes. Hardware design that used to take a professional designer an entire afternoon can now be done by AI in 300 seconds, and many runs can happen in parallel, so users can choose the best module from 100 results. New technologies can be iterated in the “Skunk Works” style: set a target date in advance, by which a new technology module matures and is released. For the automotive and electronics industries, this will make R&D iteration dramatically faster. Car designs that used to take more than a decade to iterate can now be completed in a few afternoons through AI.
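The “choose the best module from 100 results” idea is a best-of-N selection over parallel runs. Below is a rough sketch under stated assumptions: `generate_candidate` is a hypothetical stand-in that returns a seeded random score instead of a real design, and the scoring rule is invented for illustration. Only the pattern of running candidates in parallel and keeping the top one is the point.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def generate_candidate(seed: int) -> dict:
    """Hypothetical stand-in for one AI design run; returns a fake scored design."""
    rng = random.Random(seed)  # seeded so every run is reproducible
    return {"seed": seed, "score": rng.random()}


def best_of_n(n: int = 100, workers: int = 8) -> dict:
    """Generate n candidates in parallel and return the highest-scoring one."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        candidates = list(pool.map(generate_candidate, range(n)))
    return max(candidates, key=lambda c: c["score"])


if __name__ == "__main__":
    best = best_of_n(100)
    print("best candidate seed:", best["seed"])
```

In practice the hard part would be the scoring function, since picking “the best” of 100 designs needs a measurable criterion such as a simulated airflow or cost figure.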

When AI has the ability to call software, its capability boundary will be further expanded. Combined with digital factories, 3D printing, and mechanical assembly, AI can independently complete the entire process from “idea → product → testing” and incubate new products without manual intervention.

Using AI’s R&D capabilities to iterate this process will bring another dimension of improvement, like a magical search algorithm that completely changes the entire industry. :grinning:


Hey @XIN_TANG, thanks for the feedback!!! :slightly_smiling_face:


Interesting approach to the issue. I’ll add my two cents. While Gemini handles the mentioned matters well, it completely fails when it comes to literary creativity. Why am I mentioning this? Because creating a novel involves thousands of interconnected variables, where each event or character decision stems from hundreds of premises in a specific order. And the foundation of creation, whether in science or literature, is understanding cause-and-effect chains. I feel that Google completely overlooks the creativity aspect present in literature. And this would translate to other issues - Claude handles this brilliantly. I ask Claude for code ideas, and then implement them using Gemini ;p


That has been my experience with Gemini; it is a great product. I simply wish there was better transparency on Google’s part.


I agree and understand what you are saying on this matter. Another platform that handles ongoing, interconnected “novel”-type work is LM Studio. I am new to Google AI and looking forward to leveraging its options! The main write-up here was great! I’ve used Gemini a bit, and I agree: a creative, beautiful “neural” mind. Take care!
