Deep insights and shallow LLMs
In his book The Beginning of Infinity, David Deutsch explains the unique qualities of knowledge and the power it has to shape the universe. In particular, knowledge is durable, tends to replicate, and can be incredibly transformational. To understand just how powerful it is, imagine a planet somewhere far away in the galaxy, home to an alien species early in its evolution. The species is capable only of speech; they have not yet learned how to read or write. They are functionally equivalent to our earliest ancestors, thousands of years ago. Now imagine we sent them a tiny device that encoded all human knowledge up to the current date, expressed in their language, including all the known properties and laws of physics. The device lands in the center of their civilization, speaks to them, and accepts questions. It guides them on how to create a writing system, how to build advanced shelters, how to improve their food production, and so on. What would this do to this civilization and their world? The knowledge contained in the device would transform their world: they would raise large structures, cover the land with advanced agriculture, and harness the power of their sun.
If our species were further along in its evolution, the device sent to the alien planet might do far more than transform a planet; it might alter their solar system or even their galaxy, depending on the depth of knowledge it contained. It could teach them how to build spaceships that travel vast distances, how to construct Dyson spheres, or how to build universal printers that turn free-floating hydrogen atoms into anything they required.
A single tiny device encoded with only knowledge could transform an entire galaxy. That’s the power of knowledge.
With the recent release of powerful LLMs to the general public, I’ve found myself thinking a lot about how knowledge is acquired and used in our world today. LLMs have enabled millions of people to search for information without vetting the source or building their own model of a topic from disparate sources, as they once had to do with books and traditional web search. The analogy that comes to mind is that we were once students in a library, researching topics of interest; now we simply ask the professor for the answer.
Why is this a problem? Shouldn’t getting to the answer by the shortest path possible be more useful and productive? For information, yes, it’s quite useful. Quickly determining whether a tornado is heading for your town or whether your favorite sports team is winning is a great use of this technology. For knowledge, that is, information that can be put to productive use, the picture is more nuanced because of the role knowledge plays in the creation of deep insights.
Deep insight is the creation of new, profound knowledge by combining existing knowledge in creative ways. It is behind every breakthrough in human history. Einstein’s discovery of relativity, Ford’s moving assembly line, and Satoshi Nakamoto’s Bitcoin whitepaper are all examples of new, profound knowledge that altered history. These insights are rare; most people create few in their lifetimes, and some never do. That is because creating a deep insight requires not only a certain amount of knowledge stored in memory, but also the creativity necessary to combine and deconstruct it into new knowledge. Creativity without knowledge is not enough; the most brilliant mathematician ever born will not transform the field of mathematics if they are illiterate and thousands of miles from civilization.
In contrast, shallow insights produce either no new knowledge (often unbeknownst to the creator) or knowledge that is inconsequential. People produce shallow insights all the time: young adults often believe they have discovered something novel, only to realize someone else has already stumbled upon their profound realization. More importantly, LLMs produce only shallow insights, and that is a consequence of how they work. An LLM is a very specific kind of artificial intelligence that predicts the next token from a sequence of previous tokens. LLMs do not have human-like creativity, yet that has not stopped many people from overselling their capabilities.
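To make that mechanism concrete, here is a minimal sketch of next-token prediction using a toy bigram model. This is a deliberately simplified stand-in, not how a real LLM is built; actual models use transformer networks with billions of parameters trained on vast corpora. But the generation loop is the same idea: every output token is drawn from a distribution learned from text that already exists.

```python
import random
from collections import defaultdict, Counter

# Toy illustration of next-token prediction (a bigram model, NOT a real LLM).
# Real LLMs use transformer networks, but the core loop is the same:
# choose the next token from a probability distribution learned from existing text.

corpus = "the device lands in the center of their civilization and speaks to them".split()

# Count how often each token follows each preceding token.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token(prev):
    """Sample the next token given the previous one, weighted by observed frequency."""
    tokens, weights = zip(*counts[prev].items())
    return random.choices(tokens, weights=weights)[0]

# Generate a short continuation, one token at a time.
token = "the"
generated = [token]
for _ in range(8):
    if token not in counts:
        break  # no continuation was ever observed for this token
    token = next_token(token)
    generated.append(token)

print(" ".join(generated))
```

Everything the loop can emit is a recombination of sequences it has already seen; nothing outside the training data can appear in the output.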
LLMs are indeed powerful and useful when working with existing knowledge. Their ability to summarize, aggregate, and query it is impressive. Consuming thousands of websites on the topic of migratory birds and producing a short summary paragraph is a great use case for LLMs. They are also quite good at iterating on, or creating variations of, existing knowledge. However, they cannot create new and profound knowledge, that is, deep insight. LLMs are not on the path to AGI, regardless of what biased CEOs proclaim.
Now that we understand that LLMs produce only shallow insights, why is it a problem to rely on them when they are in fact quite good at querying and varying existing knowledge? To answer this, we must remind ourselves what deep insight requires:
- existing knowledge in memory (preferably varied)
- creativity
A creative person without what I call minimum viable knowledge (MVK) stored in memory cannot produce deep insight. There simply isn’t enough material for creativity to build connections upon. This is why young children and destitute, illiterate geniuses don’t produce world-changing inventions and breakthroughs; they lack the knowledge. MVK is made up of the boring building blocks of knowledge that form more interesting clusters in our brains. Multiplication tables, basic brush handling, and the foundational equations of physics are all examples of the basic knowledge required to create insights in their respective fields.
Leaning on LLMs for our minimum viable knowledge means we can store vast amounts of knowledge externally and access it faster, but we lose the ability to create deep insights. LLMs remove the friction of the painful, tedious learning of fundamentals, but at great cost: without deep insights, we stagnate and our knowledge does not expand.
As a society, we must be aware of the dangers of relying too heavily on tools and agents that reduce our ability to acquire minimum viable knowledge. This is especially true for children and for adults in the early stages of their careers or studies. Over-reliance on nascent AI technology will erode our ability to produce deep insights, and until there is a breakthrough in artificial intelligence, we are the only known entities that can produce them.
Here’s a parting question:
Imagine that LLMs had existed in 1900 and been trained on all human knowledge up to that point in time. Would they have been able to conceive of Einstein’s relativity before he did?