Large Concept Models and the path to AGI
Sep 2, 2025
I believe we will reach something close to AGI through the combination of large concept models
(LCMs), inference-time compute, better memory techniques such as the Hyena hierarchy, and
deterministic tool calling. Crucially, large concept models break away from the token-by-token
representation paradigm popularized by OpenAI's models and enable compute at a higher level
of abstraction. LCMs encode their knowledge in concepts such as 'research' or 'papers' instead
of the short, relatively arbitrary pseudo-word fragments used by mainstream LLMs. This approach
is closer to the way humans think, which helps with the challenge of model interpretability,
while still allowing the model to use superposition: the ability of basic units of compute,
whether tokens or concepts, to hold more than one meaning, as when concepts such as 'writing',
'research' and 'academia' combine into the concept of 'academic writing'.
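
To make the token-versus-concept distinction concrete, here is a minimal Python sketch. The `embed_concept` encoder and its hash-based arithmetic are hypothetical stand-ins I'm assuming purely for illustration, not any lab's actual LCM pipeline; the point is only that the basic unit of compute shifts from a subword token to a sentence-level concept vector.

```python
import numpy as np

DIM = 8

def embed_concept(sentence: str) -> np.ndarray:
    """Map a whole sentence to one 'concept' vector.

    Hypothetical toy encoder: words are hashed into a fixed-size
    vector so the script runs without a trained model.
    """
    vec = np.zeros(DIM)
    for word in sentence.lower().split():
        vec[hash(word) % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

document = "I write papers. I do research. Academia rewards writing."

# Token-level view: the unit of compute is a small word fragment,
# so the model takes many steps over this document.
tokens = document.lower().replace(".", " .").split()
print(f"token-level units: {len(tokens)}")

# Concept-level view: one vector per sentence, so each step of
# compute covers a whole idea.
concepts = [embed_concept(s) for s in document.split(".") if s.strip()]
print(f"concept-level units: {len(concepts)}")

# The superposition example from above: summing concept vectors can
# land near a composite concept (only roughly, with this toy encoder).
combined = embed_concept("writing") + embed_concept("academia")
combined /= np.linalg.norm(combined)
print(f"overlap with 'academic writing': {combined @ embed_concept('academic writing'):.2f}")
```

Note how the document collapses from a dozen token-level units to a handful of concept-level ones; each step of a concept model's compute then covers a whole idea rather than a word fragment.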
Inference-time compute is the term labs use for letting a model generate pseudo-answers that
the model itself critiques and builds upon, achieving more complex reasoning than simply
emitting one token after the previous one. It's the technique powering OpenAI's o1 and o3
models, as well as Google's Gemini 2.0 Flash Thinking. When this relatively new technique is
combined with large concept models, we will see complex reasoning yield analysis of a depth
not possible for 'vanilla' LLMs.
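
As a rough sketch of the idea, here is a toy generate-critique-refine loop in Python. The `generate`, `critique`, and `refine` functions are hypothetical stubs standing in for model calls; production systems such as o1 train this behaviour into the model rather than running an explicit outer loop, so this shows the shape of inference-time compute, not how any lab implements it.

```python
import random

def generate(question: str) -> str:
    """Draft a pseudo-answer (stub: picks a canned draft)."""
    drafts = [
        "The capital of France is Lyon.",
        "Paris is the capital of France.",
        "France's capital city is Paris, on the Seine.",
    ]
    return random.choice(drafts)

def critique(question: str, answer: str) -> float:
    """Score a draft from 0 to 1 (stub: a keyword check)."""
    return 1.0 if "Paris" in answer else 0.0

def refine(question: str, answer: str, score: float) -> str:
    """Revise a weak draft (stub: simply re-draft)."""
    return generate(question)

def answer_with_inference_time_compute(question: str, budget: int = 5) -> str:
    """Spend extra compute at inference: draft, critique, refine.

    Instead of committing to its first token-by-token answer, the
    model loops until its critic is satisfied or the budget runs out.
    """
    best_answer, best_score = "", -1.0
    draft = generate(question)
    for _ in range(budget):
        score = critique(question, draft)
        if score > best_score:
            best_answer, best_score = draft, score
        if best_score >= 1.0:  # critic satisfied; stop early
            break
        draft = refine(question, draft, score)
    return best_answer

print(answer_with_inference_time_compute("What is the capital of France?"))
```

The design point is the `budget` parameter: answer quality scales with how much compute you are willing to spend at inference time, independently of the model's weights.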