Beyond Words: Can AI Think in Concepts?
I’ve been thinking a lot about artificial intelligence lately, particularly about how we might push it beyond its current limitations. One of the most interesting developments I’ve come across is what researchers are calling Large Concept Models, or LCMs.
The idea is deceptively simple, yet potentially transformative: instead of having AI models work primarily with words, what if we could have them work with concepts directly? This isn’t just an incremental improvement - it’s a fundamentally different approach to how we think about artificial intelligence.
The Limitations of Large Language Models (LLMs)
Let me tell you what’s wrong with LLMs. Yes, they’re impressive. Yes, they power everything from chatbots to search engines. But they’re fundamentally limited in a way that most people don’t realize.
The core problem is that LLMs don’t actually understand anything. They’re pattern-matching machines that have gotten really good at stringing words together in ways that sound convincing. But there’s no real comprehension happening under the hood.
Here’s a way to think about it: When you read an essay, you don’t process it word by word like a computer. Your brain automatically extracts the key ideas and builds a mental model of how they fit together. LLMs can’t do this. They lack what I call “conceptual reasoning” - the ability to work with ideas rather than just words. And until we solve this problem, they’ll remain sophisticated mimics rather than true thinking machines.
Enter Large Concept Models (LCMs)
Here’s what’s interesting about LCMs: they work with concepts instead of words. This may seem obvious in retrospect, but it’s actually quite profound. When you think about something, you don’t think in words. You think in concepts. Words are just how we express those concepts to others.
I’ve noticed this myself when writing essays. The hard part isn’t finding the right words - it’s getting the ideas straight in your head. Once you have the concepts clear, the words tend to follow naturally. This is what LCMs are trying to replicate. They operate at the concept level, which is closer to how humans actually think.
What makes LCMs particularly clever is that they don’t completely abandon words. They use them as a foundation, building on top of existing tools like Sonar. This feels like the right approach - you want to preserve what works while pushing into new territory.
Sonar: A Universal Translator for Concepts
Here’s something interesting: Sonar takes sentences and turns them into vectors - essentially mathematical points in space that capture their meaning. What’s clever is that it works across more than 200 languages. Most AI systems are built primarily for English, but Sonar doesn’t care what language you use. It’s like having a universal concept dictionary.
I’ve been thinking about this a lot lately. The remarkable thing about Sonar isn’t just that it works with many languages - it’s that it reduces language itself to pure meaning. When you say “the sky is blue” in English or Japanese or Arabic, you’re pointing to the same concept. Sonar captures that concept directly. This seems obvious once you think about it, but it’s actually quite profound.
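To make the idea concrete, here’s a tiny sketch in Python. The `embed` function below is a stand-in I wrote so the snippet runs on its own - the real encoder is Sonar’s trained multilingual model, and this toy version won’t actually show the cross-lingual behavior - but the property we care about is the same: sentences with the same meaning, in any language, should land near the same point in the vector space.

```python
import numpy as np

def embed(sentence: str, dim: int = 64) -> np.ndarray:
    """Toy stand-in for a Sonar-style sentence encoder.
    The real encoder is a trained neural network; this one just hashes
    character trigrams so the example is self-contained and runnable."""
    vec = np.zeros(dim)
    for i in range(len(sentence) - 2):
        vec[hash(sentence[i:i + 3]) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b)  # cosine similarity, since the vectors are unit-length

# With a real concept encoder, these two sentences - same meaning, different
# languages - map to nearly the same vector, so this score would be close to 1.0.
# The toy hash above won't show that; it only illustrates the shape of the computation.
print(similarity(embed("The sky is blue."), embed("Le ciel est bleu.")))
```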
How Do LCMs Work?
The way LCMs work is actually pretty simple, though not in the way most people would expect. When you feed an LCM some text, it first breaks everything down into individual sentences. Then something interesting happens: each sentence gets converted into what’s essentially a mathematical point in space, using Sonar.
This conversion step is the key insight. Instead of juggling words, which is what most AI does, the LCM is now working with pure meaning. It’s like the difference between trying to understand a painting by looking at individual brush strokes versus stepping back to see the whole image.
Once everything’s converted, the LCM works directly on these mathematical points, predicting new ones. And here’s the clever part: these new points can be turned back into sentences in any language Sonar supports. It doesn’t matter if it’s English, Chinese, or a language the LCM itself never saw during training - as long as the point represents a coherent concept, it works.
What makes this particularly interesting is that the model isn’t really doing translation in the traditional sense. It’s operating at a more fundamental level, dealing with raw concepts rather than words. This turns out to be a surprisingly powerful approach.
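Put end to end, the loop looks something like the sketch below. Every piece here - the sentence splitter, `encode`, `predict_next_concept`, `decode` - is a placeholder of my own, not Sonar’s or the LCM’s actual API; the point is just to show where the concept space sits in the pipeline.

```python
import re
import numpy as np

DIM = 64  # size of the concept space (illustrative; the real one is larger)

def split_into_sentences(text: str) -> list[str]:
    # Crude sentence splitter - good enough for a sketch.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def encode(sentence: str) -> np.ndarray:
    # Placeholder for the Sonar encoder: sentence -> concept vector.
    rng = np.random.default_rng(abs(hash(sentence)) % (2**32))
    v = rng.standard_normal(DIM)
    return v / np.linalg.norm(v)

def predict_next_concept(concepts: list[np.ndarray]) -> np.ndarray:
    # Placeholder for the LCM itself: given a sequence of concept vectors,
    # produce the next one. Here we just average the context.
    v = np.mean(concepts, axis=0)
    return v / np.linalg.norm(v)

def decode(concept: np.ndarray, lang: str = "eng") -> str:
    # Placeholder for the Sonar decoder: concept vector -> sentence,
    # in whatever language you ask for.
    return f"<generated sentence in {lang}>"

text = "The sky is blue. Blue light scatters more than red light."
concepts = [encode(s) for s in split_into_sentences(text)]
next_concept = predict_next_concept(concepts)
print(decode(next_concept, lang="fra"))  # same concept, French surface form
```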
Building and Training LCMs
I’ve been looking at how researchers are building these LCMs. It’s fascinating stuff. The simplest approach, what they call the Base LCM, just tries to predict what concept comes next in a sequence. Think of it like trying to guess the next word in a sentence, except with concepts instead of words.
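As I understand it, that prediction is trained as a regression problem: given the concept vectors so far, output the next one and penalize the distance to the real thing. Here’s a minimal PyTorch sketch of that objective - the tiny transformer and the dimensions are my own placeholders, not the actual architecture.

```python
import torch
import torch.nn as nn

DIM = 256  # size of a concept vector (illustrative)

class TinyBaseLCM(nn.Module):
    """A toy next-concept predictor: a small causal transformer over
    sentence embeddings, trained by regressing the next concept vector."""
    def __init__(self, dim: int = DIM, heads: int = 4, layers: int = 2):
        super().__init__()
        block = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(block, num_layers=layers)
        self.head = nn.Linear(dim, dim)

    def forward(self, concepts: torch.Tensor) -> torch.Tensor:
        # concepts: (batch, num_sentences, dim)
        n = concepts.size(1)
        causal_mask = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
        hidden = self.backbone(concepts, mask=causal_mask)
        return self.head(hidden)  # predicted next concept at each position

model = TinyBaseLCM()
docs = torch.randn(8, 16, DIM)   # 8 documents, 16 sentence embeddings each
pred = model(docs[:, :-1])       # predict from every prefix
target = docs[:, 1:]             # the concepts that actually came next
loss = nn.functional.mse_loss(pred, target)
loss.backward()
```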
Then there’s this clever hack where they borrow techniques from image generation. The diffusion-based approach lets the model capture subtle variations in meaning that the simpler approach might miss. It reminds me of how painters work with layers of color to achieve just the right effect.
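As far as I can tell, the diffusion variant changes the training objective rather than the overall setup: the model learns to recover the next concept from a noised version of it, conditioned on the concepts that came before. A heavily simplified sketch of that idea - the noise schedule and the conditioning here are stand-ins, not the paper’s:

```python
import torch
import torch.nn as nn

DIM = 256

# A denoiser that sees (noised next concept, context summary, noise level)
# and tries to reconstruct the clean next concept.
denoiser = nn.Sequential(
    nn.Linear(DIM * 2 + 1, 512), nn.GELU(), nn.Linear(512, DIM)
)

context = torch.randn(8, DIM)     # summary of the preceding concepts (placeholder)
clean_next = torch.randn(8, DIM)  # the true next concept vectors

t = torch.rand(8, 1)                       # random noise level per example
noise = torch.randn_like(clean_next)
noised = (1 - t) * clean_next + t * noise  # simple linear noise schedule (illustrative)

pred_clean = denoiser(torch.cat([noised, context, t], dim=-1))
loss = nn.functional.mse_loss(pred_clean, clean_next)
loss.backward()
```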
The third approach, which they call quantization, is particularly interesting. Instead of working with continuous concepts that can smoothly blend into each other, they break things down into distinct chunks. It’s like the difference between analog and digital. Sometimes having clear boundaries makes things easier to work with.
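Quantization is easier to picture with a tiny example: keep a fixed codebook of “allowed” concept vectors and snap every continuous vector to its nearest entry, so the model can predict a discrete code instead of a point in continuous space. A minimal sketch (the codebook here is random; in practice it would be learned):

```python
import torch

DIM, CODEBOOK_SIZE = 256, 1024

codebook = torch.randn(CODEBOOK_SIZE, DIM)  # learned in practice; random here

def quantize(concept: torch.Tensor) -> tuple[int, torch.Tensor]:
    # Snap a continuous concept vector to its nearest codebook entry.
    dists = torch.cdist(concept.unsqueeze(0), codebook).squeeze(0)
    idx = int(torch.argmin(dists))
    return idx, codebook[idx]

concept = torch.randn(DIM)
code, snapped = quantize(concept)
print(code)  # a discrete token the model can predict, like a word ID
```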
But here’s the thing that really caught my attention: these models are incredibly sensitive. Change one tiny thing in the concept space, and the meaning can shift dramatically when you convert it back to text. It’s like trying to balance a pencil on its point - the slightest movement and everything changes.
Despite this fragility (or maybe because of it - sometimes constraints lead to interesting solutions), they’ve managed to scale one of these models up to 7 billion parameters. And it works surprisingly well, especially at summarizing things. That’s pretty remarkable when you think about it.
The Future of LCMs: Planning and Beyond
Here’s something interesting: the researchers want to add explicit planning capabilities. This matters because generating long-form text isn’t just about stringing concepts together - you need a strategy.
I’ve been thinking about this problem a lot. The proposed Large Planning Concept Model (LPCM) feels like the right direction. It’s obvious in retrospect, but planning is what separates random musings from coherent essays. The best writers don’t just write - they plan.
Evaluation and Performance
The evaluation was pretty straightforward. They looked at three things:
- Could the model predict what comes next?
- Did the output make sense?
- Was it actually generating new text instead of copying?

The results were interesting. The diffusion-based models, particularly the two-tower variant, worked best. This wasn’t too surprising - diffusion has proven to be a powerful technique in other domains.
What caught my attention was the 7B-parameter model’s performance on summarization. It didn’t just copy chunks of text like most models do. Instead, it actually rewrote things in its own words. This is closer to how humans summarize - we don’t just highlight and copy-paste, we process and rephrase.
The model struggled a bit with expanding summaries into longer texts. This makes sense when you think about it. Compression is the easier direction: everything you need is already in the source, and the job is mostly to throw things away. Expansion forces the model to invent detail that isn’t in the summary, and there are countless ways to do that badly.
Perhaps the most surprising result was how well it handled different languages. Despite being trained only on English, it outperformed specialized multilingual models on most of the 45 languages tested. This suggests it’s learning something fundamental about how concepts fit together, rather than just memorizing patterns in English.
Efficiency and the Fragility of Sonar
Here’s something counterintuitive: LCMs are actually more efficient than traditional language models. Not because they’re doing less work, but because they operate on a more compressed representation. When you work with concepts instead of raw text, you naturally end up with shorter sequences. This means the attention mechanism - the computationally expensive part of these models - has less work to do.
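Back-of-the-envelope numbers make the point (these are illustrative, not measurements): a 2,000-token document might collapse to around 100 sentences, and since attention cost grows with the square of sequence length, that works out to roughly a 400x reduction in pairwise interactions.

```python
# Rough illustration of why shorter sequences help (numbers are made up).
tokens_per_doc = 2000        # token-level sequence length an LLM would see
sentences_per_doc = 100      # concept-level sequence length an LCM would see

llm_attention_pairs = tokens_per_doc ** 2       # 4,000,000
lcm_attention_pairs = sentences_per_doc ** 2    # 10,000

print(llm_attention_pairs / lcm_attention_pairs)  # 400.0
```

In practice the savings are smaller than that - each concept vector is bigger than a token embedding, and the Sonar encoder and decoder add their own cost - but the quadratic term is where the win comes from.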
But there’s a catch, and it’s a big one. The Sonar space where these concepts live is incredibly fragile. Make a tiny change to one of these concept vectors, and when you convert it back to text, you might get something completely different. It’s like trying to balance a house of cards - one wrong move and the whole thing falls apart. This gets especially hairy when you’re dealing with complex technical ideas.
Real-World Applications and Limitations
I’ve been playing around with these models, and they’re surprisingly good at summarization. Feed them a long document, and they’ll distill it down to its essence. But ask them to go the other way - to expand a summary into a longer piece - and things get messy. They start repeating themselves, like a nervous speaker who’s run out of things to say.
Limitations:
- The Sonar space is about as stable as a teenage relationship
- These models are stuck thinking one sentence at a time
What’s Next:
- Making the concept space less brittle
- Finding better ways to represent meaning
- Teaching these models to think in paragraphs and chapters, not just sentences
Conclusion
What’s interesting about LCMs isn’t just that they’re a step forward in AI. It’s that they represent a fundamental shift in how we approach machine intelligence. Instead of teaching computers to manipulate symbols, we’re teaching them to work with ideas. This feels important.
The challenges are significant, of course. The concept space is unstable, and the models still think too linearly. But that’s how ambitious projects usually start - with promising but flawed implementations that hint at something bigger.
I suspect we’ll look back at LCMs as one of those ideas that seemed obvious in retrospect. Of course AI systems should work with concepts rather than just words. The surprising thing is that it took us this long to figure out how to do it.

About Sharad Jain
Sharad Jain is an AI Engineer and Data Scientist specializing in enterprise-scale generative AI and NLP. Currently leading AI initiatives at Autoscreen.ai, he has developed ACRUE frameworks and optimized LLM performance at scale. Previously at Meta, Autodesk, and WithJoy.com, he brings extensive experience in machine learning, data analytics, and building scalable AI systems. He holds an MS in Business Analytics from UC Davis.