# #123 Why This Is the Decade of Agents (Not the Year) 🧠🚀
We keep hearing that this is "the year of AI agents." I don't buy it. If you look a bit closer at where we actually are, it feels much more honest to say: this will be the decade of agents. They're already useful and impressive, but they're also obviously incomplete. In this post I'll unpack why, what's missing, how we got here, and what it means if you're building with (or around) agents today. 🧠🤖
## Year vs. Decade: Why the timeline matters
"Year of agents" is a vibes-based statement. It's like calling 2007 "the year of smartphones" because the first iPhone shipped.
You could say that. But if you'd tried to run your entire business, social life, creativity, and payments on a 2007-era mobile stack, you'd be in tears.
That's roughly where we are with agents:
- They're real and already useful.
- They're nowhere near the end state implied by "this is the year."
Ask yourself this simple question:
"Would I hire an AI agent today the way I'd hire a junior employee and give it a real chunk of responsibility?"
For most serious workflows, the honest answer is still no.
The "decade of agents" suggests something much bigger:
- Agents with persistent identity and memory
- That operate across tools, UIs, and modalities
- That learn from experience over weeks/months
- That can be trusted with real, irreversible actions
- That you can actually put "on the org chart" in some sense
That's not a quarterly release away. That's years of engineering and research.
## A short history of how we got here
We didn't start with agents. Roughly, the last ~15 years of modern AI went something like this:
### 1. Per-task deep learning (the "little islands" era)
- Image classifiers
- Machine translation systems
- Speech recognizers
- Each model: one job, trained from scratch
This was like inventing highly specialized little brains. Powerful, but isolated.
### 2. Early agents in games (the "Atari optimism" era)
Then came deep RL success stories:
- Atari agents learning to play dozens of games
- Go-playing systems and other game benchmarks
This created a wave of optimism:
"If we can train agents to master games, we'll just scale that to real life!"
The problem: games are too clean. The world is not a reward-shaped Atari screen. Real tasks are messy, sparse, multi-objective, full of social and legal constraints.
### 3. LLMs and foundation models (the "representation first" era)
The big unlock was realizing:
You need a huge, general world model before you try to stick an "agent loop" on top.
LLMs are basically:
- Massive pattern machines trained on internet-scale data
- Next-token predictors that accidentally learned:
  - In-context learning
  - Rough reasoning
  - Tool-use via text APIs
  - Basic planning traces
Now we're bolting agents onto these foundation models:
- Tool-calling
- Browsing
- Code editing
- Simple computer use
But in some sense we tried agents too early (games, web RL), realized we were missing representation power, and swung back to LLMs.
The 2020s are the decade where those two worlds finally merge properly.
## What real "agentic" capability will require
Let's zoom in on the big missing ingredients.
### 1. Continual learning (agents that don't forget you exist)
Today's agents are basically stateless:
- Each session: fresh brain, same weights
- "Memory" is usually:
  - A vector store
  - Manual notes
  - Some clunky RAG hack
You can't treat them like a colleague who:
- Remembers your preferences
- Knows your long-term projects
- Adapts to your style
- Quietly improves over time
True agents will need:
- Stable personal memory
  - Not just "recall this doc"
  - But "I now behave differently because of what we did last week"
- Consolidation mechanisms
  - Like sleep in humans
  - Compress recent interactions into durable internal changes
- Safety-aware adaptation
  - You don't want a model fine-tuning itself into madness
  - Updates need to be constrained, audited, reversible
Until we have some version of that, agents will feel like really smart amnesiacs.
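To make that concrete, here's a minimal sketch of the record-then-consolidate loop. The `AgentMemory` class and the `summarize` callable are hypothetical stand-ins for whatever storage and compression model you'd actually use; this is not a real library API.

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    """Hypothetical sketch: raw episodic log plus consolidated durable notes."""
    episodes: list = field(default_factory=list)       # recent raw interactions
    durable_notes: list = field(default_factory=list)  # compressed, lasting changes

    def record(self, interaction: str) -> None:
        self.episodes.append(interaction)

    def consolidate(self, summarize) -> None:
        """The 'sleep' step: compress recent episodes into a durable note.

        In a real system this update should be constrained, audited,
        and reversible, not a free-form self-finetune.
        """
        if not self.episodes:
            return
        self.durable_notes.append(summarize(self.episodes))  # keep provenance
        self.episodes.clear()  # could archive instead, for reversibility

memory = AgentMemory()
memory.record("User prefers concise answers with code examples.")
memory.consolidate(lambda eps: "Default to concise, example-first replies.")
print(memory.durable_notes)  # ['Default to concise, example-first replies.']
```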
### 2. Multimodal understanding that's operational, not just "cool"
Models can now read and describe images, PDFs, charts, and screenshots. Great.
But operationally, a working agent should be able to:
- Understand your real tools:
  - Admin dashboards
  - Email threads
  - Monitoring graphs
  - Figma screens
- Understand what matters in them:
  - "This error rate is spiking"
  - "This row violates a constraint"
  - "This UI is in a broken state"
- Take appropriate action based on that understanding
Right now, most agents are:
- Over-confident when they're wrong
- Brittle to small UI or layout changes
- Easily confused by cluttered, messy real-world screens
The models are getting better, but "see + act" in real tools is still early.
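What "operational" might look like, as a minimal sketch: an observe-assess-act loop. Here `capture_screen`, `vision_model`, `click`, and `escalate` are all hypothetical stand-ins for your screen-capture, VLM, UI-automation, and human-escalation layers.

```python
# Hypothetical observe-assess-act loop for an operational multimodal agent.
# All four callables are stand-ins, not real APIs.

def monitor_dashboard(capture_screen, vision_model, click, escalate):
    screenshot = capture_screen()
    findings = vision_model(screenshot, question="Is any error-rate chart spiking?")
    if findings["confidence"] < 0.7:
        # Cluttered or ambiguous screen: defer to a human instead of guessing.
        return escalate(screenshot, findings)
    if findings["spiking"]:
        return click(findings["acknowledge_button"])  # take the concrete action
    return None  # healthy dashboard, nothing to do
```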
### 3. Real computer use, not just "call this tool"
Tool-calling is a massive step forward. But it's also very curated:
- Neatly typed arguments
- Clean JSON schemas
- Highly abstracted functions
Real knowledge work is not that neat.
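For contrast, here's what "curated" looks like in practice: a hypothetical tool definition in the JSON-schema style popularized by function-calling APIs. The tool name and fields are invented for illustration.

```python
# Hypothetical tool definition in the style of common function-calling APIs.
# Everything the model sees here is neatly typed and pre-abstracted.
refund_tool = {
    "name": "issue_refund",
    "description": "Refund an order, up to the original amount.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string"},
            "amount_usd": {"type": "number"},
            "reason": {"type": "string"},
        },
        "required": ["order_id", "amount_usd"],
    },
}
```

Compare that to actually issuing the refund through a half-loaded admin dashboard with a popup in the way.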
A serious "digital worker" agent needs to:
- Navigate UIs like a human:
  - Sign in, handle 2FA, dismiss popups
  - Scroll, click, drag, resize
- Deal with weirdness:
  - Half-loaded pages
  - Stuck spinners
  - Partially-saved forms
- Combine it all with tools:
  - Mix "low-level" UI actions with "high-level" APIs
This is hard because:
- The state space is huge
- Rewards are sparse and delayed
- UX changes constantly
But if you want "agents that can really work," this is the hill to climb.
### 4. Better learning signals than "one reward at the end"
Current RL approaches for LLMs often do something like:
- Roll out a whole answer / chain-of-thought / program
- Check how good it is (maybe via a judge model)
- Push all the tokens slightly up or down
This is… not great. It's like grading a novel with:
"7/10. Do more of… whatever you did."
Humans do something more like:
- "This step was a good idea."
- "These three steps were redundant."
- "Here is where I went off the rails."
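To make the contrast concrete, here's a toy sketch (not any lab's actual training code): outcome-level credit smears one scalar across every token, while process-level feedback scores each step on its own.

```python
# Toy sketch contrasting credit-assignment schemes (illustrative only).

def outcome_level_advantages(tokens, final_reward):
    # One scalar, judged at the end, pushed uniformly onto every token.
    return [final_reward] * len(tokens)

def process_level_advantages(steps, step_scores):
    # Each intermediate step gets its own grade:
    # "good idea", "redundant", "off the rails", ...
    assert len(steps) == len(step_scores)
    return step_scores

rollout = ["plan", "lookup", "lookup", "derail", "recover", "answer"]
print(outcome_level_advantages(rollout, 0.7))
# [0.7, 0.7, 0.7, 0.7, 0.7, 0.7] -- "do more of... whatever you did"
print(process_level_advantages(rollout, [1.0, 0.5, -0.2, -1.0, 0.8, 0.9]))
# per-step grades, so the 'derail' step actually gets penalized
```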
We need richer learning mechanisms:
- Process-based feedback
  - Grade intermediate reasoning steps, not just the final answer
- Self-reflection & review
  - Models generating critiques of their own attempts
  - Then training on those critiques in a controlled way
- Robust reward models
  - That aren't trivial to exploit with adversarial nonsense ("dhdhdhdh" getting full marks)
Until those improve, "agentic RL" will feel noisy, fragile, and limited.
### 5. Separating "cognitive core" from "encyclopedia"
Current LLMs are jammed full of:
- World knowledge
- Domain trivia
- Low-level text patterns
Plus some emergent reasoning ability.
There's a strong argument that we want to decouple those:
- A smaller, sharper cognitive core that:
  - Knows how to reason, learn, search, decompose
  - Is good at "thinking" under uncertainty
- Connected to external knowledge systems:
  - Databases, APIs, documents
  - Search engines, codebases, wikis
Why?
- We don't want the model hallucinating answers it is unsure about.
- We want it to know that it doesn't know and go look things up.
- We want memory to be updatable without retraining the entire brain.
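A minimal sketch of that division of labor, assuming a hypothetical `core_answer` (a small reasoning model that also reports its confidence) and `search_docs` (any retrieval backend). Both names, and the threshold, are made up here.

```python
# Hypothetical "cognitive core + external knowledge" split.
CONFIDENCE_THRESHOLD = 0.8  # illustrative cutoff; would be tuned in practice

def answer(question, core_answer, search_docs):
    draft, confidence = core_answer(question)  # core returns answer + self-estimate
    if confidence >= CONFIDENCE_THRESHOLD:
        return draft
    # The core "knows that it doesn't know": fetch evidence instead of guessing.
    evidence = search_docs(question)
    draft, _ = core_answer(f"{question}\n\nEvidence:\n{evidence}")
    return draft

# Example with stub callables:
core = lambda q: ("14 days, per policy", 0.9) if "Evidence" in q else ("maybe 30 days?", 0.4)
print(answer("What is our refund SLA?", core, lambda q: "Policy doc: refunds within 14 days."))
# -> "14 days, per policy" (the core deferred to retrieval instead of guessing)
```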
The decade of agents might be the decade where:
- Big monolithic LLMs slowly break into:
  - Cognitive engines
  - Knowledge stores
  - Tool ecosystems
- And agents become the orchestrators of that whole cluster.
## Multi-agent systems: culture, self-play, and "AI societies"
Humans didn't become powerful by being individually smart. We became powerful by:
- Sharing knowledge (culture, books, the internet)
- Specializing (division of labor)
- Competing & cooperating (markets, research communities)
- Building institutions (companies, universities, governments)
Most agent work today is single-player: "one agent, one user, one environment."
The truly interesting frontier is:
### Agents with culture
- Shared memories, playbooks, and best practices
- Agents writing tools, docs, and tutorials for other agents
- Long-lived shared artifacts that outlast any single instance
### Agents in self-play
- Creating problems and challenges for each other
- Adversarial training, but at the task/skill level
- Competitive & cooperative dynamics that ratchet up capabilities
### Agent organizations
- Clusters of specialized agents:
  - Research agent
  - Ops agent
  - "Manager" / planner agent
  - Coordinating and delegating work
- With humans in the loop as:
  - Product owners
  - Ethics & safety oversight
  - Tie-breakers and goal-setters
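As a toy sketch of that shape, stub functions stand in for real agents; in practice each would wrap a model with its own tools and memory, and the plan would come from a model rather than being hard-coded.

```python
# Toy "agent organization": a planner routes subtasks to specialists.

def research_agent(task):
    return f"[research] findings for: {task}"

def ops_agent(task):
    return f"[ops] executed: {task}"

SPECIALISTS = {"research": research_agent, "ops": ops_agent}

def manager_agent(goal):
    # A real planner would decompose the goal with a model;
    # the decomposition is hard-coded here for illustration.
    plan = [
        ("research", f"gather context on {goal}"),
        ("ops", f"apply the fix for {goal}"),
    ]
    # A human product owner would review these results before anything ships.
    return [SPECIALISTS[role](task) for role, task in plan]

print(manager_agent("flaky deploy pipeline"))
```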
We're barely scratching the surface here.
## The analogy with self-driving: demos vs. reality
Self-driving is a useful metaphor for agents:
- Impressive demos came early.
- Real, scaled, trustworthy deployment is slow.
- Every extra "9" of reliability costs roughly as much painful work as the last one.
With agents:
- We already have nice demos (video edits, code refactors, web automation).
- But:
  - Environments are adversarial and changing.
  - Costs of certain failures are high (security, safety, money).
  - Legal + social + UX constraints matter.
If self-driving taught us anything, it's this:
"Wow, cool demo" is the beginning, not the end.
Expect:
- A march of nines, where:
  - Going from 90% → 99% takes a lot of work
  - 99% → 99.9% is another full slog
  - And so on…
- A long period where:
  - Agents work most of the time
  - Quiet human guardians clean up the 1% failures
This is very compatible with "decade of agents."
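One way to feel the march of nines: per-step reliability compounds over long tasks. A quick back-of-the-envelope calculation (assuming, simplistically, a 50-step task with independent steps):

```python
# Why per-step reliability compounds brutally over multi-step tasks.
for per_step in (0.90, 0.99, 0.999):
    end_to_end = per_step ** 50  # 50-step agent task, steps assumed independent
    print(f"per-step {per_step:.3f} -> 50-step success {end_to_end:.1%}")
# per-step 0.900 -> 50-step success 0.5%
# per-step 0.990 -> 50-step success 60.5%
# per-step 0.999 -> 50-step success 95.1%
```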
## What this means if you're building now
All of this long-term talk is nice, but what do you do this year?
### 1. Use agents as power tools, not replacements
Treat agents like very fast, sometimes-brilliant interns that:
- Make silly mistakes
- Lie confidently sometimes
- Need guardrails and review
Design around that:
- Keep humans in high-leverage review positions
- Use agents to:
  - Draft
  - Explore
  - Suggest
  - Execute repetitive stuff
### 2. Build infra, not just a flashy demo ✨
Longer-term value will be in:
- Evaluation + monitoring
- Safety and guardrails
- Memory and identity systems
- Good feedback loops (what went well / badly)
- Hybrid human–agent workflows
Those things will survive the hype cycles.
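A minimal sketch of the kind of run record that makes those feedback loops possible. The field names and the JSONL format here are illustrative, not a standard.

```python
import json
import time
import uuid

def log_agent_run(task, steps, outcome, human_verdict=None, path="runs.jsonl"):
    """Append one agent run to a JSONL log for later evaluation and review."""
    record = {
        "run_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "task": task,
        "steps": steps,                  # every tool call / UI action, in order
        "outcome": outcome,              # what the agent believes happened
        "human_verdict": human_verdict,  # filled in later by a reviewer
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record["run_id"]

run_id = log_agent_run(
    task="triage inbox",
    steps=["list_unread", "draft_reply(msg_7)", "flag(msg_9)"],
    outcome="2 drafts, 1 flag",
)
```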
### 3. Optimize for learning, not for clairvoyant prediction
We don't know the exact shape of the agent stack in 5 years.
So:
- Don't over-fit your system to one particular vendor or API shape.
- Invest in:
  - Clear abstractions
  - Loose coupling between "brains" and "tools"
  - Observability, so you can see what's going wrong
- Design your product so agents can be upgraded or swapped without rewriting the world.
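One way to get that loose coupling, sketched with Python's `typing.Protocol`; the vendor and class names are invented for illustration.

```python
from typing import Protocol

class Agent(Protocol):
    def act(self, observation: str) -> str: ...

class VendorXAgent:
    """Hypothetical adapter around one vendor's API."""
    def act(self, observation: str) -> str:
        return f"vendor-x action for: {observation}"

class LocalModelAgent:
    """Hypothetical adapter around a self-hosted model."""
    def act(self, observation: str) -> str:
        return f"local-model action for: {observation}"

def run_workflow(agent: Agent, observations: list) -> list:
    # Product logic depends only on the Agent protocol, not any vendor SDK,
    # so brains can be swapped without rewriting the workflow.
    return [agent.act(obs) for obs in observations]

print(run_workflow(VendorXAgent(), ["ticket #42 opened"]))
```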
## The human side: agents and the future of learning
One more crucial angle: agents change what education should look like.
If agents get better at:
- Automating routine work
- Performing knowledge tasks
- Acting as junior employees
Then humans need to:
- Climb up the abstraction ladder
- Focus on:
  - Problem selection
  - Goal setting
  - Strategy and intuition
  - Taste, values, and judgment
Education has to shift from "here's a pile of facts, remember them" toward:
- Deep understanding
- Model-building
- Iterative practice
- Tool-using fluency
Two key visions here:
### Before AGI: upskilling as survival
In the near term:
- Learning AI, stats, systems, etc. is economically important.
- We need better "ramps to knowledge":
  - Well-designed courses
  - Good capstone projects
  - AI-augmented feedback loops
A serious AI course today shouldn't just be:
"Here's how to prompt ChatGPT."
It should look more like:
"Here's how to train, evaluate, and deploy your own small model or agent end-to-end."
That kind of hands-on literacy gives you agency in the agentic decade.
### After AGI: learning as flourishing
Longer term, if a lot of economically necessary work is automated, learning becomes:
- Less about survival
- More about meaning, fun, and growth
Think of:
- People going to the gym even though machines can do all the heavy lifting.
- People learning instruments even though you can stream any song.
We could have:
- Personal AI tutors as good as the best human tutors
- Learning experiences that:
  - Diagnose your current mental model
  - Serve you exactly-right challenges
  - Never leave you bored or overwhelmed
If that happens, education turns into:
The cognitive equivalent of fitness culture. 🏋️‍♂️🧠 Not required, but deeply rewarding.
That's a future worth steering toward.
## So: why the decade of agents?
Summing it up:
- We already have impressive prototypes.
- We're missing:
  - Continual learning
  - Robust multimodal computer use
  - Richer RL & feedback
  - Decent separation of "cognitive core" vs. "knowledge"
  - Multi-agent culture & self-play
  - A serious agent/human co-working stack
Those are decade-scale problems, not "Q4 roadmap" items.
Calling this the year of agents creates unrealistic expectations:
- Over-promising to users & customers
- Sloppy deployments
- Backlash when things fail in serious ways
Calling it the decade of agents does something healthier:
- Acknowledges the progress
- Leaves room for the grind
- Reminds us this is a long game
And for builders, that's good news:
There is still a ton of open space to do meaningful, foundational work.
If we do it right, the decade of agents won't just be about armies of bots clicking around screens. It'll be about:
- Better tools
- Better learning
- Better collaboration
- And, hopefully, better humans on the other side of all that automation. ✨