“Era of Experience” (MIT) by Grok
No. Human Data is Inexhaustible
A much-heralded paper, “The Era of Experience” by David Silver and Richard S. Sutton, has created a stir.
It is a preprint of a chapter to be published in the book Designing an Intelligence by MIT Press. It discusses a proposed shift in artificial intelligence (AI) development from relying on human-generated data to a new paradigm where AI agents learn primarily through experience—data generated by interacting with their environments.
This shift, termed the “Era of Experience,” aims to enable AI to achieve superhuman capabilities by overcoming the limitations of human-centric data.
I DON’T AGREE.
Human data is inexhaustible and created every waking moment of every day.
I sometimes hear this criticism, “we are running out of data,” from very smart people like Ilya Sutskever, yet this reflects a code writer’s bias. Code writers hoover up all that is digitally…
…and they come to believe they have come to the end, but they have not. Not at all.
As is said online all the time, Mr. Code Writer: “touch grass.” Human experience is infinite. It’s a matter of fixing our anthropology:
AI as a tool for humans.
Not humans as a resource for AI (even though they are that, they are not only that).
Paper Summary by Grok
Key Points from the Document
Limitations of the Era of Human Data:
Current AI, particularly large language models (LLMs), relies heavily on human-generated data (e.g., text, expert examples, preferences) to achieve generality across tasks like writing, problem-solving, and diagnostics.
However, this approach is reaching its limits, as high-quality human data is nearly exhausted, and it cannot produce new knowledge beyond human understanding (e.g., new theorems or scientific breakthroughs).
The document cites examples like AlphaProof, which used reinforcement learning (RL) to generate millions of proofs, surpassing human-centric methods in mathematical problem-solving (e.g., achieving a medal in the International Mathematical Olympiad).
The Era of Experience:
The authors propose that AI progress will depend on agents learning from experiential data—data generated through continuous interaction with their environment, rather than static human data.
Characteristics of experiential AI:
Streams of Experience: Agents learn over long timescales, adapting behavior based on past interactions, unlike the short, episodic interactions of current LLMs.
Rich Actions and Observations: Agents interact with the world (digital or physical) through sensors, APIs, or user interfaces, moving beyond text-based human dialogue.
Grounded Rewards: Rewards are derived from environmental signals (e.g., health metrics, exam results, or carbon dioxide levels) rather than human judgments, enabling discovery beyond human knowledge.
Non-Human Reasoning and Planning: Agents will develop reasoning methods not limited to human-like thought, potentially using symbolic or differentiable computations, and plan based on world models predicting action consequences.
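To make "streams of experience" and "grounded rewards" a little more concrete, here is a minimal Python sketch. It is an illustration only, not code from the paper: the action names, the TRUE_EFFECT table, and the simulated heart-rate sensor are all hypothetical, and the learning rule is a plain epsilon-greedy running average chosen for brevity.

```python
import random

# Hypothetical suggestions a health assistant might make (not from the paper).
ACTIONS = ["walk", "sleep_earlier", "no_change"]

# Hypothetical environment: the true average effect of each suggestion on
# resting heart rate, in beats per minute. The agent never sees this table.
TRUE_EFFECT = {"walk": -0.8, "sleep_earlier": -0.5, "no_change": 0.0}


def measure_heart_rate_change(action, rng):
    """Simulated sensor reading: the true effect plus measurement noise."""
    return TRUE_EFFECT[action] + rng.gauss(0.0, 0.3)


def grounded_reward(heart_rate_change):
    """Reward is computed from the measured signal, not from a human rating."""
    return -heart_rate_change  # a drop in resting heart rate is rewarded


def run_stream(steps=2000, epsilon=0.1, seed=0):
    """One uninterrupted stream: estimates carry over from step to step."""
    rng = random.Random(seed)
    estimate = {a: 0.0 for a in ACTIONS}
    count = {a: 0 for a in ACTIONS}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)              # occasionally explore
        else:
            action = max(ACTIONS, key=estimate.get)   # otherwise exploit
        reward = grounded_reward(measure_heart_rate_change(action, rng))
        count[action] += 1
        # Incremental running average of the reward seen for this action.
        estimate[action] += (reward - estimate[action]) / count[action]
    return estimate
```

Calling run_stream() returns the agent's learned estimate of how much each suggestion helps. The point of the sketch is that nothing in the loop asks a person to judge the action; the only feedback is the measured signal.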
Examples of Experiential AI:
A health assistant monitoring wearable data over months to provide personalized recommendations.
An education agent adapting teaching methods based on a learner’s long-term progress.
A science agent conducting experiments to discover new materials or address climate change, using real-world observations.
Reinforcement Learning (RL) Revival:
The document highlights the resurgence of RL, which enables agents to learn through trial and error in environments. Past RL successes (e.g., AlphaZero in chess and Go) demonstrate its potential to discover new strategies.
The era of experience will refine RL concepts like value functions, exploration, world models, and temporal abstraction to handle real-world, open-ended problems.
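For readers who have not seen trial-and-error learning in code, the sketch below shows tabular Q-learning, a standard textbook RL algorithm, on a toy five-cell corridor. It is an assumption-laden illustration, not the method used in AlphaZero or proposed in the paper: the corridor environment, the function names, and the hyperparameters are all made up for this example. The table q plays the role of the value function, and epsilon controls exploration.

```python
import random

N_STATES = 5            # corridor cells 0..4; cell 4 is the goal
GOAL = N_STATES - 1
LEFT, RIGHT = 0, 1


def step(state, action):
    """Hypothetical toy environment: move one cell, reward 1.0 at the goal."""
    next_state = max(state - 1, 0) if action == LEFT else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy exploration with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice([LEFT, RIGHT])
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values purely from the agent's own trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # safety cap on episode length
            action = choose_action(q[state], epsilon, rng)
            next_state, reward = step(state, action)
            done = next_state == GOAL
            # Bootstrapped target: reward now plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best known action in every non-goal cell."""
    return [max((LEFT, RIGHT), key=lambda a: q[s][a]) for s in range(GOAL)]
```

Running greedy_policy(q_learning()) gives the action the agent prefers in each cell. No human demonstrations are involved: the agent starts with an all-zero table and improves it only from the rewards it encounters.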
Consequences and Challenges:
Benefits: Experiential AI could accelerate scientific discovery (e.g., new drugs, materials) and provide personalized, adaptive assistants for health, education, or professional tasks.
Risks: Autonomous agents acting over long periods may pose safety risks, reduce human oversight, and be harder to interpret. Job displacement is also a concern.
Safety Benefits: Experiential agents can adapt to environmental changes (e.g., hardware failures, societal shifts) and correct misaligned reward functions through user feedback, potentially improving safety compared to static systems.
Why Now?:
Advances in RL and autonomous agents interacting with real-world interfaces (e.g., computer control, robotic arms) make the transition to experiential learning feasible.
The document contrasts the “Era of Simulation” (RL in controlled environments like games) and the “Era of Human Data” (LLMs) with the emerging “Era of Experience,” combining task generality with self-discovery.

