AI Tools Are Starting to Learn From Their Mistakes

AI tools have a forgetting problem. Every session starts from scratch. The agent that helped you draft a proposal last Tuesday has no memory of what worked, what you corrected, or what your client actually cared about. You get the same starting point every time, and the learning lives entirely in your head.

Anthropic announced something last week that starts to address this. They call it dreaming, and it's worth understanding even if you're not running AI agents in production yet.

What changed

The basic idea: Claude's managed agents now run a scheduled process that reviews past sessions, extracts patterns, and updates the agent's working knowledge between tasks. It's not a new model. The underlying intelligence doesn't change. What changes is that the agent accumulates experience with your specific workflow over time, the same way a new employee eventually stops needing to ask where everything is.
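To make the idea concrete, here is a minimal sketch of what a between-session review pass could look like. It is not Anthropic's implementation, whose internals aren't described here. Every name in it (Session, AgentMemory, consolidate, build_prompt, min_count) is hypothetical, and "extracting patterns" is reduced to counting corrections that show up more than once.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Session:
    """One finished agent session (hypothetical): the task and the user's corrections."""
    task: str
    corrections: list[str] = field(default_factory=list)


@dataclass
class AgentMemory:
    """Working knowledge carried between sessions (hypothetical structure)."""
    notes: list[str] = field(default_factory=list)


def consolidate(sessions: list[Session], memory: AgentMemory, min_count: int = 2) -> AgentMemory:
    """Review past sessions and keep corrections seen at least min_count times."""
    # Normalize lightly so the same correction typed two ways still matches.
    counts = Counter(c.strip().lower() for s in sessions for c in s.corrections)
    for correction, seen in counts.most_common():
        if seen >= min_count and correction not in memory.notes:
            memory.notes.append(correction)
    return memory


def build_prompt(task: str, memory: AgentMemory) -> str:
    """Open the next session with accumulated notes instead of a blank slate."""
    if not memory.notes:
        return task
    lessons = "\n".join(f"- {note}" for note in memory.notes)
    return f"Lessons from earlier sessions:\n{lessons}\n\nTask: {task}"
```

In this sketch, consolidate would run on a schedule between tasks, and build_prompt would be called at the start of each new session. The model itself never changes; only the notes it starts with do.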

Harvey, a legal AI company, implemented dreaming in production. Their task completion rates went up roughly 6x. That's not a benchmark number from a controlled test. That's a real production system doing more of what it's supposed to do, because it's been refining its approach across sessions instead of starting fresh each time.

Why this matters

The gap between a 6x improvement and what most businesses are actually getting from AI is worth sitting with. Most teams are running the same prompts they set up six months ago, adjusting manually when something doesn't work, and restarting from zero when the context window runs out. The efficiency gains are real, but they're flat: the same tool doing the same thing at roughly the same rate, week after week.

Dreaming suggests a different trajectory: AI tools that compound rather than cruise.

This changes how you should evaluate the tools you're already using. Most AI evaluations are snapshots. You try the tool, see what it produces on day one, and decide if it's worth keeping. That's reasonable when the tool's performance is static. It's a different calculation when the tool might be meaningfully better at your specific tasks in three months, because of what it's learned in the meantime.

A quick test

A practical question to ask about any AI tool you're seriously considering: does it retain context across sessions, or does it forget everything when you close the window? Not every tool needs persistent memory. A writing assistant that helps you polish a paragraph doesn't need to remember your last five sessions. An agent handling customer intake, research workflows, or operational tasks probably does. Those are the tools where improvements like dreaming are going to create real differentiation, and where the gap between tools will widen fast.

This also matters for teams still deciding which AI tools to commit to. Early adopters of tools with continuous improvement capabilities get a compounding head start. The tool learns your workflows, catches patterns in your specific data, and gets better at the exceptions your business generates. A team that starts in January and one that starts in September are not going to end up with the same tool by December, even if they're running the same product.

The bottom line

Anthropic's dreaming feature is currently in research preview, so most businesses won't have access to it in this exact form for a while. But the direction it signals is the more important thing. The next generation of AI tools is being built to get better through experience with your work, not just through model updates. The forgetting problem is starting to get solved.

Whether the tools you're running today are on that trajectory is a harder question than it looks. But it's probably more useful to ask right now than how well they perform on day one.
