Multiple Minds
December 15, 2025
I spent weeks convinced I had solved a theoretical problem.
The problem: how do firms change over time? This is what I study—strategy, organizations, the mechanics of corporate evolution. And I’d been wrestling with a specific puzzle about whether firms change because they discover new information or because their preferences shift. Discovery versus preference change. Two fundamentally different engines of evolution.
I had pages of notes. I had citations. I had what felt like a framework that finally reconciled the tension. I kept testing it against cases I knew—firms I’d studied, patterns from the literature—and it kept holding. The feeling was unmistakable: this works.
Then I stopped writing and asked myself the question I’d been avoiding.
What’s the strongest argument against this?
I could generate counterarguments—I’ve been trained to do exactly that. But the counterarguments I generated felt like sparring partners I’d hired. They went where I expected them to go. They pushed on the points I was already worried about. They didn’t surprise me.
I’d spent weeks building a case, and my brain had become the case. Even my attempts to argue against it were shaped by the same assumptions that built it. I needed friction I hadn’t anticipated.
My internal reasoning engine—the one that’s supposed to work through problems silently, weighing evidence, considering alternatives—runs hot but narrow. Once I’m on a track, I’m on a track. The momentum of my own thinking becomes the obstacle to better thinking.
I used to solve this through other people. Send the draft to a colleague. Call a friend who knows the domain. Wait for someone to say “but what about…” and suddenly the counterargument would crystallize. The external voice would shake something loose.
But other people have schedules. They have their own problems. They read your draft three days later when you’ve already moved on. And often they’re too polite—they don’t give you the argument, they give you encouragement, which is kind but useless when what you need is friction.
Friction is expensive. It requires someone willing to disagree, knowledgeable enough to disagree well, and available at the moment you need the disagreement. That’s a narrow intersection.
So I started experimenting with simulating it.
The idea came from my Digital Twin setup—the system I described in The Dot Collector. I’d already discovered that talking to AI could surface patterns in my own thinking. What I hadn’t tried was getting the AI to argue with itself.
The setup looks like this.
Instead of asking one machine “what do you think of this idea,” I set up a structure where multiple perspectives engage the idea simultaneously. Three agents, each with a different stance.
Agent One represents my existing thinking. It reads what I’ve already written, absorbs the assumptions I’ve already made, and defends the position I’ve taken. It’s my advocate—the voice that knows why I believe what I believe.
Agent Two is the devil’s advocate. Its job is to find the holes. Not to be contrarian for contrarian’s sake, but to genuinely stress-test the argument. What’s the strongest objection? Where does the logic break? What evidence would falsify this?
Agent Three synthesizes. It watches the debate between One and Two and asks: what’s actually being contested here? Are they arguing past each other? Is there a reformulation that captures what’s right in both positions?
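The three roles above can be sketched as a small orchestration loop. This is a minimal illustration, not the author's actual setup: `ask_model` is a placeholder for whatever LLM API you use, and the role prompts are hypothetical paraphrases of the stances described above.

```python
# Sketch of the advocate -> critic -> synthesizer debate. `ask_model` is a
# stub standing in for a real chat-completion call (any provider); the role
# prompts below are illustrative assumptions, not a tested configuration.

ROLES = {
    "advocate": (
        "Read the draft below and absorb its assumptions. "
        "Defend its central argument as strongly as you can."
    ),
    "critic": (
        "Stress-test the draft below. What is the strongest objection? "
        "Where does the logic break? What evidence would falsify it?"
    ),
    "synthesizer": (
        "Given the draft, the defense, and the objection, say what is "
        "actually contested and propose a formulation that survives both."
    ),
}

def ask_model(role_prompt: str, text: str) -> str:
    """Placeholder: swap in a real LLM API call here."""
    raise NotImplementedError

def run_debate(draft: str, ask=ask_model) -> dict:
    """One round of defense, objection, and synthesis over a draft."""
    defense = ask(ROLES["advocate"], draft)
    objection = ask(ROLES["critic"], draft)
    # Only the synthesizer sees the disagreement itself.
    synthesis = ask(
        ROLES["synthesizer"],
        f"DRAFT:\n{draft}\n\nDEFENSE:\n{defense}\n\nOBJECTION:\n{objection}",
    )
    return {"defense": defense, "objection": objection, "synthesis": synthesis}
```

The design choice worth noting: the advocate and critic each see only the draft, pushed toward opposite stances, while the synthesizer sees the full exchange. The structure of the disagreement, not any one agent's intelligence, is what generates the friction.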
The first time I ran this on my firm evolution framework, Agent Two—the critic—immediately asked a question I hadn’t considered: “How do we distinguish, empirically, between discovery and preference change? Both result in the firm doing something different. What observable evidence would tell us which mechanism is operating?”
I read the question and felt the jolt of realizing you’ve been fuzzy where you thought you were precise. I’d been treating discovery and preference change as conceptually distinct—and they are—but I hadn’t worked out how you’d actually tell them apart in the world. The framework was internally consistent but empirically underdetermined.
Agent One (representing my thinking) tried to respond. It offered some possibilities—timing of change relative to information arrival, behavioral consistency before and after—but the response felt weak even coming from my own advocate. I could feel the framework wobbling.
Agent Three stepped in and reframed the debate: “The distinction may matter less for prediction and more for explanation. We need both mechanisms in our theory, but knowing which one operated in a given case might only be possible retrospectively, with access to the firm’s internal deliberations.”
This was genuinely useful. It didn’t refute my framework; it revealed where it lived. The discovery/preference distinction is a tool for understanding, not for forecasting. That’s a limitation I hadn’t seen because I’d never had to defend the framework against someone pushing on exactly this point.
What makes this work isn’t that the machine is smarter than me. It isn’t. The individual agents don’t have better ideas than I do. But they have different positions, and the structure forces friction into existence.
When you’re thinking alone, you can slide past your own soft spots. The counterarguments you generate are shaped by the same blind spots that shaped the original argument. Even when you try to play devil’s advocate in your head, you’re using the same brain that built the original case. You need friction you didn’t see coming.
Multiple perspectives—even simulated ones—change the dynamics. The devil’s advocate doesn’t care about your emotional attachment to the framework. It doesn’t know you spent weeks building it. It just looks for the cracks. The synthesis agent has one job: find the formulation that survives the debate. The structure creates friction that wouldn’t exist otherwise.
I’m a PhD student. I’ve spent years training to think carefully about ideas. And here I am, outsourcing the stress-test to machines because my own cognitive architecture can’t handle it.
That’s the point. The architecture is the problem.
Human reasoning evolved for contexts very different from solo theoretical work. We’re good at thinking socially—at responding to objections, updating when confronted with evidence, navigating the give-and-take of conversation. We’re not as good at generating the objections ourselves. The social context is load-bearing. Remove it and something important breaks.
What the multi-agent debate does is put the social context back in, artificially. I’m not conversing with humans, but I’m conversing with something that creates the structure of intellectual exchange. Claim, objection, response, synthesis. The rhythm of productive disagreement.
Here’s what I didn’t expect: it doesn’t feel fake. Or rather, the fakeness doesn’t matter. When Agent Two raises an objection I hadn’t considered, my brain responds the same way it does when a colleague raises an objection I hadn’t considered. The epistemic update happens regardless of the source. What matters is whether the objection has merit, not whether it came from carbon or silicon.
I use this for most of my theoretical work now.
The workflow looks like this: I write something, usually in a rush, following whatever thread is hot. Then, when the first draft is done and I’m sure it’s brilliant, I run it through the debate. Three perspectives. Genuine friction. The output isn’t a verdict—the machine doesn’t tell me if my idea is good—but it shows me where the pressure points are. Where the argument needs reinforcement. Where I’ve been assuming something I shouldn’t assume.
Sometimes the framework survives intact. The devil’s advocate swings and misses, and I learn that the objections I feared aren’t actually that strong. That’s valuable too. Confidence earned through trial is different from confidence that’s never been tested.
More often, the framework shifts. The original insight remains, but its boundaries change. I learn where it applies and where it doesn’t. I find the careful formulation that survives the stress test, which is almost never the formulation I started with.
The weeks I spent weren’t wasted—the framework was a real insight—but they were incomplete. I needed the multiple minds to see what one mind couldn’t.
We tend to treat thought as private, interior, monological. The thinker and the thought, alone in a room. And sometimes that works. But more often than we acknowledge, thinking is dialogical. It happens in the exchange, in the friction, in the moment when someone says “but what about…” and your assumptions crack open.
You need other perspectives to think well. The question is where they come from—and whether the source matters as much as the structure of the disagreement. Historically, the answer was: other people. Find smart friends, join seminars, send drafts to colleagues.
But other people are scarce. Their time is finite. They’re not available at 2 AM when you’re stuck on a problem.
Simulated perspectives aren’t as good as the real thing. A machine playing devil’s advocate doesn’t have the accumulated knowledge of a human expert who’s spent thirty years in your field. The objections it raises are patterns from training data, not insights born of deep expertise.
But something is often better than nothing. And the structure—the deliberate staging of disagreement—does most of the work. You don’t need a genius critic. You need any critic that forces you to respond to something outside your own head.
If you're thinking about similar questions—or building systems that grapple with them—I'd welcome the conversation.