Multiple Minds
December 15, 2025
I spent weeks convinced I had solved a theoretical problem.
The problem: how do firms change over time? This is what I study---strategy, organizations, the mechanics of corporate evolution. I’d been wrestling with whether firms change because they discover new information or because their preferences shift. Discovery versus preference change. Two engines of evolution that look identical from the outside and operate by completely different mechanisms.
I had pages of notes. Citations. A framework that reconciled the tension. I kept testing it against cases I knew---firms I’d studied, patterns from the literature---and it kept holding. The feeling was unmistakable: this works.
Then I stopped writing and asked myself the question I’d been avoiding.
What’s the strongest argument against this?
I could generate counterarguments---I’ve been trained to do exactly that. But they felt like sparring partners I’d hired. They went where I expected them to go. They pushed on the points I was already worried about. They didn’t surprise me.
I’d spent weeks building a case, and my brain had become the case. Even my attempts to argue against it were shaped by the same assumptions that built it. I needed friction I hadn’t anticipated.
I used to solve this through other people. Send the draft to a colleague. Call a friend who knows the domain. Wait for someone to say “but what about…” and suddenly the counterargument would crystallize.
But other people have schedules. They have their own problems. They read your draft three days later when you’ve already moved on. And often they’re too polite---they don’t give you the argument, they give you encouragement, which is kind but useless when what you need is friction.
Friction is expensive. It requires someone willing to disagree, knowledgeable enough to disagree well, and available at the moment you need the disagreement. That’s a narrow intersection.
So I started experimenting with simulating it.
The idea came from my Digital Twin setup---the system I described in The Dot Collector. I’d discovered that talking to AI could surface patterns in my own thinking. What I hadn’t tried was getting the AI to argue with itself.
The setup: instead of asking one machine “what do you think of this idea,” I structured multiple perspectives to engage the idea simultaneously. Three agents, each with a different stance. One represents my existing thinking---it reads what I’ve written, absorbs the assumptions I’ve made, and defends the position I’ve taken. The second is the critic. Its job is to find holes---where does the logic break, what evidence would falsify this? The third synthesizes. It watches the debate and asks: what’s actually being contested here? Is there a reformulation that captures what’s right in both positions?
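The structure is simple enough to sketch in code. This is a minimal illustration of the three-role debate, not the author's actual setup: the `llm` callable, the prompt texts, and the `debate` function are all my assumptions, standing in for whatever chat interface you have on hand.

```python
from typing import Callable

# Stand-in for any chat-completion call: takes a system prompt and a
# message, returns the model's reply. Swap in a real API client here.
LLM = Callable[[str, str], str]

# Hypothetical role prompts, paraphrasing the three stances described above.
ADVOCATE = ("You defend the author's framework as written. Absorb its "
            "assumptions, restate its strongest form, answer objections.")
CRITIC = ("You attack the framework. Find where the logic breaks and "
          "what evidence would falsify it. Do not be polite.")
SYNTHESIZER = ("You watch the debate. Name what is actually contested and "
               "propose a reformulation that captures what's right in both "
               "positions.")

def debate(draft: str, llm: LLM, rounds: int = 2) -> str:
    """Run critic/advocate exchanges over a draft, then synthesize."""
    transcript = f"DRAFT:\n{draft}"
    for _ in range(rounds):
        # The critic sees the full transcript so far and raises objections.
        objection = llm(CRITIC, transcript)
        transcript += f"\n\nCRITIC:\n{objection}"
        # The advocate responds to those objections in context.
        defense = llm(ADVOCATE, transcript)
        transcript += f"\n\nADVOCATE:\n{defense}"
    # The synthesizer reads the whole exchange and reframes.
    return llm(SYNTHESIZER, transcript)
```

The point of the accumulating transcript is that each agent reacts to the others' actual words rather than to a fresh copy of the draft, which is what makes the friction compound instead of resetting each round.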
The first time I ran this on my firm evolution framework, the critic immediately asked a question I hadn’t considered: “How do we distinguish, empirically, between discovery and preference change? Both result in the firm doing something different. What observable evidence would tell us which mechanism is operating?”
I’d been treating discovery and preference change as conceptually distinct---and they are---but I hadn’t worked out how you’d actually tell them apart in the world. The framework was internally consistent but empirically underdetermined.
My advocate tried to respond. It offered some possibilities---timing of change relative to information arrival, behavioral consistency before and after---but the response felt weak even coming from my own side. I could feel the framework wobbling.
The synthesizer reframed: “The distinction may matter less for prediction and more for explanation. We need both mechanisms in our theory, but knowing which one operated in a given case might only be possible retrospectively, with access to the firm’s internal deliberations.”
This was genuinely useful. The discovery/preference distinction is a tool for understanding, not for forecasting. That’s a limitation I hadn’t seen because I’d never had to defend the framework against someone pushing on exactly this point.
What makes this work isn’t that the machine is smarter than me. The individual agents don’t have better ideas than I do. But they have different positions, and the structure forces friction into existence.
When you’re thinking alone, you can slide past your own soft spots. The counterarguments you generate are shaped by the same blind spots that shaped the original argument. You’re using the same brain that built the original case. Multiple perspectives---even simulated ones---change the dynamics. The critic doesn’t know you spent weeks building the framework. It just looks for cracks. The synthesizer has one job: find the formulation that survives the debate.
Human reasoning evolved for social contexts---for responding to objections, updating when confronted with evidence, navigating the give-and-take of conversation. We’re less equipped to generate the objections ourselves. The social context is load-bearing. Remove it and something important breaks. The multi-agent debate puts that social structure back in, artificially. Claim, objection, response, synthesis. The rhythm of productive disagreement.
Here’s what I didn’t expect: the fakeness doesn’t matter. When the critic raises an objection I hadn’t considered, my brain responds the same way it does when a colleague raises one. The epistemic update happens regardless of the source. What matters is whether the objection has merit.
I use this for most of my theoretical work now.
The workflow: I write something, usually in a rush, following whatever thread is hot. Then, when the first draft is done and I’m sure it’s brilliant, I run it through the debate. Three perspectives. Genuine friction. The output isn’t a verdict---the machine doesn’t tell me if my idea is good---but it shows me where the pressure points are. Where I’ve been assuming something I shouldn’t assume.
Sometimes the framework survives intact. The critic swings and misses, and I learn that the objections I feared aren’t strong. Confidence earned through trial is different from confidence that’s never been tested.
More often, the framework shifts. The original insight remains, but its boundaries change. I find the careful formulation that survives the stress test, which is almost never the formulation I started with.
The weeks I spent weren’t wasted---the framework was a real insight---but they were incomplete. I needed the multiple minds to see what one mind couldn’t.
Thinking is more dialogical than we tend to admit. It happens in the exchange, in the friction, in the moment when someone says “but what about…” and your assumptions crack open. The question is where the friction comes from, and whether the source matters as much as the structure of the disagreement.
A machine playing devil’s advocate doesn’t have the accumulated knowledge of a human expert who’s spent thirty years in your field. The objections it raises are patterns from training data, not insights born of deep expertise. But the structure---the deliberate staging of disagreement---does most of the work. What you need is any pressure that forces you to respond to something outside your own head. The genius critic is better. The available critic is more useful.
The epistemic update happens regardless of the source.
If you're thinking about similar questions—or building systems that grapple with them—I'd welcome the conversation.