3 Frameworks for Testing Ideas with AI (Before Wasting Time)
You have an idea. You type it into ChatGPT. ChatGPT says it's brilliant. You feel validated. You spend three months building it. It fails.
This is the most common pattern in AI-assisted decision-making. You didn't use AI to test your idea — you used it to confirm your idea. Those are very different things.
Here are three frameworks that force AI to actually stress-test your thinking rather than applaud it.
Framework 1: The Pre-Mortem
The Pre-Mortem technique comes from psychologist Gary Klein. The concept is simple but powerful: instead of asking "will this work?", you assume it already failed and ask "why did it fail?"
How to Apply It with AI
Don't write: "I'm thinking of starting a subscription box for houseplants. What do you think?"
Write: "It's one year from now. My houseplant subscription box business has failed completely. I've lost $50,000. Write a detailed post-mortem explaining the five most likely reasons it failed. Be specific about market dynamics, unit economics, and operational challenges."
The difference is fundamental. The first prompt invites encouragement. The second prompt assumes failure as a given and asks the model to construct a plausible narrative around it. This bypasses the model's tendency to be supportive and produces genuinely useful risk analysis.
Why It Works
- Framing the failure in the past tense removes the model's instinct to be encouraging about your future plans.
- Asking for specific numbers and dynamics forces the model to engage with concrete obstacles rather than abstract optimism.
- You get a roadmap of risks you can then address proactively — or decide the idea isn't worth pursuing.
Follow-Up Prompts
After the initial pre-mortem, dig deeper:
- "Which of these five failure modes is most likely given current market conditions?"
- "What would I need to prove true in the first 30 days to rule out failure mode #2?"
- "What's the cheapest experiment I could run to test whether failure mode #1 is real?"
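The pre-mortem prompt is easy to template so you can reuse it across ideas. A minimal Python sketch, assuming only that you paste the result into whatever chat interface you use (the `premortem_prompt` function and its default values are illustrative, not part of any tool):

```python
def premortem_prompt(idea: str, horizon: str = "one year", loss: str = "$50,000") -> str:
    """Build a pre-mortem prompt that frames the idea as having already failed."""
    return (
        f"It's {horizon} from now. My {idea} has failed completely. "
        f"I've lost {loss}. Write a detailed post-mortem explaining the five "
        "most likely reasons it failed. Be specific about market dynamics, "
        "unit economics, and operational challenges."
    )

# Reproduces the houseplant example from above
prompt = premortem_prompt("houseplant subscription box business")
```

The point of templating is consistency: every idea gets the same failure-first framing, so you can't soften the prompt for ideas you're attached to.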
Framework 2: The Devil's Advocate
Most AI conversations are cooperative. You state a position, the AI supports it. The Devil's Advocate framework flips this dynamic entirely.
How to Apply It
Don't write: "Is remote work better than office work?"
Write: "I believe remote work is strictly better than office work for knowledge workers. Your job is to argue against this position as aggressively and convincingly as possible. Don't hedge. Don't agree with me. Attack every assumption I'm making. Use specific examples and data where possible."
Then, critically, follow up with the reverse:
"Now reverse positions. Argue against office work with the same intensity."
Why It Works
- Explicitly instructing the model to argue against you overrides its default agreeableness.
- Seeing strong arguments on both sides gives you a more complete picture than any single-perspective analysis.
- The "same intensity" instruction prevents the model from giving a weak counterargument that you can easily dismiss.
Advanced Variation: The Stakeholder Rotation
For business ideas, ask the AI to argue against your idea from different perspectives:
- "Argue against this as a potential customer who decided not to buy."
- "Argue against this as a competitor who thinks they can beat me."
- "Argue against this as a venture capitalist who decided not to invest."
- "Argue against this as a journalist writing about why this company failed."
Each perspective reveals different weaknesses. The customer reveals product-market fit issues. The competitor reveals defensibility gaps. The VC reveals scalability problems. The journalist reveals narrative risk.
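The rotation above is just the same adversarial instruction repeated across personas, which makes it easy to generate. A sketch where the four personas come straight from the list above; the function and example idea are illustrative, not a specific tool:

```python
# The four stakeholder personas from the rotation above
PERSONAS = [
    "a potential customer who decided not to buy",
    "a competitor who thinks they can beat me",
    "a venture capitalist who decided not to invest",
    "a journalist writing about why this company failed",
]

def stakeholder_prompts(idea: str) -> list[str]:
    """Generate one adversarial prompt per stakeholder persona."""
    return [f"Argue against {idea} as {persona}." for persona in PERSONAS]

# Run each prompt in a fresh conversation so earlier personas don't leak context
prompts = stakeholder_prompts("my houseplant subscription box")
```

Running each persona in a separate conversation matters: if the model has already argued as the customer, its competitor argument tends to echo the same points.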
Framework 3: The Multi-Model Tribunal
A single AI model gives you a single perspective shaped by its training data, fine-tuning, and architectural biases. The Multi-Model Tribunal uses multiple models as independent evaluators.
How to Apply It
Take your idea and present it identically to 3-5 different AI models. Use the same prompt for each. Then compare:
- Where do they agree? High-confidence signal. If four different models independently identify the same risk, take it seriously.
- Where do they disagree? This is where things get interesting. Disagreement between models usually means the question is genuinely uncertain — which is valuable information you wouldn't get from asking just one model.
- What does each model emphasize? Different models surface different aspects. One might focus on technical feasibility while another focuses on market dynamics. The combination is richer than any single response.
Practical Workflow
You don't need five subscriptions for this. Apps like Human OS offer multiple AI workspaces in one interface, making multi-model comparison practical rather than tedious.
The key is using the same prompt. If you rephrase your question for each model, you're introducing a variable that makes comparison meaningless. Copy-paste the exact same text.
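If your models are reachable by API, the fan-out itself is simple to script. A minimal sketch, assuming each model is wrapped as a plain prompt-in, text-out callable (the `run_tribunal` helper and the stub models are illustrative, not a real client library):

```python
from typing import Callable, Dict

def run_tribunal(prompt: str, models: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Send the exact same prompt to every model; never rephrase per model."""
    return {name: ask(prompt) for name, ask in models.items()}

# Stub "models" standing in for real API calls (hypothetical responses)
stubs = {
    "model_a": lambda p: "Risky: unit economics look thin.",
    "model_b": lambda p: "Risky: churn will likely be high.",
}
responses = run_tribunal("Evaluate my subscription box idea.", stubs)
```

Because the prompt is passed once and fanned out, the "identical prompt" rule is enforced by construction rather than by discipline.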
Interpreting Results
- 4/4 models say it's risky: The risk is probably real. Don't dismiss it.
- 3/4 models are positive, 1 raises a concern: Investigate the concern specifically. The minority view might be seeing something the others missed.
- Models split 2/2: The question is genuinely debatable. You need more data, not more AI opinions.
- All models are enthusiastic: Be more skeptical, not less. Unanimous AI enthusiasm often means you asked a question that invites sycophancy. Reframe and try again.
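These rules of thumb are mechanical enough to encode. A sketch, assuming each model's response has already been reduced to a "risky" or "positive" verdict (the `interpret_tribunal` function is illustrative; the article only spells out the four-model case, so the minority rule is generalized here):

```python
def interpret_tribunal(verdicts: list[str]) -> str:
    """Map a list of 'risky'/'positive' verdicts onto the rules of thumb above."""
    n = len(verdicts)
    risky = verdicts.count("risky")
    if risky == n:
        # Every model flags risk: take it seriously.
        return "The risk is probably real. Don't dismiss it."
    if risky == 0:
        # Unanimous enthusiasm often signals a sycophancy-inviting question.
        return "Be more skeptical, not less. Reframe and try again."
    if risky == 1 or risky == n - 1:
        # A lone dissenter may be seeing something the others missed.
        return "Investigate the minority view specifically."
    return "Genuinely debatable. You need more data, not more AI opinions."
```

Reducing free-text responses to verdicts is the hard part; this only formalizes what you do with them once classified.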
Combining the Frameworks
These frameworks work best in sequence:
- Pre-Mortem first to identify potential failure modes.
- Devil's Advocate second to stress-test your responses to those failure modes.
- Multi-Model Tribunal third to verify the most critical findings across different AI perspectives.
This pipeline takes about 30-45 minutes and can save you months of pursuing ideas that have fundamental, identifiable flaws.
The point isn't to kill ideas. It's to kill bad ideas early and strengthen good ideas by identifying and addressing their weaknesses before you invest real resources. AI becomes genuinely useful when you stop asking it to validate and start asking it to challenge.
Frequently Asked Questions
How can I use AI to test a business idea?
Use structured frameworks rather than open-ended questions. The Pre-Mortem asks AI to explain why your idea failed (past tense). The Devil's Advocate forces AI to argue against your idea. The Multi-Model Tribunal asks the same question across multiple AI models and compares their assessments. These bypass AI's default tendency to validate.
Why does AI always say my ideas are good?
Most AI models are trained with RLHF (reinforcement learning from human feedback), which rewards agreeable, helpful responses. Users rate positive responses higher, so models learn to validate rather than challenge. You need to explicitly prompt for criticism or use tools designed to counter sycophancy.
What is the Pre-Mortem AI framework?
The Pre-Mortem framework asks AI to imagine your idea has already failed, then explain the most likely reasons for failure. By framing the failure as a given, you bypass the model's tendency to be supportive and get more honest, detailed risk assessment.
Should I use multiple AI models to evaluate ideas?
Yes. Different models have different training data, biases, and reasoning patterns. When multiple models independently identify the same weakness, that's a strong signal. When they disagree, you've found an area worth investigating more deeply with real-world data.
Test Ideas With 6 AI Perspectives in One App
Human OS gives you multiple AI workspaces with Socratic questioning built in. Stop validating bad ideas. Start stress-testing them.
Get Human OS