Discovery and Research

This course is for product designers whose research gets nodded at and ignored. By the end you'll frame a real build decision, test its riskiest assumption with the cheapest honest method, and write a one-page memo that changes the plan instead of decorating it.

6 Modules
1 Capstone
~3h Per week
2–3 Weeks

Too much discovery work is pure theater, not a meaningful process. The interviews happen, the deck gets presented, and the team builds exactly what it was going to build anyway. When that happens, the research exists to make the decision feel safe, not to inform it. You've probably sat in that meeting. You may have run it.

This course is about discovery that informs design. It hangs on one question: would the plan change if the discovery came back with an answer that contradicts what you intended to build? If no result could change what you build, you're not doing research. In this course, you'll learn to run discovery that passes that test.

Module 00

Theater vs. Real Discovery

one test separates research from ritual

Why it matters

Teams spend weeks on research that changes nothing. The work looks rigorous and professional, but the roadmap stays exactly where it was. Spotting the difference between discovery that changes decisions and ritual that produces unused artifacts is the foundation for everything else in this course, because methods are useless if the answer doesn't matter. Your job is solving problems, not making artifacts.

Concepts

The load-bearing test. Before any research starts, ask yourself, "If the answer comes back opposite to what we expect, what changes?" If the honest answer is "nothing," the research is theater. This single question will save you more time than any method you learn after it.

Discovery serves a decision, not a phase. Discovery is the work of reducing risk on a specific decision someone is about to make, not a stage gate you pass through before the "real work" starts. If you can't name the decision, you can't do discovery; you can only collect material.

Theater signal

The research starts after the decision was made. The kickoff deck already shows the solution.

Theater signal

The word "validate" appears in the brief. Validation is asking reality to agree with you.

Theater signal

Nobody can name what a surprising result would change. The plan survives any answer.

Discovery signal

A named decision is waiting on the result, and the people making it know the research is running.

Discovery signal

The riskiest assumption is written down before anyone talks to a user.

Discovery signal

There's a stated result that would kill the idea, and the team agreed to it in advance.

Be honest about your own history. If you've run theater before, that's normal. Most organizational pressure points toward it: deadlines reward confirmation, and nobody gets promoted for killing a roadmap item. This course gives you the tools to push back with evidence instead of opinion.

Get a second read (with AI)

Take the last research plan or brief you wrote, paste it into your AI assistant of choice (I use Claude, for example), and prompt: "For each question in this plan, tell me what decision it informs and what the team would do differently if the answer came back opposite to expectations. Flag every question where the honest answer is nothing." The AI assistant has no stake in your roadmap, which makes it a useful first pass at the test. You make the final call on each flag.

Exercises

  • Pull up the most recent research effort you were part of and write down the decision it was supposed to inform, if you can find one
  • Apply the load-bearing test to each question in that effort and count how many pass
  • Write one sentence describing what an opposite result would have changed, or admit in writing that nothing would have changed

You've got it when…

You can look at any research question and say in one sentence what decision it serves and what changes if the answer surprises you.

Module 01

Framing: The Decision and the Riskiest Assumption

name the decision before you pick a method

Why it matters

Bad discovery starts with a method: "let's run some interviews." Good discovery starts with a decision and works backward. Framing is the discipline of naming the decision, listing the assumptions underneath it, and finding the one that kills the project if it's wrong.

Concepts

The decision statement. One sentence: "We are deciding whether to [build/change/kill] X, and we need to decide by [when], because [what's waiting on it]." If you can't write this sentence, stop. You don't have a discovery project. You have curiosity, and curiosity doesn't deserve two weeks of the team's time.

The assumption inventory. Every plan rests on beliefs nobody wrote down: people want this, people will find it, people will pay for it, the feature is technically feasible. List them all, the embarrassing ones especially. The assumption you're reluctant to write down is usually the one that matters.

Risk times evidence. Score each assumption on two axes: how bad it is if wrong, and how much real evidence you have. The assumption that's both high-risk and unevidenced goes first. Everything else waits. Most teams test the comfortable assumptions because they're easy to design studies for. Resist that.
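If you'd rather score in code than on paper, here's a minimal sketch. It assumes a 1-to-3 scale on both axes and a made-up priority rule (risk multiplied by how little evidence you have). Neither comes from a standard, so swap in whatever scale your team will actually use. The `Assumption` class and the example scores are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Assumption:
    text: str
    risk: int      # how bad it is if wrong: 1 (minor) to 3 (kills the project)
    evidence: int  # how much real evidence you have: 1 (none) to 3 (strong)

    @property
    def priority(self) -> int:
        # Assumed scoring rule: high risk and low evidence float to the top.
        return self.risk * (4 - self.evidence)


def riskiest(assumptions: list[Assumption]) -> Assumption:
    """Return the assumption to test first."""
    return max(assumptions, key=lambda a: a.priority)


# Example scores are invented for illustration.
inventory = [
    Assumption("People want this", risk=3, evidence=1),
    Assumption("People will find it", risk=2, evidence=2),
    Assumption("The feature is technically feasible", risk=3, evidence=3),
]

first = riskiest(inventory)
print(f"Test first: {first.text} (priority {first.priority})")
```

The score only orders the list. Which assumption is actually comfortable and which is embarrassing is still your call.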

From assumption to research gap to falsifiable question. Your research gap is simply what you don't know yet that's blocking the decision. "Do users like our onboarding?" can't fail. "Can a new user complete setup without help in under five minutes?" can. The rewrite from gap to falsifiable question is the core craft move of this module: it must name a behavior, a threshold, or a choice that reality can contradict.

This is the manual version of Klarita's Explore phase. I built klarita.app because I kept doing this exercise in scattered docs. You brain-dump your unstructured thinking, and it extracts the problem, the assumptions, and the open questions into cards you edit. Assumptions get rated Low, Medium, or High risk, and what you don't know yet becomes a Research Gap. Learn the moves by hand first so you know what the tool is doing for you.

Get a second read (with AI)

Paste your project plan or PRD and prompt: "List every assumption hiding in this plan that I haven't stated explicitly. Include assumptions about user behavior, willingness to pay, technical feasibility, and our own capacity. Rank them by how bad it is for the project if each one is wrong." AI tools are genuinely good at surfacing assumptions you've stopped seeing because you've lived with the plan too long. The ranking is yours to correct, since the AI doesn't know your business context.

Exercises

  • Write the decision statement for a project you're working on right now, including the deadline and what's waiting on it
  • List ten assumptions underneath that decision, including at least two you'd rather not say out loud
  • Score each on risk and evidence, and circle the one that's high-risk and unevidenced
  • Rewrite that assumption as a falsifiable question with a behavior or threshold in it

You've got it when…

You can state the decision, the riskiest assumption behind it, and the falsifiable question that would test it, in three sentences without notes.

Module 02

Choosing the Cheapest Honest Method

match the method to the assumption, not the calendar

Why it matters

Every method answers a different kind of question, and most teams pick the one they already know how to run. The skill here is matching the method to the assumption type, then choosing the cheapest version that could genuinely prove you wrong. Cheap and honest beats elaborate and flattering every time.

Concepts

Assumption type determines method. A desirability question ("will anyone want this") needs a behavioral signal like a fake door or a preorder, not an interview, because people are generous liars about hypothetical futures. A usability question needs a task test. A comprehension question ("do people even understand the problem the way we do") is where interviews earn their keep.

| Assumption type | Cheapest honest method | What would falsify it |
| --- | --- | --- |
| desirability | Fake door, landing page with signup, preorder. Anything that costs the user something real, even just an email. | Click-through or signup below the threshold you set before launching it. |
| usability | Task-based test with five people on a prototype. Watch, don't ask. | A majority can't complete the core task without help. |
| comprehension | Five interviews about past behavior in the problem space. | People describe the problem in terms that don't match your framing. |
| willingness to pay | A price on the fake door. A real "buy" button, even if it leads to a waitlist. | Interest collapses when money enters the picture. |
| feasibility | A timeboxed technical spike, built with Claude Code in an afternoon. | The spike can't hit the performance or integration bar within the box. |
| behavior at scale | Analytics on the existing product, if you have one. Logs over memories. | The data shows people doing the opposite of what they told you. |

State the test as a hypothesis. "We believe that [doing X] for [person] will achieve [outcome]. We'll know this is true when [signal]." Writing it in this shape forces you to name the signal before any data exists, and it gives the team one sentence to hold you to later.

Set the kill threshold before you collect anything. The kill threshold is the hypothesis read in the negative: the signal level at which you stop believing. Decide it in advance: "fewer than 15 of 200 visitors click" or "3 of 5 testers fail the task." Setting it afterward guarantees you'll rationalize whatever you got. Write the number down where the team can see it.
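One way to make the threshold hard to move is to write it as a frozen record before the test runs. The sketch below is hypothetical: the `KillThreshold` class, its field names, and the rule of withholding a verdict until the planned sample is in are my assumptions, not a standard. The numbers come from the "fewer than 15 of 200 visitors click" example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the numbers can't be edited after the fact
class KillThreshold:
    description: str
    min_successes: int  # fewer than this many successes kills the assumption
    sample_size: int    # planned sample, decided in advance

    def verdict(self, successes: int, observed: int) -> str:
        if observed < self.sample_size:
            # Assumed rule: no judgment until the planned sample is in.
            return "keep collecting"
        return "kill" if successes < self.min_successes else "survives"


# "Fewer than 15 of 200 visitors click" kills the assumption.
fake_door = KillThreshold(
    description="visitors who click the fake door",
    min_successes=15,
    sample_size=200,
)

print(fake_door.verdict(successes=9, observed=200))   # kill
print(fake_door.verdict(successes=22, observed=200))  # survives
print(fake_door.verdict(successes=5, observed=80))    # keep collecting
```

Code won't stop anyone from rationalizing, but a threshold committed somewhere visible is harder to quietly rewrite than one kept in your head.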

Sample size honesty. Five people is enough to find the big problems in a usability test. Five people is not enough to claim "users want X." Small samples give you direction, not certainty, and your write-up has to say which one you have.
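To see why five people can't carry a claim like "users want X," it helps to put an interval around the count. The sketch below uses the Wilson score interval, a common textbook choice for small-sample proportions. It's my addition, not a method this course prescribes, and it assumes a roughly random sample, which five hand-picked participants rarely are.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# 4 of 5 participants said yes: how wide is the plausible range?
low, high = wilson_interval(4, 5)
print(f"4 of 5: plausible true rate roughly {low:.0%} to {high:.0%}")
```

With 4 of 5, the interval runs from well under half to nearly everyone. That's direction, not certainty, and the write-up should say so.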

A method that can't prove you wrong is marketing aimed at your own team. If you can't describe the result that would kill the assumption, go back and pick a different method.

Get a second read (with AI)

Prompt: "Here is my riskiest assumption: [assumption]. Propose three ways to test it, ranked from cheapest to most expensive in hours. For each, state exactly what result would falsify the assumption." Watch your AI tool's first suggestion carefully. It will often propose a survey for a behavior question because surveys are easy to describe. Push back: "That measures what people say. Give me a method that measures what people do." Directing the AI past the lazy answer is the same skill as directing a junior researcher past one.

Exercises

  • Take the falsifiable question from Module 01 and pick its method from the table, with a one-sentence justification
  • Write your test as a hypothesis with a signal, then write the kill threshold as its negative, and share both with one other person so you can't quietly move them
  • Estimate the cost of the test in hours, and if it's more than a week of your time, design a cheaper version

You've got it when…

You can name your method, its cost in hours, and the exact result that would kill your assumption, before any data exists.

Module 03

Interviews That Don't Lead the Witness

people answer the question you ask, not the one you mean

Why it matters

Interviews are the default method and the easiest to corrupt. A leading question produces agreement, agreement feels like validation, and validation feels like progress. This module teaches you to ask questions that real behavior can answer and polite strangers can't fake their way through.

Concepts

Past behavior beats hypotheticals. "Would you use a tool that did X?" invites a kind lie. "Walk me through the last time you dealt with X" invites a memory, and memories contain the workarounds, the spreadsheets, and the abandoned attempts that tell you what the problem is actually worth to someone. If they've never dealt with X, that's your answer too.

The shapes of a leading question. Learn to recognize them in your own guide before a participant ever hears them:

Embedded assumption

"How frustrating is your current process?" assumes frustration. Ask "tell me about your current process" and let them supply the feeling.

Social pressure

"Most designers we talk to struggle with this. Do you?" makes disagreement awkward. Strip the preamble.

Solution-first

"Would a dashboard help here?" gets a yes because a dashboard hypothetically helps everything. Ask what they did last time instead.

Hypothetical future

"Would you pay for this?" costs nothing to agree to. Pricing questions belong in Module 02's behavioral methods, not in interviews.

The five-second silence. When someone finishes answering, wait. The first answer is the rehearsed one. The thing they add to fill the silence is usually the real one. This is the cheapest interviewing technique that exists and the hardest one to actually do.

Talk to real people from your real project. No recruited proxies who vaguely resemble your users, and no invented personas standing in for conversations you didn't have. If you can't find five real people affected by your decision, that itself is a finding worth writing down.

Get a second read (with AI)

Paste your interview guide and prompt: "Flag every leading question, every hypothetical, and every question a polite stranger would just agree with. Name the pattern each one falls into, then rewrite it to ask about past behavior." Then review the rewrite yourself. AI tools sometimes overcorrect into questions so neutral they're vague, and a vague question wastes a participant's time as surely as a leading one. After your sessions, paste a transcript and ask your AI tool to mark every moment where your question shaped the answer. That's a review of you, not the participant, and it stings usefully.

Exercises

  • Write a six-question interview guide for your falsifiable question, then run it through the four leading-question patterns yourself before asking Claude
  • Hold one real conversation with one real person from your project's audience, recording it with permission
  • Go through the transcript and mark every place you led, interrupted, or skipped the silence

You've got it when…

You can read one of your own interview transcripts and point to the exact moments where the question shaped the answer.

Module 04

Synthesis: Claims You Can Defend

a claim, its evidence, and how hard you looked for the opposite

Why it matters

Synthesis is where honest fieldwork quietly turns back into theater. You collect twenty pages of notes and pull out the five quotes that support what you wanted to build. Real synthesis produces claims with evidence counts and stated confidence, and it shows where you hunted for the opposite.

Concepts

The claim-evidence-confidence unit. Every finding takes the same shape: a claim, the evidence behind it with a count, and your confidence in it. "Setup confusion blocks adoption (4 of 5 participants stalled at the API key step, high confidence)" is defensible. "Users find onboarding confusing" is a vibe wearing a finding's clothes.

Count honestly. "Users want X" and "2 of 7 people mentioned X unprompted" describe the same data, and only one of them is true. Always carry the denominator. The moment your write-up drops the count, the reader inflates the claim, and so do you.
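If you keep findings in a script or notebook, you can make the denominator impossible to drop. This is a minimal sketch of the claim-evidence-confidence unit. The `Claim` class, its field names, and the three confidence labels are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    statement: str
    evidence: str
    count: int        # participants who showed the behavior
    denominator: int  # participants who could have shown it
    confidence: str   # assumed labels: "low", "medium", or "high"

    def __post_init__(self) -> None:
        if not 0 <= self.count <= self.denominator:
            raise ValueError("count must be between 0 and the denominator")
        if self.confidence not in {"low", "medium", "high"}:
            raise ValueError("confidence must be low, medium, or high")

    def render(self) -> str:
        # The denominator always travels with the claim.
        return (f"{self.statement} ({self.count} of {self.denominator} "
                f"{self.evidence}, {self.confidence} confidence)")


claim = Claim(
    statement="Setup confusion blocks adoption",
    evidence="participants stalled at the API key step",
    count=4,
    denominator=5,
    confidence="high",
)
print(claim.render())
```

Because `render()` is the only way to print a claim, there's no version of the finding that travels without its count.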

The disconfirming evidence hunt. Before you write a single finding, reread your notes looking only for evidence against your favorite conclusion. Set a timer for twenty minutes and do nothing else. What you find goes in the write-up next to the claim it weakens. A finding that survives this hunt is worth something. A finding that was never hunted is just a preference with quotes attached.

Observation before interpretation. "She opened a spreadsheet to track it manually" is an observation. "She doesn't trust our reporting" is an interpretation. Keep them in separate sentences so the reader, and future you, can re-interpret the observation when new evidence arrives.

Klarita hands off to you here. The app's scope ends at the committed direction and the hypothesis. Testing that hypothesis and synthesizing what comes back is work you do by hand, deliberately, and this module is the manual for it: a table with three columns for claim, evidence with its count, and confidence.

Get a second read (with AI)

Paste your raw notes or transcripts and prompt: "Before you summarize anything, list every piece of evidence in these notes that contradicts the conclusion I probably want, which is [your favorite conclusion]. Quote the source line for each." This is the single most valuable research use of your AI assistant: it has no stake in your roadmap and no career interest in the feature shipping. Then, separately, ask it to draft claims in claim-evidence-confidence form, and check each count against your own notes before you keep it.

Exercises

  • Write three claims from your own collected data, each with an evidence count and a confidence level
  • Run a twenty-minute disconfirming evidence hunt on your own notes and record what you find next to each claim
  • Downgrade at least one claim's confidence based on what the hunt turned up, and keep the original version visible so you can see the change

You've got it when…

You can state every claim with its evidence count and confidence level, and show the reader where you looked for the opposite.

Module 05

The Decision Memo

write for the decision, cut the method

Why it matters

Research dies in forty-slide decks that open with methodology. The people who needed the answer read for ninety seconds, nod, and go back to the plan they already had. The deliverable that survives is a one-page memo organized around the decision, with the method demoted to an appendix.

Concepts

The reader gives you ninety seconds. Treat that as the design constraint, not an insult to your readers. Your memo's first three lines have to carry the decision, the recommendation, and the strongest claim behind it. Everything after that is for the reader who chose to keep going.

The one-page structure. It's the same every time, which is the point. Familiar structure means the reader spends attention on the content:

  1. The decision. One sentence naming what's being decided and by when.
  2. The recommendation. One sentence. Build it, change it, kill it, or test more, and if test more, exactly what and at what cost.
  3. The claims. Three to five, each in claim-evidence-confidence form from Module 04.
  4. What would change our mind. The evidence that would reverse the recommendation. Including this line is what makes the memo trustworthy.
  5. Appendix. Method, participants, kill thresholds, raw counts. Present for the skeptic, invisible to everyone else.
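If you write memos often, the five parts can live in a small template so the order never drifts. This is a sketch only: the `DecisionMemo` class and its field names are hypothetical, and the bracketed strings are placeholders, not findings.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionMemo:
    decision: str
    recommendation: str
    claims: list[str]  # list the strongest claim first
    would_change_our_mind: str
    appendix: list[str] = field(default_factory=list)

    def render(self) -> str:
        if not 3 <= len(self.claims) <= 5:
            raise ValueError("the structure calls for three to five claims")
        lines = [
            f"Decision: {self.decision}",
            f"Recommendation: {self.recommendation}",
            # Claims follow immediately, so line three is the strongest claim.
            *[f"Claim {i}: {c}" for i, c in enumerate(self.claims, start=1)],
            f"What would change our mind: {self.would_change_our_mind}",
            "Appendix:",
            *[f"  - {item}" for item in self.appendix],
        ]
        return "\n".join(lines)


memo = DecisionMemo(
    decision="We are deciding whether to [build/change/kill] X by [when].",
    recommendation="[Build it, change it, kill it, or test more.]",
    claims=[
        "[strongest claim, with count and confidence]",
        "[second claim, with count and confidence]",
        "[third claim, with count and confidence]",
    ],
    would_change_our_mind="[evidence that would reverse the recommendation]",
    appendix=["Method", "Participants", "Kill thresholds", "Raw counts"],
)
print(memo.render())
```

The method has no field outside the appendix, so the template can't open with it.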

Method goes in the appendix because the method isn't the news. Opening with "we conducted five semi-structured interviews" tells the reader you're about to defend your process instead of informing their decision. The process matters only when someone challenges a claim, and the appendix is standing by for exactly that moment.

The memo is also your protection. When the team overrides your recommendation, and sometimes they will for reasons outside the research, the memo records what was known and when. Six months later that record is the difference between "the research missed it" and "we decided to accept that risk."

Get a second read (with AI)

Draft your memo yourself first, then paste it and prompt: "Cut everything in this memo that describes how the research was done and move it to an appendix. Keep only what changes the decision. Then tell me what a reader still can't act on after reading the first three lines." The second half of that prompt is the sharp edge. AI tools are good at spotting the question a memo raises but never answers, because they read as the outsider your real reader actually is.

Exercises

  • Write the one-page memo for your project using the five-part structure, including the "what would change our mind" line
  • Hand it to a real colleague or collaborator and ask them to tell you the decision and recommendation after ninety seconds
  • Revise based on what they couldn't answer, and note which section failed them

You've got it when…

You can hand someone your one-page memo and they can state the decision and the recommendation back to you without your help.

Capstone

Run Discovery on a Real Build Decision

everything connects

The brief

One decision, one week, one page

Take a build decision you are actually facing right now: a feature on your roadmap, a side project you're debating, a redesign someone is pushing for. Run the full discovery loop on it in one week. Frame the decision and the riskiest assumption, test it with the cheapest honest method, synthesize into claims you can defend, and deliver a one-page memo to the person who owns the decision. If you work alone, future you owns the decision, and the memo still gets written.

The process:

  1. Write the decision statement with a real deadline and what's waiting on it
  2. Build the assumption inventory with your AI assistant's help, then rank it yourself and circle the high-risk, unevidenced one
  3. Rewrite that assumption as a falsifiable question with a behavior or threshold in it
  4. Pick the cheapest honest method from the Module 02 table and write the kill threshold before collecting anything
  5. Run the test with real people or real behavioral signals, no proxies and no invented data
  6. Synthesize into three to five claims with evidence counts and confidence, after a timed disconfirming evidence hunt
  7. Write the one-page memo and deliver it to the decision owner before the week is out

You're done when the plan changed, or you can say exactly why it held. Either outcome is a pass. The only failing grade is a memo that got nodded at while the plan rolled on untouched, because that means somewhere in the loop you ran theater.

Ref

Reference: Discovery Vocabulary

every term in plain english
| Term | What it means |
| --- | --- |
| discovery theater | Research performed after the decision was made, to make the decision feel safe. Fails the load-bearing test. |
| load-bearing test | Would the plan change if the answer came back opposite? If nothing would change, it's theater. |
| decision statement | One sentence naming what's being decided, by when, and what's waiting on it. |
| assumption | A belief the plan depends on that nobody has written down or tested. |
| falsifiable question | A question reality can answer with no. Names a behavior, a threshold, or a choice. |
| research gap | What you don't know yet that the decision needs. The unknown you rewrite as a falsifiable question. |
| hypothesis | "We believe [doing X] for [person] will achieve [outcome]. We'll know this is true when [signal]." |
| kill threshold | The hypothesis read in the negative: the result, agreed before data collection, that kills the assumption. Set it after and you'll rationalize anything. |
| fake door | A signup, button, or price for something that doesn't exist yet, used to measure real demand cheaply. |
| spike | A timeboxed technical experiment to test feasibility. Throwaway code, real answer. |
| leading question | A question that contains its own answer. Embedded assumptions, social pressure, and solution-first phrasing are the common shapes. |
| hypothetical | A "would you" question. Costs nothing to agree with, so agreement means nothing. |
| disconfirming evidence | Evidence against your favorite conclusion. You hunt for it on purpose, on a timer, before writing findings. |
| claim | A finding stated with its evidence count and confidence level. "4 of 5 stalled at setup," not "users are confused." |
| denominator | The total count behind a claim. Drop it and the reader inflates the finding. |
| confidence | Your honest rating of how likely the claim holds. Stated in the write-up, not implied by tone. |
| decision memo | A one-page write-up organized around the decision: recommendation first, claims next, method in the appendix. |
| appendix | Where method, participants, and raw counts live. Present for the skeptic, invisible to everyone else. |