Smoke Tests with Playwright
For designers and builders who ship prototypes and small apps. Learn to write a handful of tests that catch the embarrassing breaks — the blank page, the dead login, the button that does nothing — before your users do.
A smoke test answers one question: did the thing turn on? The name comes from hardware. You plug in a new board, flip the switch, and if no smoke comes out, you keep going. You're not doing a full feature review — you're confirming the app loads, the core paths work, and nothing is obviously broken.
When you build with Claude Code, you ship fast and change a lot, and a small edit in one place can quietly break a page somewhere else. A smoke suite is what lets you keep moving without something slipping through unnoticed. By the end of this course you'll have a real suite running on every push.
What a Smoke Test Actually Is
Think of a smoke test like the design review where you only ask one question: can the user complete the core task? You're not critiquing spacing or copy; you're confirming the flow is alive. That mindset, applied to code, is what keeps a fast-moving project from shipping a broken homepage.
Smoke tests are wide and shallow. They touch many parts of the app but go only one layer deep. A full test suite might check that a form rejects a malformed email, validates the password length, and shows the right error color. A smoke test just checks that the login page loads and a valid login lands you on the dashboard.
They run fast and run often. A good smoke suite finishes in under a minute, which is the whole point. You want to run it after every meaningful change, so it has to be cheap enough that you never have to think twice about starting it.
Smoke test: A few checks that confirm the app is alive and core paths work.
E2E test: End-to-end. Drives a real browser like a user would.
Unit test: Checks one small function in isolation. Not our focus here.
Regression: When something that used to work quietly breaks again.
The rule of thumb. If you would feel embarrassed shipping a build where this broke, it belongs in your smoke suite. If it is a nice-to-have edge case, it does not.
Before you write a single test, have Claude Code map your app's critical paths. Open your project and prompt: "List the three to five user flows in this app that would be most embarrassing if they broke. For each, describe the steps a user takes from landing to success." You are the one who decides what counts as critical. Claude is good at spotting the flows you forgot you had.
- Pick one app you have built. Write down its three most critical user flows on paper, no code yet.
- For each flow, name the single screen or message that proves it worked (e.g. "the dashboard greeting appears").
- Cross off any flow that is a nice-to-have. You should be left with two or three that truly matter.
You can name the two or three flows in your app that must never break, and for each one you can state the visible proof that it worked.
Install and Your First Run
The hardest part of testing is starting. Playwright removes most of that friction with one install command that sets up the test runner, the browsers, and a sample test you can run immediately. Getting a green checkmark in the first ten minutes builds the momentum that carries you through the rest.
One command installs everything. Run the init command in your project folder. It adds Playwright, downloads the browsers it drives, and scaffolds a config file plus an example test. You answer a couple of prompts and you are done.
# run inside your project folder
npm init playwright@latest
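# it asks a few questions. Accept the defaults, except say yes to GitHub Actions:
# TypeScript? yes. tests folder? tests (e2e is offered if a tests folder already exists). GitHub Actions? yes (the default is no).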
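The folders it creates. You get a tests/ or e2e/ folder for your test files, a playwright.config.ts for settings, and an example spec. The config is where you set your base URL and which browsers to run. The generated file leaves the baseURL line commented out, so set it to your app's address before you use relative paths like page.goto('/'); the other defaults are fine for now.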
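To make the config less abstract, here is a minimal sketch of a playwright.config.ts with a base URL and a single browser. The folder name ./tests and the address http://localhost:3000 are assumptions; swap in whatever your project actually uses. The file the installer generates contains more options than this.

```typescript
// playwright.config.ts (a trimmed sketch, not the full generated file)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  // Where your spec files live (assumed: the default tests folder)
  testDir: './tests',
  use: {
    // Assumed local dev address. With this set, page.goto('/') opens
    // http://localhost:3000/ instead of failing on a relative URL.
    baseURL: 'http://localhost:3000',
  },
  // One browser keeps a smoke suite fast; add more projects later if you need them
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
  ],
});
```

With baseURL in place, every test can use short relative paths, and pointing the suite at a different environment later means changing one line.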
Run the suite. Use the test command to run headless (no visible browser, fast) or add a flag to watch it happen in a real window.
# run all tests, no visible browser
npx playwright test
# watch it drive a real browser
npx playwright test --headed
# open the interactive UI mode — best for learning
npx playwright test --ui
Use UI mode while you learn. The --ui flag opens a panel where you watch each step run, hover over actions, and see exactly what the page looked like at every moment. It is the single best way to understand what your test is doing.
Let Claude Code do the install and explain what it created. Prompt: "Set up Playwright in this project for end-to-end testing. After installing, walk me through each file and folder it created and what each one is for." Then verify the work yourself by running npx playwright test and confirming the example test passes. If it fails, paste the error back to Claude. Reading the explanation is how you learn the layout instead of just ending up with one.
- Install Playwright in a real project of yours, accepting the defaults but answering yes to the GitHub Actions prompt.
- Run the example test headless and confirm you get a passing result.
- Run the same test again with --ui and step through each action in the panel.
- Open playwright.config.ts, find the baseURL line (commented out in the generated file), and set it to your app's local address.
You can install Playwright into a fresh project, run the example test to a passing result, and open UI mode to watch it run step by step.
The Anatomy of a Test
Every test follows the same shape as a user story. The user goes somewhere, does something, and expects to see a result. Once you see that pattern, test code starts to read like a script for a person you are directing.
Three parts, every time. A test navigates to a page, performs an action, then asserts something is true. Here is the smallest useful test you can write.
import { test, expect } from '@playwright/test';
test('homepage loads', async ({ page }) => {
  // 1. navigate
  await page.goto('/');
  // 2. assert (no action needed for a load check)
  await expect(page).toHaveTitle(/My App/);
});
The page object is your hands. It represents one browser tab. You call methods on it to click, type, and navigate, exactly like a user would. The async and await keywords just mean "wait for this step to finish before the next one." The browser is slower than code, so you wait for it.
Assertions are the whole point. The expect() function is where the test passes or fails. Without an assertion, you have a script that does things but never checks them. A test with no expect can never fail, which makes it worthless.
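If the /My App/ in the homepage test looks odd, it is a regular expression: a pattern that passes when the title contains those characters anywhere. The sketch below shows the same pattern in plain TypeScript, outside Playwright, so you can see which titles it would accept. The titleMatches helper and the sample titles are made up for illustration.

```typescript
// The same pattern used in the homepage test
const titlePattern: RegExp = /My App/;

// Hypothetical helper, for illustration only: would this title satisfy the pattern?
function titleMatches(title: string): boolean {
  return titlePattern.test(title);
}

console.log(titleMatches('My App'));             // true
console.log(titleMatches('My App | Dashboard')); // true, the pattern can match part of the title
console.log(titleMatches('my app'));             // false, patterns are case-sensitive by default
console.log(titleMatches('404 Not Found'));      // false, this is the break you want to catch
```

A loose pattern like this survives small title changes, but it still fails when the page serves an error title, which is the case a smoke test exists for.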
test(): Wraps one scenario. Takes a name and a function.
page: One browser tab you control. Your hands on the app.
await: Wait for the browser step to finish before moving on.
expect(): The assertion. Where pass and fail are decided.
A test that always passes is a lie. If you write a test and it goes green on the first try, break the app on purpose and run it again. If it still passes, your assertion is checking the wrong thing.
Ask Claude Code to narrate a test in plain English before you trust it. Prompt: "Write a Playwright test that checks my homepage loads, then explain each line as if I have never seen test code." Your job is to read the explanation and confirm the assertion actually proves what you care about. The skill is reading tests critically and catching the ones that don't actually verify anything meaningful.
- Write a test that loads your homepage and asserts the page title.
- Label the three parts in your own test with comments: navigate, act, assert.
- Change the expected title to something wrong and confirm the test fails.
- Fix it back and confirm it passes again.
You can read any Playwright test and point to where it navigates, where it acts, and where it asserts — and you make every test fail on purpose at least once to prove the assertion works.
Finding Things on the Page
Most tests get fragile at exactly this point. How you tell Playwright to find a button determines whether your test survives the next redesign. Use a CSS selector and a tiny restyle can break everything. Use role and text and your tests find elements the same way a person scanning the screen would.
A locator is a way to point at an element. Playwright gives you several strategies. The ones worth using match how a real person identifies things on screen: by the words they see and the role of the element. The ones to avoid depend on internal structure that changes with every restyle.
// BEST — find by what the user sees and the role
page.getByRole('button', { name: 'Sign in' });
page.getByLabel('Email');
page.getByText('Welcome back');
// GOOD — a test id you add on purpose
page.getByTestId('submit-order');
// AVOID — brittle, breaks on any restyle
page.locator('.btn-primary > div:nth-child(2)');
Prefer role and text. A button labeled "Sign in" is "Sign in" whether it is blue or green, in the header or the footer. When you locate by visible text and role, your tests describe intent, and they survive cosmetic change. As a bonus, this nudges you toward accessible markup, because the same roles screen readers use are the ones your tests rely on.
Add a test id when text is ambiguous. If a page has three "Edit" buttons, give the one you mean a data-testid attribute in your markup and target it with getByTestId. A data-testid is a deliberate contract between your code and your tests.
The codegen shortcut. Run npx playwright codegen yoururl.com and click around your app. Playwright writes the locators for you as you interact. It almost always picks role and text-based locators, which makes it a great teacher.
When a generated test uses a brittle CSS selector, push back. Prompt: "Rewrite these locators to use getByRole, getByLabel, or getByText instead of CSS classes. If text alone is ambiguous, tell me which elements need a data-testid added to the markup." You are the editor here. Claude will reach for CSS selectors if you let it, so make the standard explicit and hold it to that standard.
- Run codegen on your app and click through one flow. Read the locators it produced.
- Find any element your text-based locators can't reach cleanly and add a data-testid to it.
- Rewrite one CSS-class locator into a getByRole locator and confirm the test still passes.
You reach for getByRole and getByText first by instinct, and you only add a data-testid when the visible text genuinely can't tell two elements apart.
Writing Your Smoke Suite
Now you turn the critical flows you named in Module 00 into real tests. A small set of them, covering your actual core promises, is worth more than a large suite that doesn't catch anything meaningful. Three good tests beat fifty weak ones.
One file, a few tests, grouped together. Put your smoke tests in a clearly named file like smoke.spec.ts. Group them with test.describe so the suite reads as a unit. Each test covers one flow from start to its visible proof.
import { test, expect } from '@playwright/test';
test.describe('smoke', () => {
  test('homepage loads', async ({ page }) => {
    await page.goto('/');
    await expect(page.getByRole('heading', { level: 1 })).toBeVisible();
  });
  test('user can sign in', async ({ page }) => {
    await page.goto('/login');
    await page.getByLabel('Email').fill('test@example.com');
    await page.getByLabel('Password').fill('correct-password');
    await page.getByRole('button', { name: 'Sign in' }).click();
    await expect(page.getByText('Welcome')).toBeVisible();
  });
  test('main nav reaches the dashboard', async ({ page }) => {
    await page.goto('/');
    await page.getByRole('link', { name: 'Dashboard' }).click();
    await expect(page).toHaveURL(/dashboard/);
  });
});
Notice what these tests don't do. They don't check error messages, validation rules, or styling. They confirm the page loads, login works, and navigation lands you where it should. That restraint is what keeps the suite fast and the failures meaningful.
A failing smoke test means stop. When one of these goes red, something a user would notice is broken. Fix it before you ship.
About that test login. Use a dedicated test account with a known password, ideally on a staging environment. Never hardcode a real user's real credentials into a test file. You will handle secrets properly in Module 06.
Feed Claude the flows you wrote in Module 00 and have it draft the suite. Prompt: "Here are my three critical flows. Write a Playwright smoke suite in one file, one test per flow, using role and label locators. Keep each test to navigate, act, assert — no edge cases." Then run it and read every assertion. Ask of each one: if I deleted this assertion, would the test still catch the break I care about? If it would, the assertion is doing no work and can go. If it would not, that assertion is what guards the flow, so make sure it checks the right thing.
- Write a smoke.spec.ts with one test for each of your critical flows.
- Run the full suite and get all tests passing against your local app.
- Break each flow in the app one at a time and confirm the matching test catches it.
- Delete any assertion that doesn't change whether the test catches a real break.
You have a passing smoke suite covering your real critical flows, and you have proven each test fails when you break the flow it guards.
Stable, Not Flaky
A flaky test is one that passes sometimes and fails other times without anything actually changing. It is the worst kind of test, because it trains you to ignore red. Once you start clicking "re-run" on a failure out of habit, your whole suite becomes noise. Stability is what makes a suite worth keeping.
The most common cause of flakiness is timing. Your test clicks a button, then checks for a result that the app needs a moment to render. Playwright handles this for you. Its assertions and actions auto-wait. When you call expect(...).toBeVisible(), it keeps retrying until the element appears or the assertion timeout (five seconds by default) runs out.
Never use a fixed-time pause. The classic mistake is adding a hard wait like waitForTimeout(3000). It makes tests slow on fast machines and still flaky on slow ones. Let Playwright wait on the actual condition instead.
// BAD — guessing at a duration, flaky and slow
await page.waitForTimeout(3000);
await expect(page.getByText('Done')).toBeVisible();
// GOOD — wait on the real thing, auto-retried
await expect(page.getByText('Done')).toBeVisible();
Keep tests independent. Each test should set up its own state and not depend on another test running first. If your sign-in test has to run before your dashboard test, a single failure cascades. Use beforeEach to put the page in a known state before every test.
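Here is one way that independence could look in practice: a sketch where beforeEach signs in before every test, so neither test relies on the other having run. The login labels and the test account reuse the placeholders from the smoke suite earlier; the Settings link and the /settings URL are assumptions for illustration.

```typescript
import { test, expect } from '@playwright/test';

test.describe('signed-in smoke', () => {
  // Runs before every test in this group, in a fresh page each time,
  // so each test starts from the same known state: signed in.
  test.beforeEach(async ({ page }) => {
    await page.goto('/login');
    await page.getByLabel('Email').fill('test@example.com');
    await page.getByLabel('Password').fill('correct-password');
    await page.getByRole('button', { name: 'Sign in' }).click();
    await expect(page.getByText('Welcome')).toBeVisible();
  });

  test('main nav reaches the dashboard', async ({ page }) => {
    await page.getByRole('link', { name: 'Dashboard' }).click();
    await expect(page).toHaveURL(/dashboard/);
  });

  // Assumed flow: your app may not have a Settings link
  test('main nav reaches settings', async ({ page }) => {
    await page.getByRole('link', { name: 'Settings' }).click();
    await expect(page).toHaveURL(/settings/);
  });
});
```

Signing in twice costs a few seconds, but either test can now run alone or in any order, and a failure in one tells you nothing false about the other.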
When something fails, look at the trace. Playwright can record a full trace of a failed run: every action, a screenshot at each step, and the network activity. Open it and you see exactly where reality diverged from your expectation. This turns "why did it fail" from a guessing game into a replay.
# record a trace for every test in this run, then open the HTML report
npx playwright test --trace on
npx playwright show-report
Don't paper over flakiness with retries. Playwright can auto-retry failed tests, and that's fine as a safety net in CI. But if a test only passes on retry, treat it as broken and fix the root cause. Retries are a signal, not a solution.
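If you do want that safety net, it lives in the config. The sketch below turns retries on only in CI and records a trace when a retry happens; the retry count of 2 is an assumption, not a recommendation. GitHub Actions sets the CI environment variable for you.

```typescript
// playwright.config.ts (only the retry-related settings are shown)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Retry in CI only; locally a failure stays a failure so you see it right away
  retries: process.env.CI ? 2 : 0,
  use: {
    // Record a trace on the first retry of a failed test,
    // so every test that needed a retry leaves evidence behind
    trace: 'on-first-retry',
  },
});
```

A test that fails and then passes on retry shows up as flaky in the report rather than as a plain pass. That label is your cue to open the trace and find the root cause.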
When a test fails in CI, hand Claude the trace and the test. Prompt: "This test failed. Here's the test code and the trace summary. Is this a real bug in my app or a flaky test, and how do you know?" Make Claude justify its answer with evidence from the trace. The judgment you are training is telling a true failure from a timing artifact, because they look identical until you read the replay.
- Search your suite for any waitForTimeout and replace it with an auto-waiting assertion.
- Add a beforeEach that navigates to a known starting page for each test.
- Force a failure, run with --trace on, and open the report to inspect the step-by-step replay.
- Run your full suite five times in a row and confirm it passes every time (npx playwright test --repeat-each=5 runs every test five times in one command).
Your suite passes consistently across repeated runs with no fixed-time waits, and you can open a trace to diagnose exactly why any failure happened.
Run It on Every Push
A smoke suite you have to remember to run is one you'll forget to run, usually on the day it mattered most. The whole point is that the tests run automatically on every push and block a broken build before it reaches anyone. That's what turns a suite into a real gate.
Continuous integration runs your tests for you. If you accepted the GitHub Actions option during install, Playwright already wrote a workflow file at .github/workflows/playwright.yml. On every push to main and on every pull request, GitHub spins up a fresh machine, installs your dependencies and the Playwright browsers, runs the suite, and reports pass or fail right on the commit.
name: Playwright Tests
on:
  push:
    branches: [main]
  pull_request:
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npx playwright test
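Keep secrets out of your code. Your test login needs a password, but that password must never live in a committed file. Store it as a repository secret in your CI settings, pass it to the test step as an environment variable in the workflow (in GitHub Actions, an env entry such as TEST_PASSWORD: ${{ secrets.TEST_PASSWORD }}), and read it from process.env in the test. The value stays encrypted and out of your git history.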
// read from the environment, never hardcode
await page.getByLabel('Password')
.fill(process.env.TEST_PASSWORD!);
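The ! in the snippet above only silences TypeScript. If the variable is missing at run time, the test fails with a confusing error deep inside fill. A small guard gives you a clearer message. This is a sketch: requireEnv is a hypothetical helper you would add yourself, not part of Playwright.

```typescript
// Hypothetical helper: read a required variable or fail with a readable message.
// Pass process.env as the first argument when you call it from a test.
function requireEnv(env: Record<string, string | undefined>, name: string): string {
  const value = env[name];
  if (value === undefined || value === '') {
    throw new Error(
      `Missing environment variable ${name}. Set it locally or add it as a repository secret.`
    );
  }
  return value;
}

// Usage inside a test (shown as a comment because it needs a running browser):
// await page.getByLabel('Password').fill(requireEnv(process.env, 'TEST_PASSWORD'));
```

When the secret is not wired up in CI, the failure now names the missing variable instead of leaving you to guess.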
Make the check required. In your repository's branch protection settings, mark the Playwright job as a required status check. Now a pull request cannot merge while the smoke suite is red. The seatbelt buckles itself, every time, whether or not anyone remembers.
Run against a real deployment. For prototypes, point your base URL at your staging or preview deploy. Testing the actual deployed build catches the breaks that only show up outside your local machine, which is exactly where the embarrassing ones hide.
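One way to switch between local and deployed targets without editing files is to read the base URL from the environment. In this sketch, BASE_URL is an assumed variable name and http://localhost:3000 an assumed local address; neither is set by Playwright for you.

```typescript
// playwright.config.ts (only the base URL setting is shown)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    // Use the deploy URL when BASE_URL is provided (for example in CI),
    // otherwise fall back to the assumed local dev server
    baseURL: process.env.BASE_URL ?? 'http://localhost:3000',
  },
});
```

Locally the suite keeps hitting your dev server. In CI you set BASE_URL to the staging or preview address and the same tests run against the deployed build.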
Have Claude wire up the pipeline and the secrets, then verify it yourself by pushing. Prompt: "Set up my Playwright suite to run in GitHub Actions on every push and pull request. Read the test password from a repository secret called TEST_PASSWORD, and tell me exactly where in the GitHub UI to add that secret." Push a commit and watch the check run. The skill is confirming the gate actually blocks a bad merge. Try pushing a deliberate break and make sure CI catches it.
- Confirm a workflow file exists, or have one created, and push to trigger it.
- Move your test password into a repository secret and read it via an environment variable.
- Watch the suite run in CI and confirm a green check appears on your commit.
- Push a deliberate break and confirm CI turns the check red and blocks the merge.
Every push runs your smoke suite automatically, secrets live outside your code, and a red suite blocks a merge without anyone having to remember to check.
Guard a Real App
A smoke suite that runs itself
Take one app you actually care about — a prototype, a side project, one of your shipped apps. Build a smoke suite of three to five tests covering its critical flows, make it stable across repeated runs, and wire it to run on every push so a broken build can never merge quietly. When you are done, you will have turned a real project into one that defends itself.
The process:
- Name the three to five flows that would embarrass you if they broke, and the visible proof of success for each.
- Install Playwright and get the example test passing locally.
- Write one test per flow using role, label, and text locators — navigate, act, assert, nothing more.
- Break each flow in the app and confirm the matching test catches it.
- Remove every fixed-time wait and run the suite five times to prove it is stable.
- Wire it into CI on every push, with the test password stored as a secret.
- Push a deliberate break and confirm the suite blocks the merge, then fix it green.
You're done when a broken core flow can no longer reach your users without your suite turning red first, and you trust that red enough to stop and fix it.