Smoke Tests with Playwright

For designers and builders who ship prototypes and small apps. Learn to write a handful of tests that catch the embarrassing breaks — the blank page, the dead login, the button that does nothing — before your users do.

7 Modules
1 Capstone
~3h per week
2–3 weeks

A smoke test answers one question: did the thing turn on? The name comes from hardware. You plug in a new board, flip the switch, and if no smoke comes out, you keep going. You're not doing a full feature review — you're confirming the app loads, the core paths work, and nothing is obviously broken.

When you build with Claude Code, you ship fast and change a lot, and a small edit in one place can quietly break a page somewhere else. A smoke suite is what lets you keep moving without something slipping through unnoticed. By the end of this course you'll have a real suite running on every push.

Module 00

What a Smoke Test Actually Is

did the thing turn on?

Think of a smoke test like the design review where you only ask one question: can the user complete the core task? You're not critiquing spacing or copy; you're confirming the flow is alive. That mindset, applied to code, is what keeps a fast-moving project from shipping a broken homepage.

Smoke tests are wide and shallow. They touch many parts of the app but go only one layer deep. A full test suite might check that a form rejects a malformed email, validates the password length, and shows the right error color. A smoke test just checks that the login page loads and a valid login lands you on the dashboard.

They run fast and run often. A good smoke suite finishes in under a minute, which is the whole point. You want to run it after every meaningful change, so it has to be cheap enough that you never have to think twice about starting it.

Smoke test

A few checks that confirm the app is alive and core paths work.

E2E test

End-to-end. Drives a real browser like a user would.

Unit test

Checks one small function in isolation. Not our focus here.

Regression

When something that used to work quietly breaks again.

The rule of thumb. If you would feel embarrassed shipping a build where this broke, it belongs in your smoke suite. If it is a nice-to-have edge case, it does not.

Before you write a single test, have Claude Code map your app's critical paths. Open your project and prompt: "List the three to five user flows in this app that would be most embarrassing if they broke. For each, describe the steps a user takes from landing to success." You are the one who decides what counts as critical. Claude is good at spotting the flows you forgot you had.

  • Pick one app you have built. Write down its three most critical user flows on paper, no code yet.
  • For each flow, name the single screen or message that proves it worked (e.g. "the dashboard greeting appears").
  • Cross off any flow that is a nice-to-have. You should be left with two or three that truly matter.
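
One way to capture the result of this exercise is a small typed list, so each flow and its visible proof is written down before any test exists. The sketch below is only an illustration: the CriticalFlow type, its field names, and the example flows are hypothetical, not part of Playwright.

```typescript
// Hypothetical shape for the notes from this exercise (not a Playwright API).
type CriticalFlow = {
  name: string;     // short label for the flow
  steps: string[];  // what the user does, from landing to success
  proof: string;    // the single visible thing that shows it worked
};

// Example entries; replace them with the flows from your own app.
const criticalFlows: CriticalFlow[] = [
  {
    name: 'homepage loads',
    steps: ['open the homepage'],
    proof: 'the main heading appears',
  },
  {
    name: 'user can sign in',
    steps: ['open the login page', 'enter email and password', 'press "Sign in"'],
    proof: 'the dashboard greeting appears',
  },
];

// A flow with no stated proof is not ready to become a test yet.
function flowsMissingProof(flows: CriticalFlow[]): string[] {
  return flows
    .filter((flow) => flow.proof.trim() === '')
    .map((flow) => flow.name);
}
```

Each entry maps onto one test later: the steps become the navigate and act lines, and the proof becomes the assertion.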

You can name the two or three flows in your app that must never break, and for each one you can state the visible proof that it worked.

Module 01

Install and Your First Run

from zero to a green checkmark

The hardest part of testing is starting. Playwright removes most of that friction with one install command that sets up the test runner, the browsers, and a sample test you can run immediately. Getting a green checkmark in the first ten minutes builds the momentum that carries you through the rest.

One command installs everything. Run the init command in your project folder. It adds Playwright, downloads the browsers it drives, and scaffolds a config file plus an example test. You answer a couple of prompts and you are done.

Terminal
# run inside your project folder
npm init playwright@latest

# it asks a few questions. Suggested answers for this course:
# TypeScript? yes. tests folder? e2e. GitHub Actions? yes (the default is no).

The folders it creates. You get a tests/ or e2e/ folder for your test files, a playwright.config.ts for settings, and an example spec. The config is where you set your base URL and which browsers to run, but the defaults are fine for now.
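
For orientation, here is a minimal sketch of what a trimmed-down playwright.config.ts might contain. The generated file is longer and wraps its settings in defineConfig from @playwright/test; a plain object is used here only to keep the sketch dependency-free. The e2e folder and the http://localhost:3000 address are assumptions about your project.

```typescript
// Minimal sketch of a playwright.config.ts (assumed values, adjust to your app).
const config = {
  // Folder that holds the test files; use './tests' if that is what init created.
  testDir: './e2e',
  use: {
    // Relative paths such as page.goto('/') are resolved against this address.
    // Assumption: the app runs locally on port 3000.
    baseURL: 'http://localhost:3000',
  },
  // One browser is enough for a smoke suite while you are learning.
  projects: [{ name: 'chromium', use: { browserName: 'chromium' } }],
};

export default config;
```

With baseURL set, a test can call page.goto('/login') instead of repeating the full address in every file.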

Run the suite. Use the test command to run headless (no visible browser, fast) or add a flag to watch it happen in a real window.

Terminal
# run all tests, no visible browser
npx playwright test

# watch it drive a real browser
npx playwright test --headed

# open the interactive UI mode — best for learning
npx playwright test --ui

Use UI mode while you learn. The --ui flag opens a panel where you watch each step run, hover over actions, and see exactly what the page looked like at every moment. It is the single best way to understand what your test is doing.

Let Claude Code do the install and explain what it created. Prompt: "Set up Playwright in this project for end-to-end testing. After installing, walk me through each file and folder it created and what each one is for." Then verify the work yourself by running npx playwright test and confirming the example test passes. If it fails, paste the error back to Claude. Reading the explanation is how you learn the layout instead of just ending up with it.

  • Install Playwright in a real project of yours and accept the defaults.
  • Run the example test headless and confirm you get a passing result.
  • Run the same test again with --ui and step through each action in the panel.
  • Open playwright.config.ts and find where the base URL is set.

You can install Playwright into a fresh project, run the example test to a passing result, and open UI mode to watch it run step by step.

Module 02

The Anatomy of a Test

go somewhere, do something, check something

Every test follows the same shape as a user story. The user goes somewhere, does something, and expects to see a result. Once you see that pattern, test code starts to read like a script for a person you are directing.

Three parts, every time. A test navigates to a page, performs an action, then asserts something is true. Here is the smallest useful test you can write.

TypeScript
import { test, expect } from '@playwright/test';

test('homepage loads', async ({ page }) => {
  // 1. navigate
  await page.goto('/');

  // 2. assert — no action needed for a load check
  await expect(page).toHaveTitle(/My App/);
});

The page object is your hands. It represents one browser tab. You call methods on it to click, type, and navigate, exactly like a user would. The async and await keywords just mean "wait for this step to finish before the next one." The browser is slower than code, so you wait for it.

Assertions are the whole point. The expect() function is where the test passes or fails. Without an assertion, you have a script that does things but never checks them. A test with no expect can never fail, which makes it worthless.

test()

Wraps one scenario. Takes a name and a function.

page

One browser tab you control. Your hands on the app.

await

Wait for the browser step to finish before moving on.

expect()

The assertion. Where pass and fail are decided.

A test that always passes is a lie. If you write a test and it goes green on the first try, break the app on purpose and run it again. If it still passes, your assertion is checking the wrong thing.

Ask Claude Code to narrate a test in plain English before you trust it. Prompt: "Write a Playwright test that checks my homepage loads, then explain each line as if I have never seen test code." Your job is to read the explanation and confirm the assertion actually proves what you care about. The skill is reading tests critically and catching the ones that don't actually verify anything meaningful.

  • Write a test that loads your homepage and asserts the page title.
  • Label the three parts in your own test with comments: navigate, act, assert.
  • Change the expected title to something wrong and confirm the test fails.
  • Fix it back and confirm it passes again.

You can read any Playwright test and point to where it navigates, where it acts, and where it asserts — and you make every test fail on purpose at least once to prove the assertion works.

Module 03

Finding Things on the Page

locators, the way humans see a screen

Most tests get fragile at exactly this point. How you tell Playwright to find a button determines whether your test survives the next redesign. Use a CSS selector and a tiny restyle can break everything. Use role and text and your tests find elements the same way a person scanning the screen would.

A locator is a way to point at an element. Playwright gives you several strategies. The ones worth using match how a real person identifies things on screen: by the words they see and the role of the element. The ones to avoid depend on internal structure that changes with every restyle.

TypeScript
// BEST — find by what the user sees and the role
page.getByRole('button', { name: 'Sign in' });
page.getByLabel('Email');
page.getByText('Welcome back');

// GOOD — a test id you add on purpose
page.getByTestId('submit-order');

// AVOID — brittle, breaks on any restyle
page.locator('.btn-primary > div:nth-child(2)');

Prefer role and text. A button labeled "Sign in" is "Sign in" whether it is blue or green, in the header or the footer. When you locate by visible text and role, your tests describe intent, and they survive cosmetic change. As a bonus, this nudges you toward accessible markup, because the same roles screen readers use are the ones your tests rely on.

Add a test id when text is ambiguous. If a page has three "Edit" buttons, give the one you mean a data-testid attribute in your markup and target it with getByTestId. A data-testid is a deliberate contract between your code and your tests.

The codegen shortcut. Run npx playwright codegen yoururl.com and click around your app. Playwright writes the locators for you as you interact. It almost always picks role and text-based locators, which makes it a great teacher.

When a generated test uses a brittle CSS selector, push back. Prompt: "Rewrite these locators to use getByRole, getByLabel, or getByText instead of CSS classes. If text alone is ambiguous, tell me which elements need a data-testid added to the markup." You are the editor here. Claude will reach for CSS selectors if you let it, so make the standard explicit and hold it to that standard.

  • Run codegen on your app and click through one flow. Read the locators it produced.
  • Find any element your text-based locators can't reach cleanly and add a data-testid to it.
  • Rewrite one CSS-class locator into a getByRole locator and confirm the test still passes.

You reach for getByRole and getByText first by instinct, and you only add a data-testid when the visible text genuinely can't tell two elements apart.

Module 04

Writing Your Smoke Suite

three tests that cover the things that matter

Now you turn the critical flows you named in Module 00 into real tests. A small set of them, covering your actual core promises, is worth more than a large suite that doesn't catch anything meaningful. Three good tests beat fifty weak ones.

One file, a few tests, grouped together. Put your smoke tests in a clearly named file like smoke.spec.ts. Group them with test.describe so the suite reads as a unit. Each test covers one flow from start to its visible proof.

TypeScript — smoke.spec.ts
import { test, expect } from '@playwright/test';

test.describe('smoke', () => {

  test('homepage loads', async ({ page }) => {
    await page.goto('/');
    await expect(page.getByRole('heading', { level: 1 })).toBeVisible();
  });

  test('user can sign in', async ({ page }) => {
    await page.goto('/login');
    await page.getByLabel('Email').fill('test@example.com');
    await page.getByLabel('Password').fill('correct-password');
    await page.getByRole('button', { name: 'Sign in' }).click();
    await expect(page.getByText('Welcome')).toBeVisible();
  });

  test('main nav reaches the dashboard', async ({ page }) => {
    await page.goto('/');
    await page.getByRole('link', { name: 'Dashboard' }).click();
    await expect(page).toHaveURL(/dashboard/);
  });

});
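
A regular expression passed to toHaveURL passes if it matches anywhere in the address, which is ordinary RegExp behavior. The sketch below uses plain TypeScript, with no browser, to show why that can matter. It assumes a hypothetical app that sends signed-out visitors to a login URL carrying the page they asked for; the URLs and the stricter pattern are illustrations, not something the suite above requires.

```typescript
// Pattern from the suite above: matches "dashboard" anywhere in the URL.
const loosePattern = /dashboard/;

// Stricter sketch: the path (the part before any "?" or "#") must end in /dashboard.
const strictPattern = /^[^?#]*\/dashboard\/?([?#].*)?$/;

// Hypothetical URLs for illustration.
const onDashboard = 'http://localhost:3000/dashboard';
const bouncedToLogin = 'http://localhost:3000/login?next=/dashboard';

console.log(loosePattern.test(onDashboard));     // true
console.log(loosePattern.test(bouncedToLogin));  // true, although the user is on the login page
console.log(strictPattern.test(onDashboard));    // true
console.log(strictPattern.test(bouncedToLogin)); // false
```

Either pattern can be passed to expect(page).toHaveURL(...). Whether the stricter one is worth it depends on how your app handles signed-out visitors.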

Notice what these tests don't do. They don't check error messages, validation rules, or styling. They confirm the page loads, login works, and navigation lands you where it should. That restraint is what keeps the suite fast and the failures meaningful.

A failing smoke test means stop. When one of these goes red, something a user would notice is broken. Fix it before you ship.

About that test login. Use a dedicated test account with a known password, ideally on a staging environment. Never hardcode a real user's real credentials into a test file. You will handle secrets properly in Module 06.

Feed Claude the flows you wrote in Module 00 and have it draft the suite. Prompt: "Here are my three critical flows. Write a Playwright smoke suite in one file, one test per flow, using role and label locators. Keep each test to navigate, act, assert, with no edge cases." Then run it and read every assertion. Ask of each test: if I deleted this assertion, would the test still catch the break I care about? If it would, the assertion is redundant and can go. If it would not, that assertion is the one doing the work, so make sure it checks the visible proof you named.

  • Write a smoke.spec.ts with one test for each of your critical flows.
  • Run the full suite and get all tests passing against your local app.
  • Break each flow in the app one at a time and confirm the matching test catches it.
  • Delete any assertion that doesn't change whether the test catches a real break.

You have a passing smoke suite covering your real critical flows, and you have proven each test fails when you break the flow it guards.

Module 05

Stable, Not Flaky

a test you can't trust is worse than no test

A flaky test is one that passes sometimes and fails other times without anything actually changing. It is the worst kind of test, because it trains you to ignore red. Once you start clicking "re-run" on a failure out of habit, your whole suite becomes noise. Stability is what makes a suite worth keeping.

The most common cause of flakiness is timing. Your test clicks a button, then checks for a result that the app needs a moment to render. Playwright handles this for you. Its assertions and actions auto-wait. When you call expect(...).toBeVisible(), it keeps retrying, for up to 5 seconds by default, until the element appears or the timeout hits.

Never use a fixed-time pause. The classic mistake is adding a hard wait like waitForTimeout(3000). It makes tests slow on fast machines and still flaky on slow ones. Let Playwright wait on the actual condition instead.

TypeScript
// BAD — guessing at a duration, flaky and slow
await page.waitForTimeout(3000);
await expect(page.getByText('Done')).toBeVisible();

// GOOD — wait on the real thing, auto-retried
await expect(page.getByText('Done')).toBeVisible();
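
To see why the two approaches behave differently, it can help to model them without a browser. The sketch below is a simplified simulation, not Playwright's implementation: time is a plain number, and retryUntil, fixedWaitThenCheck, and the 100 ms polling interval are hypothetical. The 5000 ms limit mirrors Playwright's default expect timeout.

```typescript
type Outcome = { passed: boolean; elapsedMs: number };

// Model of an auto-waiting assertion: re-check the condition until it holds
// or the timeout is reached. Time is simulated, nothing actually sleeps.
function retryUntil(
  check: (elapsedMs: number) => boolean,
  timeoutMs = 5000,
  intervalMs = 100,
): Outcome {
  for (let elapsed = 0; elapsed <= timeoutMs; elapsed += intervalMs) {
    if (check(elapsed)) return { passed: true, elapsedMs: elapsed };
  }
  return { passed: false, elapsedMs: timeoutMs };
}

// Model of a fixed pause followed by a single, non-retrying check.
function fixedWaitThenCheck(
  check: (elapsedMs: number) => boolean,
  waitMs: number,
): Outcome {
  return { passed: check(waitMs), elapsedMs: waitMs };
}

// Hypothetical app behaviour: "Done" shows up after a given delay.
const appearsAfter = (delayMs: number) => (elapsedMs: number) => elapsedMs >= delayMs;

console.log(retryUntil(appearsAfter(400)));                // { passed: true, elapsedMs: 400 }
console.log(fixedWaitThenCheck(appearsAfter(400), 3000));  // { passed: true, elapsedMs: 3000 }
console.log(retryUntil(appearsAfter(3500)));               // { passed: true, elapsedMs: 3500 }
console.log(fixedWaitThenCheck(appearsAfter(3500), 3000)); // { passed: false, elapsedMs: 3000 }
```

On the fast run the fixed pause wastes 2.6 seconds; on the slow run it guesses too short. Waiting on the condition handles both.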

Keep tests independent. Each test should set up its own state and not depend on another test running first. If your sign-in test has to run before your dashboard test, a single failure cascades. Use beforeEach to put the page in a known state before every test.
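
A minimal sketch of that pattern, reusing the locators from the Module 04 suite. It assumes a baseURL is configured, that the homepage has a single top-level heading, and that a "Dashboard" link is visible without signing in; adjust the names to your app.

```typescript
import { test, expect } from '@playwright/test';

test.describe('smoke', () => {
  // Runs before every test in this group, so each one starts from the homepage
  // in a fresh page and never relies on what a previous test did.
  test.beforeEach(async ({ page }) => {
    await page.goto('/');
  });

  test('homepage shows a heading', async ({ page }) => {
    await expect(page.getByRole('heading', { level: 1 })).toBeVisible();
  });

  test('main nav reaches the dashboard', async ({ page }) => {
    await page.getByRole('link', { name: 'Dashboard' }).click();
    await expect(page).toHaveURL(/dashboard/);
  });
});
```

Because each test gets its own page and its own starting point, either one can run alone or in any order.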

When something fails, look at the trace. Playwright can record a full trace of a failed run: every action, a screenshot at each step, and the network activity. Open it and you see exactly where reality diverged from your expectation. This turns "why did it fail" from a guessing game into a replay.

Terminal
# record a trace for every test in this run, then open the report
npx playwright test --trace on
npx playwright show-report

Don't paper over flakiness with retries. Playwright can auto-retry failed tests, and that's fine as a safety net in CI. But if a test only passes on retry, treat it as broken and fix the root cause. Retries are a signal, not a solution.
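
A sketch of what treating retries as a signal could look like. The retries and trace settings are real Playwright config options; the isCI parameter, the retrySettings and classify helpers, and the choice of 2 retries are assumptions for illustration. The classification mirrors how Playwright's report labels a test that fails and then passes on retry as flaky.

```typescript
// Settings that could be merged into playwright.config.ts.
// Assumption: retry only in CI, and record a trace on the first retry.
function retrySettings(isCI: boolean) {
  return {
    retries: isCI ? 2 : 0,
    use: { trace: 'on-first-retry' },
  };
}

type Attempt = 'passed' | 'failed';
type Verdict = 'passed' | 'flaky' | 'failed';

// Classify a test from its attempts, in the order they ran.
function classify(attempts: Attempt[]): Verdict {
  const last = attempts[attempts.length - 1];
  if (last !== 'passed') return 'failed';
  // Passed in the end: clean only if nothing failed along the way.
  return attempts.some((attempt) => attempt === 'failed') ? 'flaky' : 'passed';
}

console.log(retrySettings(true).retries);              // 2
console.log(retrySettings(false).retries);             // 0
console.log(classify(['passed']));                     // passed
console.log(classify(['failed', 'passed']));           // flaky
console.log(classify(['failed', 'failed', 'failed'])); // failed
```

A "flaky" verdict is the cue to open the trace and fix the cause, not a pass.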

When a test fails in CI, hand Claude the trace and the test. Prompt: "This test failed. Here's the test code and the trace summary. Is this a real bug in my app or a flaky test, and how do you know?" Make Claude justify its answer with evidence from the trace. The judgment you are training is telling a true failure from a timing artifact, because they look identical until you read the replay.

  • Search your suite for any waitForTimeout and replace it with an auto-waiting assertion.
  • Add a beforeEach that navigates to a known starting page for each test.
  • Force a failure, run with --trace on, and open the report to inspect the step-by-step replay.
  • Run your full suite five times in a row (for example, npx playwright test --repeat-each 5 runs every test five times) and confirm it passes every time.

Your suite passes consistently across repeated runs with no fixed-time waits, and you can open a trace to diagnose exactly why any failure happened.

Module 06

Run It on Every Push

the seatbelt that buckles itself

A smoke suite you have to remember to run is one you'll forget to run, usually on the day it mattered most. The whole point is that the tests run automatically on every push and block a broken build before it reaches anyone. That's what turns a suite into a real gate.

Continuous integration runs your tests for you. If you accepted the GitHub Actions option during install, Playwright already wrote a workflow file at .github/workflows/playwright.yml. On every push and pull request, GitHub spins up a fresh machine, installs your app, runs the suite, and reports pass or fail right on the commit.

YAML — playwright.yml
name: Playwright Tests
on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npx playwright install --with-deps
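      - run: npx playwright test
        env:
          # expose the repository secret to the tests as an environment variable
          TEST_PASSWORD: ${{ secrets.TEST_PASSWORD }}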

Keep secrets out of your code. Your test login needs a password, but that password must never live in a committed file. Store it as a repository secret in your CI settings and read it from an environment variable in the test. The values stay encrypted and out of your git history.

TypeScript
// read from the environment, never hardcode
await page.getByLabel('Password')
  .fill(process.env.TEST_PASSWORD!);
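
The ! in that snippet only silences the type checker; if the variable is missing, the failure surfaces later as a less direct error. A small guard can fail early with a clear message instead. This is a sketch: requireEnv is a hypothetical helper, not a Playwright API, and the environment is passed in as an argument so the function has no dependencies.

```typescript
// Hypothetical helper: return a required environment variable or fail loudly.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[name];
  if (value === undefined || value === '') {
    throw new Error(
      `Missing environment variable ${name}. Set it locally or as a repository secret.`,
    );
  }
  return value;
}

// In a test this could read:
//   await page.getByLabel('Password').fill(requireEnv('TEST_PASSWORD', process.env));
```

The error message names the missing variable, which makes a misconfigured CI run much quicker to diagnose.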

Make the check required. In your repository's branch protection settings, mark the Playwright job as a required status check. Now a pull request cannot merge while the smoke suite is red. The seatbelt buckles itself, every time, whether or not anyone remembers.

Run against a real deployment. For prototypes, point your base URL at your staging or preview deploy. Testing the actual deployed build catches the breaks that only show up outside your local machine, which is exactly where the embarrassing ones hide.
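
One way to switch targets without editing the config each time is to read the address from an environment variable and fall back to the local server. A sketch, assuming a hypothetical BASE_URL variable and a local app on port 3000; resolveBaseURL is an illustrative helper, not part of Playwright.

```typescript
// Hypothetical helper: pick the address the suite should test against.
function resolveBaseURL(env: Record<string, string | undefined>): string {
  // Assumption: CI or your shell sets BASE_URL to the staging or preview deploy.
  const fromEnv = env['BASE_URL'];
  return fromEnv && fromEnv.trim() !== '' ? fromEnv.trim() : 'http://localhost:3000';
}

// In playwright.config.ts this could feed the baseURL setting:
//   use: { baseURL: resolveBaseURL(process.env) }

console.log(resolveBaseURL({}));                                          // http://localhost:3000
console.log(resolveBaseURL({ BASE_URL: 'https://staging.example.com' })); // https://staging.example.com
```

The same test files then run unchanged against your laptop or a deployed preview.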

Have Claude wire up the pipeline and the secrets, then verify it yourself by pushing. Prompt: "Set up my Playwright suite to run in GitHub Actions on every push and pull request. Read the test password from a repository secret called TEST_PASSWORD, and tell me exactly where in the GitHub UI to add that secret." Push a commit and watch the check run. The skill is confirming the gate actually blocks a bad merge. Try pushing a deliberate break and make sure CI catches it.

  • Confirm a workflow file exists, or have one created, and push to trigger it.
  • Move your test password into a repository secret and read it via an environment variable.
  • Watch the suite run in CI and confirm a green check appears on your commit.
  • Push a deliberate break and confirm CI turns the check red and blocks the merge.

Every push runs your smoke suite automatically, secrets live outside your code, and a red suite blocks a merge without anyone having to remember to check.

★ Capstone

Guard a Real App

a live smoke suite on one of your projects

A smoke suite that runs itself

Take one app you actually care about — a prototype, a side project, one of your shipped apps. Build a smoke suite of three to five tests covering its critical flows, make it stable across repeated runs, and wire it to run on every push so a broken build can never merge quietly. When you are done, you will have turned a real project into one that defends itself.

The process:

  1. Name the three to five flows that would embarrass you if they broke, and the visible proof of success for each.
  2. Install Playwright and get the example test passing locally.
  3. Write one test per flow using role, label, and text locators — navigate, act, assert, nothing more.
  4. Break each flow in the app and confirm the matching test catches it.
  5. Remove every fixed-time wait and run the suite five times to prove it is stable.
  6. Wire it into CI on every push, with the test password stored as a secret.
  7. Push a deliberate break and confirm the suite blocks the merge, then fix it green.

You're done when a broken core flow can no longer reach your users without your suite turning red first, and you trust that red enough to stop and fix it.

Ref

Playwright Vocabulary Cheat Sheet

keep this open while you write your first suite

Term                What it means
smoke test          A few wide, shallow checks that confirm the app is alive and core paths work.
spec file           A file containing tests, usually ending in .spec.ts.
test()              Defines one test scenario. Takes a name and an async function.
test.describe()     Groups related tests together under a shared name.
beforeEach()        Runs before every test in a group, used to set a known starting state.
page                One browser tab you control. Your hands on the app.
goto()              Navigate the page to a URL.
await               Wait for a browser step to finish before the next line runs.
locator             A way to point at an element on the page.
getByRole()         Find an element by its role and accessible name. The preferred locator.
getByLabel()        Find a form field by its visible label text.
getByText()         Find an element by the text it displays.
getByTestId()       Find an element by a data-testid you added on purpose.
expect()            The assertion. Where a test passes or fails.
toBeVisible()       Asserts an element is present and visible, auto-waiting until it is.
toHaveURL()         Asserts the page is at a given URL.
auto-wait           Playwright retrying an action or assertion until it succeeds or times out.
flaky test          A test that passes and fails inconsistently with no real change.
trace               A recorded replay of a test run: actions, screenshots, and network.
codegen             A tool that writes locators and steps as you click through your app.
headless            Running the browser with no visible window. The CI default.
CI                  Continuous integration. A service that runs your tests automatically.
required check      A CI result that must pass before a pull request can merge.
repository secret   An encrypted value like a password, stored in CI settings, never in code.