Smoke Tests with Playwright

Module 00

What a Smoke Test Actually Is

did the thing turn on?

Why it matters

Think of a smoke test like a design review with one question on the table: can the user complete the core task? Spacing and copy don't matter here. You're only confirming the flow is alive.

Concepts

Smoke tests are wide and shallow. They touch many parts of the app but go only one layer deep. A full test suite might check that a form rejects a malformed email, validates the password length, and shows the right error color. A smoke test ignores all of that. It checks that the login page loads and a valid login lands you on the dashboard.

They run fast and run often. A good smoke suite finishes in under a minute. That speed is the whole point. You want to run it after every meaningful change, so it has to be cheap enough that you never think twice about starting it.

Smoke test

A few checks that confirm the app is alive and core paths work.

E2E test

End-to-end. Drives a real browser like a user would.

Unit test

Checks one small function in isolation. Not our focus here.

Regression

When something that used to work quietly breaks again.

The rule of thumb. If you would feel embarrassed shipping a build where this broke, it belongs in your smoke suite. Edge cases and optional details belong in a fuller test suite.

Build with Claude Code

Before you write a single test, have Claude Code map your app's critical paths. Open your project and prompt: "List the three to five user flows in this app that would be most embarrassing if they broke. For each, describe the steps a user takes from landing to success." You are the one who decides what counts as critical. Claude is good at spotting the flows you forgot you had.

Exercises

Pick one app you have built. Write down its three most critical user flows on paper, no code yet.
For each flow, name the single screen or message that proves it worked (e.g. "the dashboard greeting appears").
Cross off any flow that is a nice-to-have. You should be left with two or three that truly matter.

You've got it when…

You can name the two or three flows in your app that must never break, and for each one you can state the visible proof that it worked.

Module 01

Install and Your First Run

from zero to a green checkmark

Why it matters

The hardest part of testing is starting. I put it off for years on my own projects. Playwright removes most of that friction with one install command that sets up the test runner, the browsers, and a sample test you can run right away. Get a green checkmark in the first ten minutes and the momentum carries you through the rest.

Concepts

One command installs everything. Run the init command in your project folder. It adds Playwright, downloads the browsers it drives, and scaffolds a config file plus an example test. You answer a couple of prompts and you are done.

Terminal

# run inside your project folder
npm init playwright@latest

# it asks a few questions. The defaults are fine, but make sure GitHub Actions is yes:
# TypeScript? yes. tests folder? tests or e2e. GitHub Actions? yes.

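The folders it creates. You get a tests/ or e2e/ folder for your test files, a playwright.config.ts for settings, and an example spec. The config is where your base URL and the browsers you run against live. The generated file leaves the base URL commented out, so uncomment it and point it at your app before you write tests that navigate with relative paths like '/'. The other defaults are fine for now.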
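To make that concrete, here is a trimmed sketch of what playwright.config.ts might contain once the base URL is filled in. The e2e folder name and the http://localhost:3000 address are assumptions for illustration, so swap in whatever your project actually uses. The generated file has more settings than this.

```typescript
// playwright.config.ts (trimmed sketch, not the full generated file)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  // Folder that holds your spec files. Assumed here; match your install answer.
  testDir: './e2e',

  use: {
    // Assumed local dev address. With this set, page.goto('/') opens this URL.
    baseURL: 'http://localhost:3000',
  },

  // The browsers the suite runs against. One is enough to start.
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
  ],
});
```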

Run the suite. Use the test command to run headless (no visible browser, fast) or add a flag to watch it happen in a real window.

Terminal

# run all tests, no visible browser
npx playwright test

# watch it drive a real browser
npx playwright test --headed

# open the interactive UI mode — best for learning
npx playwright test --ui

Use UI mode while you learn. The --ui flag opens a panel where you watch each step run, hover over actions, and see exactly what the page looked like at every moment. It is the single best way to understand what your test is doing.

Build with Claude Code

Let Claude Code do the install and explain what it created. Prompt: "Set up Playwright in this project for end-to-end testing. After installing, walk me through each file and folder it created and what each one is for." Then verify the work yourself by running npx playwright test and confirming the example test passes. If it fails, paste the error back to Claude. Reading the explanation is how you learn the layout instead of just ending up with one.

Exercises

Install Playwright in a real project of yours and accept the defaults.
Run the example test headless and confirm you get a passing result.
Run the same test again with --ui and step through each action in the panel.
Open playwright.config.ts, find the baseURL setting, and set it to your app's local address if it is still commented out.

You've got it when…

You can install Playwright into a fresh project, run the example test to a passing result, and open UI mode to watch it run step by step.

Module 02

The Anatomy of a Test

go somewhere, do something, check something

Why it matters

Every test follows the same shape as a user story: navigate somewhere, do something, confirm the result. Once that pattern clicks, test code reads like a script you hand to an actor. It stops feeling like programming.

Concepts

Three parts, every time. A test navigates to a page, performs an action, then asserts something is true. Here is the smallest useful test you can write.

TypeScript

import { test, expect } from '@playwright/test';

test('homepage loads', async ({ page }) => {
  // 1. navigate ('/' resolves against baseURL in playwright.config.ts)
  await page.goto('/');

  // 2. assert — no action needed for a load check
  await expect(page).toHaveTitle(/My App/);
});

The page object is your hands. It represents one browser tab. You call methods on it to click, type, and navigate, exactly like a user would. The async and await keywords just mean "wait for this step to finish before the next one." The browser is slower than code, so you wait for it.

Assertions are the whole point. The expect() function is where the test passes or fails. Without an assertion, you have a script that does things but never checks them. A test with no expect fails only if a step crashes or times out; it never checks the result, so a green run proves almost nothing.

test()

Wraps one scenario. Takes a name and a function.

page

One browser tab you control. Your hands on the app.

await

Wait for the browser step to finish before moving on.

expect()

The assertion. Where pass and fail are decided.

A test that always passes is a lie. If you write a test and it goes green on the first try, break the app on purpose and run it again. If it still passes, your assertion is checking the wrong thing.

Build with Claude Code

Ask Claude Code to narrate a test in plain English before you trust it. Prompt: "Write a Playwright test that checks my homepage loads, then explain each line as if I have never seen test code." Your job is to read the explanation and confirm the assertion actually proves what you care about. The skill is reading tests critically and catching the ones that don't actually verify anything meaningful.

Exercises

Write a test that loads your homepage and asserts the page title.
Label the three parts in your own test with comments: navigate, act, assert.
Change the expected title to something wrong and confirm the test fails.
Fix it back and confirm it passes again.

You've got it when…

You can read any Playwright test and point to where it navigates, where it acts, and where it asserts, and you make every test fail on purpose at least once to prove the assertion works.

Module 03

Finding Things on the Page

locators, the way humans see a screen

Why it matters

Most tests get fragile right here. How you tell Playwright to find a button decides whether your test survives the next redesign. Lean on a CSS selector and one small restyle can wipe everything out. Locate by role and text instead, and your tests find elements the same way a person scanning the screen would.

Concepts

A locator is a way to point at an element. Playwright gives you several strategies, and they are not equal. The good ones match how a real person identifies things on screen, by the words they see and the role of the element. The brittle ones lean on internal page structure, which changes every time you restyle.

TypeScript

// BEST — find by what the user sees and the role
page.getByRole('button', { name: 'Sign in' });
page.getByLabel('Email');
page.getByText('Welcome back');

// GOOD — a test id you add on purpose
page.getByTestId('submit-order');

// AVOID — brittle, breaks on any restyle
page.locator('.btn-primary > div:nth-child(2)');

Prefer role and text. A button labeled "Sign in" is "Sign in" whether it is blue or green, in the header or the footer. When you locate by visible text and role, your tests describe intent, and they survive cosmetic change. As a bonus, this nudges you toward accessible markup, because the same roles screen readers use are the ones your tests rely on.

Add a test id when text is ambiguous. If a page has three "Edit" buttons, give the one you mean a data-testid attribute in your markup and target it with getByTestId. A data-testid is a deliberate contract between your code and your tests.

The codegen shortcut. Run npx playwright codegen yoururl.com and click around your app. Playwright writes the locators for you as you interact. It almost always picks role and text-based locators, which makes it a great teacher.

Build with Claude Code

When a generated test uses a brittle CSS selector, push back. Prompt: "Rewrite these locators to use getByRole, getByLabel, or getByText instead of CSS classes. If text alone is ambiguous, tell me which elements need a data-testid added to the markup." You are the editor here. Claude will reach for CSS selectors if you let it, so make the standard explicit and hold it to that standard.

Exercises

Run codegen on your app and click through one flow. Read the locators it produced.
Find any element your text-based locators can't reach cleanly and add a data-testid to it.
Rewrite one CSS-class locator into a getByRole locator and confirm the test still passes.

You've got it when…

You reach for getByRole and getByText first by instinct, and you only add a data-testid when the visible text genuinely can't tell two elements apart.

Module 04

Writing Your Smoke Suite

three tests that cover the things that matter

Why it matters

Now you turn the critical flows you named in Module 00 into real tests. Keep the set small. A handful that cover your actual core promises is worth more than a large suite that never catches anything meaningful. Three good tests beat fifty weak ones.

Concepts

One file, a few tests, grouped together. Put your smoke tests in a clearly named file like smoke.spec.ts. Group them with test.describe so the suite reads as a unit. Each test covers one flow from start to its visible proof.

TypeScript — smoke.spec.ts

import { test, expect } from '@playwright/test';

test.describe('smoke', () => {

  test('homepage loads', async ({ page }) => {
    await page.goto('/');
    await expect(page.getByRole('heading', { level: 1 })).toBeVisible();
  });

  test('user can sign in', async ({ page }) => {
    await page.goto('/login');
    await page.getByLabel('Email').fill('test@example.com');
    await page.getByLabel('Password').fill('correct-password');
    await page.getByRole('button', { name: 'Sign in' }).click();
    await expect(page.getByText('Welcome')).toBeVisible();
  });

  test('main nav reaches the dashboard', async ({ page }) => {
    await page.goto('/');
    await page.getByRole('link', { name: 'Dashboard' }).click();
    await expect(page).toHaveURL(/dashboard/);
  });

});

Notice what these tests don't do. They skip error messages, validation rules, styling, all of it. What they confirm is small: the page loads, a login works, the main nav gets you where it should. That restraint keeps the suite fast and its failures worth reading.

A failing smoke test means stop. When one of these goes red, something a user would notice is broken. Fix it before you ship.

About that test login. Use a dedicated test account with a known password, ideally on a staging environment. Never hardcode a real user's real credentials into a test file. You will handle secrets properly in Module 06.

Build with Claude Code

Feed Claude the flows you wrote in Module 00 and have it draft the suite. Prompt: "Here are my three critical flows. Write a Playwright smoke suite in one file, one test per flow, using role and label locators. Keep each test to navigate, act, assert, with no edge cases." Then run it and read every assertion. Ask of each one: if I deleted this assertion, would the test still catch the break I care about? If it would, the assertion is dead weight and can go. If it would not, the assertion is doing real work, so check that it proves the right thing.

Exercises

Write a smoke.spec.ts with one test for each of your critical flows.
Run the full suite and get all tests passing against your local app.
Break each flow in the app one at a time and confirm the matching test catches it.
Delete any assertion that doesn't change whether the test catches a real break.

You've got it when…

You have a passing smoke suite covering your real critical flows, and you have proven each test fails when you break the flow it guards.

Module 05

Stable, Not Flaky

a test you can't trust is worse than no test

Why it matters

A flaky test passes sometimes and fails other times with nothing actually changing. It is the worst kind of test, because it trains you to ignore red. Once you start clicking "re-run" on a failure out of habit, the whole suite turns to noise. I've watched a real break sail through because the suite had cried wolf too many times.

Concepts

The most common cause of flakiness is timing. Your test clicks a button, then checks for a result that the app needs a moment to render. Playwright handles this for you: its actions and assertions auto-wait. When you call expect(...).toBeVisible(), it keeps retrying, for up to five seconds by default, until the element appears or the timeout hits.
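If it helps to picture what auto-waiting means, the idea can be sketched as a small polling loop. This is a conceptual illustration only, not Playwright's actual implementation. The name pollUntil and its parameters are hypothetical, and the 5000 ms default simply mirrors Playwright's default assertion timeout.

```typescript
// Conceptual sketch only: this is NOT how Playwright is implemented.
// pollUntil is a hypothetical name used for illustration.
async function pollUntil(
  condition: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 100,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    // Condition met: stop waiting right away, no time wasted.
    if (await condition()) return;

    // Out of time: fail with a clear message.
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs}ms`);
    }

    // Not yet: pause briefly, then check again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The contrast with a fixed pause is that this loop returns as soon as the condition holds, so a fast app is never held up, and a slow app gets the full timeout before the check gives up.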

Never use a fixed-time pause. The classic mistake is adding a hard wait like waitForTimeout(3000). It makes tests slow on fast machines and still flaky on slow ones. Let Playwright wait on the actual condition instead.

TypeScript

// BAD — guessing at a duration, flaky and slow
await page.waitForTimeout(3000);
await expect(page.getByText('Done')).toBeVisible();

// GOOD — wait on the real thing, auto-retried
await expect(page.getByText('Done')).toBeVisible();

Keep tests independent. Each test should set up its own state and not depend on another test running first. If your sign-in test has to run before your dashboard test, a single failure cascades. Use beforeEach to put the page in a known state before every test.
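A minimal sketch of that setup, assuming a /login route, the same field labels as the sign-in test in Module 04, and a dedicated test account. Every test signs in on its own from a fresh page, so neither depends on the other having run first.

```typescript
import { test, expect } from '@playwright/test';

test.describe('dashboard', () => {
  // Runs before every test in this group, so each one starts from the same state.
  test.beforeEach(async ({ page }) => {
    await page.goto('/login'); // assumed route
    await page.getByLabel('Email').fill('test@example.com');
    await page.getByLabel('Password').fill('correct-password'); // placeholder, as in Module 04
    await page.getByRole('button', { name: 'Sign in' }).click();
  });

  test('dashboard greets the user', async ({ page }) => {
    await expect(page.getByText('Welcome')).toBeVisible();
  });

  test('main nav reaches the dashboard', async ({ page }) => {
    await page.getByRole('link', { name: 'Dashboard' }).click();
    await expect(page).toHaveURL(/dashboard/);
  });
});
```

Signing in before every test costs a little time, but it means a failure in one test tells you about that flow and nothing else.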

When something fails, look at the trace. Playwright can record a full trace of a failed run: every action, a screenshot at each step, and the network activity. Open it and you see exactly where reality diverged from your expectation. This turns "why did it fail" from a guessing game into a replay.

Terminal

# run the suite with tracing on, then open the report
npx playwright test --trace on
npx playwright show-report

Don't paper over flakiness with retries. Playwright can auto-retry failed tests, and that's fine as a safety net in CI. But if a test only passes on retry, treat it as broken and go find the root cause. A retry buys you a green check and nothing more.
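One way to set up that safety net, sketched as a config fragment. It assumes your CI service sets a CI environment variable (GitHub Actions does), and the retry count of 2 is only a starting point.

```typescript
// playwright.config.ts (fragment showing only retry and trace settings)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Retry only in CI. Locally, a failure should fail straight away.
  retries: process.env.CI ? 2 : 0,

  use: {
    // Record a trace when a failed test is retried for the first time.
    trace: 'on-first-retry',
  },
});
```

A test that fails and then passes on retry shows up as flaky in the report, with a trace attached. Treat that list as work to do, not as a pass.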

Build with Claude Code

When a test fails in CI, hand Claude the trace and the test. Prompt: "This test failed. Here's the test code and the trace summary. Is this a real bug in my app or a flaky test, and how do you know?" Make Claude justify its answer with evidence from the trace. The judgment you are training is telling a true failure from a timing artifact, because they look identical until you read the replay.

Exercises

Search your suite for any waitForTimeout and replace it with an auto-waiting assertion.
Add a beforeEach that navigates to a known starting page for each test.
Force a failure, run with --trace on, and open the report to inspect the step-by-step replay.
Run your full suite five times in a row and confirm it passes every time. The --repeat-each=5 flag on npx playwright test runs every test five times in a single command.

You've got it when…

Your suite passes consistently across repeated runs with no fixed-time waits, and you can open a trace to diagnose exactly why any failure happened.

Module 06

Run It on Every Push

the seatbelt that buckles itself

Why it matters

A smoke suite you have to remember to run is one you'll forget to run, usually on the day it matters most. So take yourself out of it. The tests should fire automatically on every push and block a broken build before it reaches anyone. That is what turns a suite into a real gate.

Concepts

Continuous integration runs your tests for you. If you accepted the GitHub Actions option during install, the work is half done already. Playwright wrote a workflow file at .github/workflows/playwright.yml. On every push to main and on every pull request, GitHub spins up a fresh machine, installs your dependencies and the browsers, runs the suite, and reports pass or fail right on the commit.

YAML — playwright.yml

name: Playwright Tests
on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npx playwright test

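Keep secrets out of your code. Your test login needs a password, but that password must never live in a committed file. Store it as a repository secret in your CI settings and read it from an environment variable in the test. In GitHub Actions, the secret only reaches the test if the workflow passes it in, so add an env entry to the test step that maps TEST_PASSWORD to ${{ secrets.TEST_PASSWORD }}. The values stay encrypted and out of your git history.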

TypeScript

// read from the environment, never hardcode
await page.getByLabel('Password')
  .fill(process.env.TEST_PASSWORD!);
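The ! in that snippet only quiets TypeScript. If the variable is missing at run time, the failure surfaces later inside fill() rather than pointing at the missing variable. A small guard can fail fast with a clear message instead. requireEnv is a hypothetical helper, not part of Playwright, sketched here as one option.

```typescript
// Hypothetical helper (not part of Playwright): read a required environment
// variable, or stop with a message that names what is missing.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name];
  if (value === undefined || value === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage inside a test in the same file:
//   await page.getByLabel('Password').fill(requireEnv('TEST_PASSWORD'));
```

The second parameter exists only so the helper can be checked without touching the real environment. In a test you would call it with just the variable name.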

Make the check required. In your repository's branch protection settings, mark the Playwright job as a required status check. Now a pull request cannot merge while the smoke suite is red.

Run against a real deployment. For prototypes, point your base URL at your staging or preview deploy. Testing the actual deployed build catches the breaks that only show up outside your local machine, which is exactly where the embarrassing ones hide.
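One way to switch targets without editing the config each time is to read the base URL from an environment variable. BASE_URL is an assumed variable name and http://localhost:3000 an assumed local address; neither comes from Playwright itself.

```typescript
// playwright.config.ts (fragment showing only the base URL)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    // Use the deployed URL when BASE_URL is set (for example in CI),
    // and fall back to the local dev server otherwise.
    baseURL: process.env.BASE_URL ?? 'http://localhost:3000',
  },
});
```

Locally you run npx playwright test as usual. In CI you set BASE_URL to your staging or preview address on the test step, and the same tests run against the deployed build.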

Build with Claude Code

Have Claude wire up the pipeline and the secrets, then verify it yourself by pushing. Prompt: "Set up my Playwright suite to run in GitHub Actions on every push and pull request. Read the test password from a repository secret called TEST_PASSWORD, and tell me exactly where in the GitHub UI to add that secret." Push a commit and watch the check run. The skill is confirming the gate actually blocks a bad merge. Try pushing a deliberate break and make sure CI catches it.

Exercises

Confirm a workflow file exists, or have one created, and push to trigger it.
Move your test password into a repository secret and read it via an environment variable.
Watch the suite run in CI and confirm a green check appears on your commit.
Push a deliberate break and confirm CI turns the check red and blocks the merge.

You've got it when…

Every push runs your smoke suite automatically, secrets live outside your code, and a red suite blocks a merge without anyone having to remember to check.

Capstone

Guard a Real App

a live smoke suite on one of your projects

The brief

A smoke suite that runs itself

Take one app you actually care about: a prototype, a side project, something you've already shipped. Build a smoke suite of three to five tests covering its critical flows, make it stable across repeated runs, and wire it to run on every push so a broken build can never merge quietly. Then the project defends itself.

The process:

Name the three to five flows that would embarrass you if they broke, and the visible proof of success for each.
Install Playwright and get the example test passing locally.
Write one test per flow using role, label, and text locators: navigate, act, assert, nothing more.
Break each flow in the app and confirm the matching test catches it.
Remove every fixed-time wait and run the suite five times to prove it is stable.
Wire it into CI on every push, with the test password stored as a secret.
Push a deliberate break and confirm the suite blocks the merge, then fix it green.

You're done when a broken core flow can no longer reach your users without your suite turning red first, and you trust that red enough to stop and fix it.

Ref

Playwright Vocabulary Cheat Sheet

keep this open while you write your first suite

Term	What it means
smoke test	A few wide, shallow checks that confirm the app is alive and core paths work.
spec file	A file containing tests, usually ending in .spec.ts.
test()	Defines one test scenario. Takes a name and an async function.
test.describe()	Groups related tests together under a shared name.
beforeEach()	Runs before every test in a group, used to set a known starting state.
page	One browser tab you control. Your hands on the app.
goto()	Navigate the page to a URL.
await	Wait for a browser step to finish before the next line runs.
locator	A way to point at an element on the page.
getByRole()	Find an element by its role and accessible name. The preferred locator.
getByLabel()	Find a form field by its visible label text.
getByText()	Find an element by the text it displays.
getByTestId()	Find an element by a data-testid you added on purpose.
expect()	The assertion. Where a test passes or fails.
toBeVisible()	Asserts an element is present and visible, auto-waiting until it is.
toHaveURL()	Asserts the page is at a given URL.
auto-wait	Playwright retrying an action or assertion until it succeeds or times out.
flaky test	A test that passes and fails inconsistently with no real change.
trace	A recorded replay of a test run: actions, screenshots, and network.
codegen	A tool that writes locators and steps as you click through your app.
headless	Running the browser with no visible window. The CI default.
CI	Continuous integration. A service that runs your tests automatically.
required check	A CI result that must pass before a pull request can merge.
repository secret	An encrypted value like a password, stored in CI settings, never in code.