Caching axe-core Browser Binaries in CI Containers

This guide is part of Docker-Based Pipeline Execution, and it targets the slowest, most wasteful part of a containerized accessibility scan: re-downloading a ~150 MB browser binary on every job. When axe-core runs through Playwright or Puppeteer, the browser engine that renders the DOM is a large binary that rarely changes — yet a naive pipeline fetches it cold each run. Baking it into a cached layer (or a restored CI cache volume) typically removes 30–90 seconds from every scan.

Baseline controls this page establishes:

  • A deterministic cache key derived from the lockfile and browser version
  • Browser binaries and the axe-core build baked into a stable image layer
  • A CI cache volume for the Playwright/Puppeteer download directory
  • Explicit invalidation rules so a browser upgrade still refreshes the cache

Figure: Browser-binary cache decision flow (cache key → cache lookup → hit: restore binaries; miss: download ~150 MB and repopulate → axe-core scan runs).
A cache key built from the lockfile and browser version decides whether binaries are restored (hit) or downloaded and repopulated (miss) before the scan runs.

Root Cause / Context

Playwright and Puppeteer do not bundle their browser engines inside the npm package; they download them separately into a cache directory (on Linux, ~/.cache/ms-playwright for Playwright and ~/.cache/puppeteer for Puppeteer). In a fresh CI container, that directory is empty, so npx playwright install re-fetches the full browser every run. The download dominates the scan job’s wall-clock time, often exceeding the scan itself.
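
To make the cache location concrete, the sketch below resolves the Playwright browser directory from the platform and environment. The function name and parameter shape are illustrative, the per-platform defaults follow Playwright's documented locations, and any other environment overrides are ignored here.

```javascript
const path = require('node:path');

// Default Playwright browser cache locations per platform; an explicit
// PLAYWRIGHT_BROWSERS_PATH overrides them. The special value "0" (browsers
// installed inside node_modules) is not handled in this sketch.
function playwrightCacheDir({ platform, home, env = {} }) {
  if (env.PLAYWRIGHT_BROWSERS_PATH) return env.PLAYWRIGHT_BROWSERS_PATH;
  switch (platform) {
    case 'darwin':
      return path.posix.join(home, 'Library', 'Caches', 'ms-playwright');
    case 'win32':
      return path.win32.join(home, 'AppData', 'Local', 'ms-playwright');
    default: // linux and other POSIX systems
      return path.posix.join(home, '.cache', 'ms-playwright');
  }
}

console.log(playwrightCacheDir({ platform: 'linux', home: '/root' }));
// /root/.cache/ms-playwright
```

Whatever path this resolves to in your CI image is the directory to cache or bake into a layer.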

The mistake most teams make is keying the cache on the wrong thing. A key built purely from the lockfile hash does not name the browser at all, so a Playwright patch bump that changes the bundled Chromium can silently reuse the wrong binary whenever the hashed lockfile fails to capture that bump; using no stable key at all means the cache never restores. The correct key combines the resolved Playwright version (which pins the bundled browser revision) with the lockfile hash, so the cache invalidates whenever the Playwright version or the dependency set changes and is reused otherwise.
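
As a minimal sketch of that key construction, the function below joins the runner OS, the Playwright version, and a lockfile hash. The function name, parameter names, and the truncated SHA-256 are assumptions for illustration, not a format any CI provider requires.

```javascript
const crypto = require('node:crypto');

// Build a cache key from the runner OS, the resolved Playwright version, and a
// hash of the lockfile contents. All parameter names here are illustrative.
function browserCacheKey({ os, playwrightVersion, lockfileContents }) {
  const lockHash = crypto
    .createHash('sha256')
    .update(lockfileContents)
    .digest('hex')
    .slice(0, 16); // a short prefix keeps the key readable
  return `pw-${os}-${playwrightVersion}-${lockHash}`;
}

// Hypothetical lockfile contents, kept identical across both keys.
const lock = '{"packages":{"node_modules/playwright":{"version":"1.44.0"}}}';
const before = browserCacheKey({ os: 'Linux', playwrightVersion: '1.44.0', lockfileContents: lock });
const after = browserCacheKey({ os: 'Linux', playwrightVersion: '1.44.1', lockfileContents: lock });

console.log(before === after); // false: a version bump alone changes the key
```

Because the version is its own key segment, the key changes on a Playwright bump even if the hashed file did not.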

Configuration

First, bake the browser into an image layer that is invalidated only when package.json or package-lock.json changes (which includes any Playwright version bump), not by source edits. axe-core is installed into node_modules, so the same install layer covers it.

# syntax=docker/dockerfile:1.7
FROM node:20.11.1-bookworm-slim AS scanner

ENV DEBIAN_FRONTEND=noninteractive
# Keep Playwright's browser cache at a fixed, cacheable path.
ENV PLAYWRIGHT_BROWSERS_PATH=/ms-playwright

WORKDIR /app
# Manifests first: this layer (and the browser download below) stays cached
# until package-lock.json changes.
COPY package.json package-lock.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci   # installs @axe-core/playwright + playwright + axe-core build

# Download ONLY the browser we scan with, into the fixed cache path.
RUN npx playwright install --with-deps chromium

COPY . .
RUN npm run build
CMD ["node", "scripts/axe-scan.mjs"]

For CI providers that cache directories between jobs (rather than rebuilding the image each time), restore the Playwright cache with a version-aware key. The example below is GitHub Actions; the key construction transfers to any provider.

- uses: actions/checkout@v4
- uses: actions/setup-node@v4
  with: { node-version: '20', cache: 'npm' }
- run: npm ci
# Resolve the installed Playwright version, which pins the bundled browser revision.
- id: pw
  run: echo "ver=$(node -e "console.log(require('playwright/package.json').version)")" >> "$GITHUB_OUTPUT"
- name: Restore Playwright browser cache
  id: pw-cache
  uses: actions/cache@v4
  with:
    path: ~/.cache/ms-playwright
    # Invalidates only when the Playwright version OR the lockfile changes.
    key: pw-${{ runner.os }}-${{ steps.pw.outputs.ver }}-${{ hashFiles('package-lock.json') }}
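- name: Install browser only on cache miss
  if: steps.pw-cache.outputs.cache-hit != 'true'
  run: npx playwright install --with-deps chromium   # skipped on a hit
# OS libraries are not part of the cached directory, so install them on a hit.
- name: Install OS dependencies on cache hit
  if: steps.pw-cache.outputs.cache-hit == 'true'
  run: npx playwright install-deps chromium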
- name: Run axe-core scan
  run: node scripts/axe-scan.mjs
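
On providers that restore a directory but expose no cache-hit output, the same hit-or-miss decision can be approximated by inspecting the restored directory. The sketch below is one way to do that under stated assumptions: the helper names are hypothetical, and it relies on Playwright naming Chromium directories chromium-<revision>.

```javascript
const fs = require('node:fs');
const { execFileSync } = require('node:child_process');

// Returns true when the restored cache directory already holds a Chromium build
// (Playwright names these directories "chromium-<revision>").
function hasCachedChromium(cacheDir) {
  if (!fs.existsSync(cacheDir)) return false;
  return fs.readdirSync(cacheDir).some((name) => /^chromium-\d+$/.test(name));
}

// Download only on a miss; `run` is injectable so the decision can be tested
// without touching the network.
function ensureChromium(cacheDir, run = execFileSync) {
  if (hasCachedChromium(cacheDir)) return 'hit';
  run('npx', ['playwright', 'install', 'chromium'], { stdio: 'inherit' });
  return 'miss';
}
```

A CI script would call ensureChromium with the cached path before the scan. Note that this checks only for the presence of a Chromium directory, not that its revision matches the installed Playwright version, so it is weaker than a version-aware key.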

Validation

Prove the cache is working by checking the restore step’s outcome and confirming the browser was not re-downloaded. On a hit, the browser download step is skipped entirely.

# In the job log, a cache hit looks like:
#   Cache restored from key: pw-Linux-1.44.0-3f9c...e1
#   Restore Playwright browser cache: cache-hit = true
#   Install browser only on cache miss: skipped

# Locally, verify the binary is present without re-downloading:
ls "$HOME/.cache/ms-playwright" | grep chromium   # e.g. chromium-1124
npx playwright install --dry-run chromium         # prints the install location and download URL without downloading

To confirm the scan still uses the cached engine and produces WCAG results:

// scripts/axe-scan.mjs — minimal axe-core run proving the cached browser works.
import { chromium } from 'playwright';
import { AxeBuilder } from '@axe-core/playwright';

const browser = await chromium.launch(); // uses the CACHED binary, no download
const page = await browser.newPage();
await page.goto(process.env.SCAN_URL ?? 'http://localhost:3000');
await page.waitForLoadState('networkidle'); // stable DOM before evaluating

const results = await new AxeBuilder({ page })
  .withTags(['wcag2a', 'wcag2aa', 'wcag21a', 'wcag21aa', 'wcag22aa']) // WCAG 2.0, 2.1, and 2.2 A/AA rules
  .analyze();

console.log(`violations: ${results.violations.length}`);
await browser.close();
process.exit(results.violations.length > 0 ? 1 : 0); // non-zero fails the gate

Edge Cases & Conditional Guards

  • Playwright version bumps: Even a patch or minor Playwright upgrade can change the bundled Chromium revision. Because the cache key includes the resolved Playwright version, the cache invalidates automatically, so never hardcode a static key.
  • Multiple browsers: If you scan in Firefox or WebKit too, install only the engines you use; each adds download weight. Restrict to chromium unless cross-browser coverage is a stated requirement.
  • Shared cache poisoning: When several pipelines share one cache backend, scope the key with runner.os and the project name so an unrelated job cannot restore an incompatible binary.
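
To guard against a restored binary that no longer matches the installed driver, a job can compare the cached directory names with the revision Playwright expects. The sketch below assumes playwright-core ships a browsers.json manifest whose entries carry name and revision fields; verify that layout against your installed version. The function name and the sample revisions are hypothetical.

```javascript
// Given the parsed contents of Playwright's browsers.json and the directory
// names found in the cache, report whether the expected Chromium is cached.
function cachedRevisionMatches(browsersJson, cacheDirNames) {
  const entry = browsersJson.browsers.find((b) => b.name === 'chromium');
  if (!entry) return false;
  return cacheDirNames.includes(`chromium-${entry.revision}`);
}

// Hypothetical data: the revision numbers are made up for illustration.
const manifest = { browsers: [{ name: 'chromium', revision: '1124' }] };
console.log(cachedRevisionMatches(manifest, ['chromium-1124'])); // true
console.log(cachedRevisionMatches(manifest, ['chromium-1117'])); // false
```

A false result is a signal to treat the restore as a miss and reinstall rather than launch a mismatched browser.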

Pipeline Impact

Caching does not change the scan’s exit code semantics — a non-zero exit on violations still gates the merge — but it materially shortens the job, which keeps the accessibility check off the critical path of the pull request. Faster scans mean reviewers see results sooner and are less tempted to mark the check non-required. For severity-aware gating on top of the cached scan, combine this with Progressive Threshold Management so the saved time is spent on tightening baselines rather than waiting on downloads.

Common Pitfalls

  • Lockfile-only cache key: Can miss browser-version drift, so a Playwright bump may reuse a stale binary. Include the resolved Playwright version in the key.
  • Caching the whole node_modules for the browser: Browsers live in a separate cache dir, not node_modules. Cache ~/.cache/ms-playwright explicitly.
  • COPY . . before install in the image: Busts both the dependency and browser-download layers on every source edit. Copy manifests first.
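  • Running playwright install unconditionally: Adds install work to every job even on a hit (the download itself is skipped when the binaries are already present). Guard the download step with if: steps.pw-cache.outputs.cache-hit != 'true'.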
  • Unbounded cache growth: Old browser revisions accumulate. Periodically rotate the cache namespace to evict stale binaries.
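
For the cache-growth pitfall on a persistent cache volume, one option is to list browser directories that the current install no longer needs and delete only those. This is a sketch under assumptions: the function name is hypothetical, the revisions are invented, and it treats any "<name>-<revision>" directory as a Playwright browser directory.

```javascript
// List cache entries whose browser revision is no longer wanted. `keep` holds
// directory names to retain, e.g. ["chromium-1124"]; everything else that looks
// like a browser directory ("<name>-<revision>") is reported as stale.
function staleBrowserDirs(cacheDirNames, keep) {
  const wanted = new Set(keep);
  return cacheDirNames.filter(
    (name) => /^[a-z_]+-\d+$/.test(name) && !wanted.has(name)
  );
}

console.log(
  staleBrowserDirs(
    ['chromium-1117', 'chromium-1124', 'ffmpeg-1009'],
    ['chromium-1124', 'ffmpeg-1009']
  )
);
// [ 'chromium-1117' ]
```

Names that do not look like browser directories are never reported, so unrelated files in the volume are left alone.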

FAQ

Where do Playwright and Puppeteer actually store their browser binaries? On Linux, Playwright stores them in ~/.cache/ms-playwright (or in PLAYWRIGHT_BROWSERS_PATH if set) and Puppeteer uses ~/.cache/puppeteer. Those directories, not node_modules, are what you must cache. axe-core itself lives in node_modules and is covered by your normal dependency cache.

What belongs in the cache key? The runner OS, the resolved browser-driver version (e.g. the Playwright package version), and the lockfile hash. That combination invalidates the cache exactly when the binary or dependency set changes and reuses it otherwise, avoiding both stale binaries and needless downloads.

Does caching the browser affect scan accuracy? No, as long as the cached binary matches the pinned driver version. Caching only avoids the download; the same browser build renders the DOM, so WCAG 2.2 contrast, focus, and labeling evaluations match those of an uncached run.