Pa11y CI Integration for Automated WCAG Gating

This blueprint is part of Web Accessibility Testing Fundamentals & Tool Selection, and it shows how to wire pa11y-ci into a pipeline as a multi-URL accessibility gate. Pa11y is a thin orchestrator: it loads each URL in a headless browser and hands the rendered DOM to one or more rule engines — most usefully the same axe-core engine you may already run elsewhere, plus the HTML CodeSniffer (htmlcs) ruleset that maps directly to WCAG techniques. The value of pa11y-ci is that it batches many URLs under one threshold and a single exit code, which makes it a clean fit for sitemap-driven, page-based sites.

Pa11y sits between a fast unit-level check and a full audit. It is heavier than a single-component scan but lighter than a complete Lighthouse run, and it shines when you need to sweep dozens of routes against a shared budget. Where it overlaps with axe-core configuration and Lighthouse CI, the deciding factor is coverage breadth versus depth per page.

Key implementation targets:

  • A .pa11yci config defining runners, standard, and a violation threshold
  • A URL list sourced from a sitemap so new routes are covered automatically
  • Dual runners (axe + htmlcs) for complementary WCAG 2.2 rule coverage
  • A single exit code that gates the merge

[Figure: pa11y-ci data flow from config to threshold gate]
pa11y-ci expands sitemap URLs, runs each through the axe and htmlcs runners, aggregates violations, and compares against a threshold to set the exit code.

Prerequisites

1. Author the .pa11yci Config

The config file declares the conformance standard, the runners, and the defaults applied to every URL. Setting threshold allows a small, agreed number of known issues per URL while still failing on new ones — a pragmatic baseline for legacy sites. Running both axe and htmlcs widens coverage: axe-core is strong on computed-style checks like contrast (WCAG 2.2 SC 1.4.3), while htmlcs is strong on structural technique mapping.

{
  "defaults": {
    "standard": "WCAG2AA",
    "runners": ["axe", "htmlcs"],
    "threshold": 0,
    "timeout": 30000,
    "chromeLaunchConfig": {
      "args": ["--no-sandbox"]
    },
    "wait": 1000,
    "ignore": [
      "WCAG2AA.Principle1.Guideline1_4.1_4_3.G18.Fail"
    ]
  },
  "urls": [
    "http://localhost:3000/",
    "http://localhost:3000/pricing",
    "http://localhost:3000/contact"
  ]
}

2. Drive URLs From the Sitemap

Hardcoding URLs means new routes silently escape testing. Point pa11y-ci at the sitemap so every published route is swept automatically. The --sitemap flag fetches and expands the sitemap into the URL list at runtime.

# Scan every URL in the sitemap; --sitemap-find/replace rewrites prod hosts
# to the local preview host so the gate runs against the build under test.
npx pa11y-ci \
  --sitemap http://localhost:3000/sitemap.xml \
  --sitemap-find "https://www.example.com" \
  --sitemap-replace "http://localhost:3000" \
  --config .pa11yci
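
To make the host rewrite concrete, the sketch below approximates what the sitemap expansion does: pull each <loc> entry out of the XML and swap the production host for the preview host. It is an illustration only, not pa11y-ci's actual implementation. The expandSitemap helper and the sample XML are hypothetical, and treating the find value as a regular expression is an assumption.

```javascript
// Illustrative only: approximates sitemap expansion with a find/replace host rewrite.
// expandSitemap is a hypothetical helper, not part of pa11y-ci.
function expandSitemap(xml, find, replace) {
  const urls = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match;
  while ((match = locPattern.exec(xml)) !== null) {
    // Assumption: `find` is treated as a regular expression source string.
    urls.push(match[1].replace(new RegExp(find), replace));
  }
  return urls;
}

const sampleXml = `
<urlset>
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/pricing</loc></url>
</urlset>`;

const rewritten = expandSitemap(
  sampleXml,
  "https://www\\.example\\.com",
  "http://localhost:3000"
);
console.log(rewritten);
```

Every URL that comes out of the rewrite should point at the preview host; a production host left in the list means the gate is scanning the live site rather than the build under test.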

3. Set Thresholds and Interpret Exit Codes

pa11y-ci returns a non-zero exit code when any URL exceeds its threshold, which is what the CI job keys on. Keep the global threshold at 0 for new projects so every violation blocks; for legacy code carrying debt, raise it per URL and ratchet it down over time rather than ignoring whole rules.

# Write JSON results to a file for downstream annotation.
mkdir -p reports
npx pa11y-ci --config .pa11yci --json > reports/pa11y.json
echo "pa11y-ci exit code: $?"   # 0 = under threshold on all URLs, 2 = exceeded
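
One way to express the per-URL ratchet is to generate the config from a small table of agreed baselines. This is a minimal sketch, assuming pa11y-ci accepts URL entries as objects whose fields override the defaults. The buildConfig helper and the baselines table are hypothetical.

```javascript
// Hypothetical generator: builds a pa11y-ci config object with per-URL thresholds.
// Assumes URL entries may be objects ({ url, threshold }) that override defaults.
function buildConfig(baseUrl, paths, baselines = {}) {
  return {
    defaults: { standard: "WCAG2AA", runners: ["axe", "htmlcs"], threshold: 0 },
    urls: paths.map((path) => {
      const url = new URL(path, baseUrl).href;
      const allowed = baselines[path] ?? 0;
      // Plain strings inherit the global threshold; objects carry an override.
      return allowed > 0 ? { url, threshold: allowed } : url;
    }),
  };
}

const config = buildConfig("http://localhost:3000", ["/", "/pricing", "/contact"], {
  "/pricing": 4, // legacy page: four known issues, lower this each sprint
});
console.log(JSON.stringify(config, null, 2));
```

Writing the result to .pa11yci keeps the baselines in version control, so lowering a number becomes a reviewable change.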

Pipeline Integration

Serve the build, run pa11y-ci, archive the JSON, and let the exit code gate the merge. The job below assumes a static build served on port 3000.

name: pa11y
on: pull_request
jobs:
  pa11y-ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20', cache: 'npm' }
      - run: npm ci
      - run: npm run build
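      - run: npx serve dist --listen 3000 &   # serve the build under test
      - run: npx wait-on http://localhost:3000/sitemap.xml   # assumes the wait-on package; blocks until the server responds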
      - name: Run pa11y-ci gate
        run: npx pa11y-ci --sitemap http://localhost:3000/sitemap.xml --json > pa11y.json
      - uses: actions/upload-artifact@v4
        if: always()
        with: { name: pa11y-report, path: pa11y.json }
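
The archived JSON can also feed pull-request annotations. The sketch below turns a report into GitHub Actions "::error" workflow commands. It assumes the report has the shape { results: { url: [issue, ...] } } with code, message, and selector on each issue, so check that against the output of your pa11y-ci version. The toAnnotations helper and the sample report are hypothetical.

```javascript
// Hypothetical helper: converts a pa11y-ci JSON report into GitHub Actions
// "::error" workflow commands, one per issue.
// Assumed report shape: { results: { "<url>": [{ code, message, selector }, ...] } }
const escapeData = (s) =>
  String(s).replace(/%/g, "%25").replace(/\r/g, "%0D").replace(/\n/g, "%0A");
const escapeProperty = (s) =>
  escapeData(s).replace(/:/g, "%3A").replace(/,/g, "%2C");

function toAnnotations(report) {
  const lines = [];
  for (const [url, issues] of Object.entries(report.results ?? {})) {
    for (const issue of issues) {
      const title = escapeProperty(`${issue.code ?? "error"} on ${url}`);
      const body = escapeData(`${issue.message} (${issue.selector})`);
      lines.push(`::error title=${title}::${body}`);
    }
  }
  return lines;
}

const sampleReport = {
  results: {
    "http://localhost:3000/": [
      { code: "image-alt", message: "Images must have alternate text", selector: "img.hero" },
    ],
  },
};
const annotations = toAnnotations(sampleReport);
annotations.forEach((line) => console.log(line));
```

Printing these lines from a step that runs after the gate surfaces each finding in the pull request without changing the exit code that blocks the merge.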

For per-page severity baselines that tighten over sprints, route the threshold strategy through Progressive Threshold Management. To convert a threshold breach into a soft warning during onboarding, see Auto-Fail vs Warning Workflows. Two companion pages go deeper: a head-to-head on axe-core vs Lighthouse CI for PR gating and a step-by-step on migrating from pa11y to axe-core in CI.

Troubleshooting & Flaky-Test Mitigation

The most common flake is scanning before the page settles. Raise the per-URL wait, or add an actions sequence that waits for a stable selector before the scan begins. Timeouts on heavy routes are the second issue: bump timeout rather than dropping the URL. Duplicate violations across the two runners are expected, because axe and htmlcs sometimes flag the same WCAG 2.2 SC 1.3.1 issue with different rule IDs; deduplicate in your reporting layer, not by disabling a runner.
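
Deduplication in the reporting layer can key each finding on the element and the success criterion it relates to. This is a minimal sketch, assuming each issue carries code, selector, and runner fields. The dedupeIssues helper is hypothetical, and the axe rule map is a partial, illustrative table rather than a complete mapping.

```javascript
// Hypothetical reporting helper: merges findings that axe and htmlcs both raise
// for the same element and success criterion.
// Assumes each issue has `code`, `selector`, and `runner` fields.
const AXE_RULE_TO_SC = { "color-contrast": "1.4.3", "image-alt": "1.1.1" }; // partial, illustrative

function successCriterion(issue) {
  if (issue.runner === "htmlcs") {
    // htmlcs codes look like WCAG2AA.Principle1.Guideline1_4.1_4_3.G18.Fail
    const part = issue.code.split(".")[3] ?? "";
    return part.replace(/_/g, ".");
  }
  return AXE_RULE_TO_SC[issue.code] ?? issue.code; // fall back to the rule id
}

function dedupeIssues(issues) {
  const merged = new Map();
  for (const issue of issues) {
    const key = `${issue.selector}|${successCriterion(issue)}`;
    const entry = merged.get(key) ?? { ...issue, runners: [] };
    entry.runners.push(issue.runner);
    merged.set(key, entry);
  }
  return [...merged.values()];
}

const findings = dedupeIssues([
  { runner: "axe", code: "color-contrast", selector: "p.note" },
  { runner: "htmlcs", code: "WCAG2AA.Principle1.Guideline1_4.1_4_3.G18.Fail", selector: "p.note" },
  { runner: "axe", code: "image-alt", selector: "img.hero" },
]);
console.log(findings.map((f) => `${f.selector}: ${f.runners.join("+")}`));
```

The two runners may describe the same element with different selectors, so treat the merged count as a reporting aid and keep the raw report as the source of truth for the gate.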

If Chromium fails to launch in CI with a sandbox error, the --no-sandbox arg in chromeLaunchConfig resolves it for unprivileged runners; prefer fixing runner privileges where you can.

Common Pitfalls

  • Hardcoded URL lists: New routes escape testing. Drive from the sitemap so coverage tracks the site.
  • Ignoring whole rules to pass: ignore silences a rule everywhere, masking real regressions. Prefer per-URL thresholds.
  • Scanning file paths: Pa11y needs an HTTP(S) URL. Serve the build first.
  • Single runner by default: Running only one engine narrows coverage. Use axe + htmlcs together.
  • No wait on dynamic pages: Scanning pre-hydration yields phantom or missed violations. Wait for a stable landmark.
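
For the last pitfall, Pa11y's actions option can hold the scan until a landmark is present. This is a minimal sketch that builds such a URL entry. The withStableWait helper is hypothetical, and the main#content selector is an assumption about the page under test.

```javascript
// Hypothetical helper: wraps a URL in a pa11y-ci URL object whose actions
// wait for a landmark to become visible before the scan runs.
function withStableWait(url, selector = "main") {
  return {
    url,
    actions: [`wait for element ${selector} to be visible`],
  };
}

const entry = withStableWait("http://localhost:3000/pricing", "main#content");
console.log(JSON.stringify(entry));
```

An entry like this can replace the plain string for a dynamic route in the urls list, while static routes stay as plain strings.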

FAQ

How is Pa11y different from running axe-core directly?

Pa11y is an orchestrator that drives a headless browser across many URLs and can run the axe-core engine and the HTML CodeSniffer engine together under one threshold and exit code. Running axe-core directly gives you finer control per page and richer rule configuration, but you manage the multi-URL batching yourself. Pa11y trades some configurability for batch convenience.

Which runner should I enable: axe, htmlcs, or both?

Both, in most cases. The axe runner excels at computed checks like color contrast (WCAG 2.2 SC 1.4.3) and ARIA state, while htmlcs maps closely to documented WCAG techniques and catches some structural issues differently. Together they widen coverage; deduplicate overlapping findings in your reporting rather than dropping a runner.

Where does Pa11y fit next to Lighthouse CI?

Pa11y is well suited to sweeping many routes against a shared violation budget, while Lighthouse CI gives a single weighted accessibility score per page plus performance signals. Many teams gate broad route coverage with Pa11y or axe-core and reserve Lighthouse for a representative set of key pages.
