·8 min read
Playwright Best Practices: 10 Rules AI Agents Get Wrong (2026)
The 10 Playwright best practices for stable tests in 2026, and the ones AI code agents like Copilot and Cursor get wrong.

Published: · 5 min read
Playwright CLI v0.1.10 introduces a spec-driven testing skill that guides AI agents through plan/generate/heal workflows for maintaining test suites from written specifications. Network inspection now uses stable request indexing, and raw output is default for all data-fetching commands—eliminating preprocessing steps in CI pipelines.
Playwright CLI v0.1.10 shipped on April 30th with two headline features that reshape how AI agents interact with browser automation. The network inspection subsystem got a complete overhaul—network is gone, replaced by requests plus granular subcommands that output indexed, pipe-friendly data. But the feature I'm actually excited about is the spec-driven testing skill: a references/spec-driven-testing.md reference that teaches AI agents to drive Playwright tests from written specifications.
This matters because I spend my days building test infrastructure that scales. At CooperVision, I oversaw a 300% increase in test count while hitting 50% faster deployments. That didn't happen by writing more tests manually—it happened by building workflows that let the system do the repetitive work. The spec-driven testing skill is exactly that kind of workflow accelerator.
If you're running AI-augmented QA workflows, you know the problem: agents can generate test code, but without a structured pattern for keeping that code alive, regressions pile up and the suite rots. The spec-driven testing skill provides that structure.
The references/spec-driven-testing.md file outlines a plan/generate/heal cycle. An agent reads a written spec, plans which Playwright assertions map to the spec's behavior, generates the corresponding test code, then heals regressions when the spec changes or the app drifts. This isn't hypothetical—I've seen this pattern work at scale when migrating from Selenium to Playwright, where we achieved 40% faster test execution.
For CI engineers, the network inspection overhaul matters more immediately. The old network command inlined bodies and required brittle string parsing. The new numbered commands (requests, request <num>, request-headers <num>, request-body <num>) output stable indexes and pipe-friendly data. You can pipe directly to jq without stripping wrapper text.
Here's the spec-driven testing workflow in practice. The skill lives at references/spec-driven-testing.md and gets loaded automatically when you're running Playwright CLI with agent-mode enabled:
# Start Playwright CLI with agent mode
npx playwright-cli --agent
# The spec-driven testing skill is loaded from references/
# Agent can now read a spec and generate tests following the pattern:
# 1. PLAN: Map spec requirements to Playwright selectors/assertions
# 2. GENERATE: Write test code
# 3. HEAL: Detect and fix regressions when specs/app change
For the network inspection overhaul, here's a working example:
# List all requests with stable numbered indexes
playwright-cli requests
# Get full details for request #5
playwright-cli request 5
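# Extract headers and pipe to jq for CI processing
# (the key contains a hyphen, so jq needs bracket syntax; '.set-cookie' is parsed as a subtraction and fails)
playwright-cli request-headers 5 | jq '.["set-cookie"]'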
# Save response body directly to file
playwright-cli response-body 5 --filename ./debug-response.json
# Run a test that uses the extracted data
playwright-cli test --config ./playwright.config.ts
The raw output change is significant: data-fetching commands like cookie-list, localstorage-list, and route-list now emit unwrapped output by default. Your CI scripts drop 1-2 preprocessing steps per invocation.
The spec-driven testing skill is read-only in this release. You get the reference file and the pattern, but there's no built-in mechanism for the agent to automatically detect spec changes and trigger heals. You have to wire that yourself. In my experience, the first implementation is always manual—you define the trigger conditions, the diff logic, the heal policy. The skill gives you the pattern; you build the automation.
To be candid: don't expect autonomous test maintenance out of the box. Plan for 2-4 weeks of integration work to wire the heal step into your CI pipeline, depending on your test suite size and how stable your application's contract is.
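As a starting point for the trigger condition, here is a minimal sketch of spec-change detection. It is not part of Playwright CLI: the function names, the state file, and the example paths are all hypothetical, and it only answers "did the spec change since the last run?" by comparing SHA-256 digests. What you do when it fires (the heal policy) is still up to you.

```shell
#!/usr/bin/env bash
# Hypothetical heal trigger: nothing here is provided by playwright-cli.

# Print the SHA-256 digest of a file, using whichever checksum tool exists.
spec_digest() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# spec_changed SPEC_FILE STATE_FILE
# Returns 0 and records the new digest when the spec differs from the stored
# digest (or no digest is stored yet), 1 when unchanged, 2 when SPEC_FILE is missing.
spec_changed() {
  local spec="$1"
  local state="$2"
  local current=""
  local previous=""
  [ -f "$spec" ] || return 2
  current="$(spec_digest "$spec")"
  if [ -f "$state" ]; then
    previous="$(cat "$state")"
  fi
  if [ "$current" != "$previous" ]; then
    printf '%s\n' "$current" > "$state"
    return 0
  fi
  return 1
}

# Example usage (paths are illustrative):
#   if spec_changed specs/checkout.md .cache/checkout.sha256; then
#     echo "spec changed: schedule a heal run"
#   fi
```

A digest comparison is deliberately crude: it fires on whitespace-only edits too. That is usually acceptable for a first, manual implementation, and you can swap in a real diff once you know which kinds of spec changes should trigger a heal.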
Three concrete changes:
The network command is replaced. If you have scripts that parse network output, they need updating:
# Old: playwright-cli network
# New: playwright-cli requests
# Old: parse inline bodies from network output
# New: playwright-cli request-body <num> --filename ./body.txt
Data-fetching commands (cookie-list, route-list, etc.) now emit raw output. If you were stripping the ### Result wrapper, remove that logic—it's gone.
Relative paths for initPage and initScript now resolve against the config file's directory (matching Vite/Vitest/ESLint behavior). If you were working around silent load failures, those workarounds are no longer needed.
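If your CI runners will briefly run a mix of old and new CLI versions, a tolerant filter can bridge the gap before you delete the stripping logic entirely. This is a sketch under one assumption I have not verified against every command: that the old wrapper is a single leading line reading exactly "### Result". The function name is hypothetical.

```shell
# Transitional shim: drop a leading "### Result" line if one is present,
# and pass already-raw output through unchanged.
# Assumes the old wrapper is exactly one first line reading "### Result".
strip_result_wrapper() {
  awk 'NR == 1 && $0 == "### Result" { next } { print }'
}

# Example usage:
#   playwright-cli cookie-list | strip_result_wrapper > cookies.txt
```

Once every runner is on v0.1.10, the filter becomes a no-op and can be removed.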
Playwright CLI v0.1.10 is a meaningful release for AI QA architects and CI engineers alike. The spec-driven testing skill is the headline for teams adopting AI-augmented workflows—it provides the pattern, even if the automation is still DIY. The network inspection overhaul is the practical win for everyone else: stable indexing, pipe-friendly output, raw by default.
My recommendation: upgrade and pick up the MCP server stability fixes first (low risk, immediate benefit), then evaluate the network inspection changes for your CI scripts. The spec-driven skill is a longer-term investment: budget the integration time honestly, and you'll get the payoff.
If you're already running Playwright in your CI pipeline, this release makes it easier to script and maintain. If you're building AI-augmented QA workflows, the skill gives you a structure to teach your agents. Both audiences win.
Anton Gulin is an AI QA Architect — the first person to claim this title on LinkedIn. He builds AI-powered test automation systems where AI agents and human engineers collaborate on quality. Former Apple SDET, current Lead Software Engineer in Test at CooperVision. Find him at anton.qa or on LinkedIn.