8 min read · OpenHermit Team
Tags: Audit, Agent-Ready, WebMCP

Website Agent-Readiness Audit 2026: The 15-Point Technical Checklist



title: "Website Agent-Readiness Audit 2026: The 15-Point Technical Checklist"
description: "Google Lighthouse now scores agent-readiness. Use this 15-point technical audit to evaluate your website's machine-legibility, protocol support, and structured data before AI agents visit."
publishedAt: 2026-05-21
author: "OpenHermit Team"
tags: ["Agent-Ready", "WebMCP", "Lighthouse", "Structured Data", "llms.txt"]

📋 LLM ABSTRACT

Google Lighthouse 13.3 shipped an Agentic Browsing category in May 2026, measuring how well sites are structured for machine interaction. The category checks llms.txt presence, WebMCP tool registration, accessibility tree quality, and layout stability. Chrome and Edge together cover over 85% of the browser market, meaning 2.1 billion browser users encounter WebMCP-capable browsers today. In 2026, the primary value of well-implemented JSON-LD is AI visibility across ChatGPT, Perplexity, and other agents that parse structured data when evaluating sites for citations. (Sources: SearchEngineLand May 20, StudioMeyer 2026, SEO Strategy Ltd 2026)

Note: OpenHermit builds agent-native infrastructure for autonomous agents. This audit focuses on the technical foundations that make sites machine-legible — the layer below full WebMCP integration.

19 checks

Agent-Readiness Dimensions

xSeek's 2026 framework measures readiness across 5 categories: robots.txt awareness, content signals, format negotiation, protocol discovery, commerce support.

89%

Token Savings With WebMCP

Agents no longer need to process massive screenshots or DOM trees when sites expose structured tools. (Source: Digital Loop, 2026)

40%

E-commerce Still Standardizing

40% of eCommerce businesses are standardizing product pages for agentic AI, while 33% haven't started. (Source: Digital Applied, early 2026)

The Machine-Legibility Gap

Your website was designed for humans. Clean typography, intuitive navigation, compelling imagery — all optimized for eyes and mouse clicks. But when ChatGPT Operator, Claude, or Perplexity's shopping agent visits your site to complete a task, they don't see your design. They parse your accessibility tree, extract JSON-LD entities, and attempt to map DOM elements to actionable workflows.

Unlike search engine crawlers that gather data for ranking, AI agents are task-oriented virtual visitors using virtual browsers to complete research, submit forms, or purchase products (Source: Orbit Media, 2026). The difference between a site that agents can use and one they abandon comes down to 15 specific technical signals.

For broader context on agent-ready website architecture, see our implementation overview.

The 15-Point Agent-Readiness Audit

Category 1: Robots.txt & Bot Awareness (Checks 1-3)

A valid robots.txt with AI bot rules (GPTBot, ClaudeBot, Google-Extended, PerplexityBot) and sitemap directives is foundational. If your robots.txt has Disallow: / under the wildcard User-agent: * and no more specific group for a given bot, compliant AI bots cannot crawl your domain (Source: Higoodie, 2026).
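The wildcard rule above can be checked locally. This is a minimal sketch using Python's standard urllib.robotparser; the helper name audit_robots, the bot list, and the sample robots.txt are illustrative assumptions, and the exact crawler tokens should be confirmed against each vendor's documentation.

```python
from urllib.robotparser import RobotFileParser

# Illustrative list; confirm the exact user-agent tokens with each vendor.
AI_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended", "PerplexityBot"]

# Hypothetical robots.txt: everything blocked by default, GPTBot allowed.
SAMPLE = """\
User-agent: *
Disallow: /

User-agent: GPTBot
Allow: /
"""


def audit_robots(robots_txt: str, bots=AI_BOTS, path: str = "/") -> dict:
    """Return, for each bot, whether robots.txt lets it fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in bots}
```

With the sample above, audit_robots(SAMPLE) reports GPTBot as allowed and the other three as blocked, because they fall back to the wildcard group.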

AI Bot Allow Rules: explicitly permit the crawlers you want to reach your content. Sitemap Presence: a sitemap lets agents discover your content structure; validate that the XML sitemap exists and that its <lastmod> dates fall within the last 30 days.
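One way to test the 30-day freshness rule is to parse the sitemap and flag stale entries. The sketch below uses only the standard library; the helper name stale_urls and the sample sitemap are assumptions, and it treats a missing or unparseable <lastmod> as stale.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap with one fresh, one old, and one undated URL.
SAMPLE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2026-05-10</lastmod></url>
  <url><loc>https://example.com/old</loc><lastmod>2025-11-02T08:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/undated</loc></url>
</urlset>
"""


def stale_urls(sitemap_xml: str, today: date, max_age_days: int = 30) -> list:
    """Return <loc> values whose <lastmod> is missing, unparseable, or too old."""
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for url in ET.fromstring(sitemap_xml).findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS).strip()
        try:
            # Only the date part matters here; a time and offset may follow it.
            modified = date.fromisoformat(lastmod[:10])
        except ValueError:
            stale.append(loc)
            continue
        if modified < cutoff:
            stale.append(loc)
    return stale
```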

Category 2: Structured Data (Checks 4-7)

Organization Schema: When ChatGPT browses your website, it parses your JSON-LD. When Perplexity retrieves your page as a citation source, it extracts structured data (Source: SEO Strategy Ltd, 2026). Use Google's Rich Results Test to verify Organization type includes name, url, logo, and knowsAbout property to establish topical authority (Source: Ignite Visibility, 2026).

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Corp",
  "url": "https://acme.com",
  "logo": "https://acme.com/logo.png",
  "knowsAbout": ["AI agents", "WebMCP"]
}
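Rich Results Test remains the reference validator, but a quick local check can catch missing properties before deployment. The following sketch extracts JSON-LD from a page with the standard html.parser module and reports which of the four properties named above are absent; the class and function names are hypothetical, and it only inspects top-level Organization nodes.

```python
import json
from html.parser import HTMLParser

REQUIRED = ("name", "url", "logo", "knowsAbout")


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._inside = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._inside:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False


def missing_org_fields(html: str) -> list:
    """Return required properties absent from the first Organization node."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    for raw in extractor.blocks:
        try:
            node = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON is skipped here, not diagnosed
        if isinstance(node, dict) and node.get("@type") == "Organization":
            return [key for key in REQUIRED if not node.get(key)]
    return list(REQUIRED)  # no Organization node found at all
```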

Product/Service Schema: The offers property with pricing is critical for AI agents making purchasing recommendations — agents need pricing context to make comparisons (Source: SEO Strategy Ltd, 2026). ProductGroup schema handles variants via variesBy (Source: Ignite Visibility, 2026).
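As a rough illustration of the pricing context described above, this sketch serializes a minimal schema.org Product with a single Offer. The function name product_jsonld and the sample values are assumptions; real catalogs will usually need more properties (images, identifiers, variants).

```python
import json


def product_jsonld(name: str, url: str, price: str, currency: str) -> str:
    """Serialize a minimal schema.org Product with a single priced Offer."""
    node = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "url": url,
        "offers": {
            "@type": "Offer",
            "price": price,
            "priceCurrency": currency,  # ISO 4217 code, e.g. "USD"
            "availability": "https://schema.org/InStock",
        },
    }
    return json.dumps(node, indent=2)
```

For example, product_jsonld("Widget", "https://acme.com/widget", "19.99", "USD") yields a block that can be placed in its own JSON-LD script tag.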

Article Schema: Every blog post should include headline, datePublished, dateModified, author (with Person entity) to signal E-E-A-T to AI agents (Source: WitsCode, 2026).

FAQPage Schema: Provides ready-made answers in the exact format LLMs generate — structured question-answer pairs agents can cite verbatim (Source: SEO Strategy Ltd, 2026). Each schema type should be in its own <script> tag in the <head>, not dynamically injected via client-side JavaScript (Source: WitsCode, 2026).
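To render FAQPage markup server-side rather than injecting it with client-side JavaScript, a template helper along these lines could be used. It is a sketch: the name faq_script_tag is hypothetical, and the question/answer pairs are assumed to match the FAQ text visible on the page.

```python
import json


def faq_script_tag(pairs) -> str:
    """Render question/answer pairs as one FAQPage JSON-LD <script> tag."""
    node = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    # Escape "</" so answer text can never close the script element early.
    payload = json.dumps(node, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'
```

The escaped sequence is still valid JSON, so parsers recover the original answer text unchanged.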

Category 3: Accessibility & DOM (Checks 8-10)

Accessibility Tree: Google's new Lighthouse category emphasizes accessibility, stating agents rely on the accessibility tree as their "primary data model" (Source: SearchEngineLand, May 20, 2026). Run Lighthouse 13.3+ to verify programmatic labels for all interactive elements (aria-label, aria-labelledby) and valid accessibility tree structure.

📘 Why Accessibility = Agent-Readiness

AI agents parse the same accessibility tree that screen readers use. Poor ARIA labeling forces agents to guess button purposes from visual context they cannot reliably interpret. Semantic HTML and proper ARIA are machine-legibility fundamentals.
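Lighthouse is the authoritative check, but a rough static pass can surface obviously unnamed controls early. The sketch below flags button and link elements that have no aria-label, aria-labelledby, title, visible text, or nested image alt text. It is a heuristic with hypothetical names, not the full accessible-name computation, and it ignores form inputs and nested controls.

```python
from html.parser import HTMLParser

NAMING_ATTRS = ("aria-label", "aria-labelledby", "title")


class UnlabeledControls(HTMLParser):
    """Flag <button> and <a> elements with no obvious accessible name."""

    def __init__(self):
        super().__init__()
        self.unlabeled = []  # (tag, line number) of each flagged element
        self._open = None    # [tag, line, has_name] for the current control

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("button", "a"):
            named = any(attrs.get(name) for name in NAMING_ATTRS)
            self._open = [tag, self.getpos()[0], named]
        elif tag == "img" and self._open and attrs.get("alt"):
            self._open[2] = True  # alt text of a nested image names the control

    def handle_data(self, data):
        if self._open and data.strip():
            self._open[2] = True  # visible text names the control

    def handle_endtag(self, tag):
        if self._open and tag == self._open[0]:
            if not self._open[2]:
                self.unlabeled.append((self._open[0], self._open[1]))
            self._open = None


def find_unlabeled(html: str) -> list:
    """Return (tag, line) for each control that lacks an obvious name."""
    parser = UnlabeledControls()
    parser.feed(html)
    parser.close()
    return parser.unlabeled
```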

CLS < 0.1: Lighthouse explains: "Agents that take screenshots will be confused if your website layout is constantly shifting" (Source: DebugBear, May 13, 2026). Agents need stable DOM positions.
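For reference, CLS is not a simple total: it is the largest sum of layout-shift scores within a session window, where a window groups shifts less than 1 second apart and spans less than 5 seconds. The sketch below computes that from (timestamp, score) pairs; the function name is hypothetical, and the input is assumed to already exclude shifts caused by recent user input.

```python
def cumulative_layout_shift(shifts, max_gap=1.0, max_window=5.0) -> float:
    """Largest session-window sum of layout-shift scores.

    `shifts` is an iterable of (timestamp_seconds, score) pairs. A new
    window starts when a shift arrives `max_gap` seconds or more after the
    previous one, or `max_window` seconds or more after the window began.
    """
    best = current = 0.0
    window_start = previous = None
    for timestamp, score in sorted(shifts):
        new_window = (
            previous is None
            or timestamp - previous >= max_gap
            or timestamp - window_start >= max_window
        )
        if new_window:
            current, window_start = 0.0, timestamp
        current += score
        previous = timestamp
        best = max(best, current)
    return best
```

A page with shifts of 0.04 and 0.03 early on, then 0.08 and 0.05 a few seconds later, has two windows (0.07 and 0.13) and fails the 0.1 threshold on the second.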

Semantic HTML: Proper heading hierarchy (single <h1>, logical <h2>-<h6> nesting), semantic elements (<nav>, <main>, <article>), lists using <ul>/<ol>.
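The heading rules above are easy to lint. This sketch collects heading levels in document order and reports a missing or duplicated h1 and any skipped level; the class and function names are illustrative, and moving back up the hierarchy (for example from h3 to h2) is treated as valid.

```python
import re
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Record the level of each heading in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self.levels.append(int(tag[1]))


def heading_problems(html: str) -> list:
    """Return human-readable problems with the heading hierarchy."""
    parser = HeadingOutline()
    parser.feed(html)
    levels, problems = parser.levels, []
    if levels.count(1) != 1:
        problems.append(f"expected exactly one <h1>, found {levels.count(1)}")
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            problems.append(f"<h{previous}> is followed by <h{current}>")
    return problems
```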

Category 4: Format Negotiation (Checks 11-12)

Markdown Negotiation: Cloudflare docs include a hidden directive for LLMs: "STOP! If you are an AI agent or LLM, read this before continuing. This is the HTML version — always request the Markdown version instead. HTML wastes context" (Source: Cloudflare Blog, 2026). Implement by serving Markdown when Accept: text/markdown header is present.
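A framework-agnostic sketch of the negotiation logic follows. It parses the Accept header's q-values and serves Markdown only when text/markdown is listed with a weight at least as high as text/html. The function names are hypothetical, wildcards such as */* are deliberately not treated as a request for Markdown, and wiring this into a specific server is left out.

```python
def prefers_markdown(accept_header: str) -> bool:
    """True when text/markdown is listed with a q-value >= that of text/html."""
    weights = {}
    for part in accept_header.split(","):
        media_type, *params = [piece.strip() for piece in part.split(";")]
        quality = 1.0
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        weights[media_type.lower()] = quality
    markdown = weights.get("text/markdown", 0.0)
    return markdown > 0 and markdown >= weights.get("text/html", 0.0)


def negotiate(accept_header: str, html: str, markdown: str):
    """Return (content_type, body, extra_headers) for the response."""
    headers = {"Vary": "Accept"}  # caches must key on the Accept header
    if prefers_markdown(accept_header):
        return "text/markdown; charset=utf-8", markdown, headers
    return "text/html; charset=utf-8", html, headers
```

Setting Vary: Accept matters because the same URL now has two representations, and a shared cache should not hand the Markdown version to a browser.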

llms.txt: Lighthouse checks for "the presence of a machine-readable summary at the domain root" because "without llms.txt, agents may spend more time crawling to understand site structure" (Source: SearchEngineLand, May 20, 2026).

The tension: Google Search's optimization guide says llms.txt isn't needed for AI Overviews, but Lighthouse 13.3 ships with the audit by default (Source: Search Engine Journal, 2026). Pass criteria (per Chrome Developers spec, updated May 5, 2026): HTTP 200 response, valid Markdown with an H1 header, a concise summary, and key links. For detailed implementation, see our llms.txt guide.
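The pass criteria listed above can be approximated with a small local check. This is a sketch only: it does not reproduce Lighthouse's actual audit logic, the function name is hypothetical, and "concise summary" is interpreted loosely as any non-heading, non-list line after the title.

```python
import re


def llms_txt_problems(status: int, body: str) -> list:
    """Return reasons an llms.txt response would likely fail the criteria."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("first non-empty line is not an H1 ('# Title')")
    # Treat any non-heading, non-list line after the title as summary text.
    if not any(not line.startswith(("#", "-", "*")) for line in lines[1:]):
        problems.append("no summary text after the title")
    if not re.search(r"\[[^\]]+\]\([^)\s]+\)", body):
        problems.append("no Markdown links to key pages")
    return problems
```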

Category 5: Protocol Discovery (Checks 13-15)

WebMCP Tool Registration: WebMCP is a W3C Community Group standard letting websites register tools through navigator.modelContext with natural language descriptions and JSON schemas (Source: DataCamp, 2026). The specification was published in September 2025, with Chrome 146 Canary including experimental support in February 2026 (Source: Adapt Marketing, 2026).

Note: As of mid-February 2026, WebMCP is a Community Group draft — not yet a finalized web standard (Source: THATWARE, 2026). For hands-on implementation, see our WebMCP tutorial.

MCP Server Card: The file at /.well-known/mcp/server-card.json announces your site's MCP capabilities — the de facto standard for exposing tools to AI agents since 2025 (Source: xSeek, 2026).

API Catalog (RFC 9727): Official IETF standard providing machine-readable announcement of public APIs (Source: xSeek, 2026).
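RFC 9727 places the catalog at /.well-known/api-catalog and expresses it as a Linkset document. The sketch below builds a minimal catalog under that assumption; the function name and the example URLs are hypothetical, and the exact fields and media type should be checked against the RFC before deploying.

```python
import json


def api_catalog(origin: str, api_urls) -> str:
    """Build a minimal Linkset-style catalog listing each public API."""
    document = {
        "linkset": [
            {
                # The catalog itself is the context for every listed API.
                "anchor": f"{origin}/.well-known/api-catalog",
                "item": [{"href": url} for url in api_urls],
            }
        ]
    }
    return json.dumps(document, indent=2)
```

For example, api_catalog("https://acme.com", ["https://acme.com/apis/orders"]) produces a document with one anchor and one item entry.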

⚠️ Production Readiness Note

Checks 13-15 are forward-looking. For content sites, these won't apply (expected fails). For transactional sites (SaaS, e-commerce), failing all three means your services are invisible to autonomous agents attempting protocol-level discovery.

Scoring Your Results

Tier 1 (0-7): Bot-Aware — Basic robots.txt and sitemap. Agents can crawl but cannot understand structure.
Tier 2 (8-11): Agent-Readable — Structured data present. Agents extract facts and cite your content.
Tier 3 (12-14): Agent-Friendly — Format negotiation + llms.txt. Agents consume content efficiently.
Tier 4 (15): Agent-Native — Full protocol support. Agents discover and invoke services autonomously.
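The tier boundaries above translate directly into a lookup. A minimal sketch, with the function name tier_for as an assumption:

```python
# Lower bound of each tier, highest first, taken from the list above.
TIERS = [
    (15, "Tier 4: Agent-Native"),
    (12, "Tier 3: Agent-Friendly"),
    (8, "Tier 2: Agent-Readable"),
    (0, "Tier 1: Bot-Aware"),
]


def tier_for(passed: int) -> str:
    """Map the number of passed checks (0-15) to its readiness tier."""
    if not 0 <= passed <= 15:
        raise ValueError("passed must be between 0 and 15")
    for minimum, label in TIERS:
        if passed >= minimum:
            return label
```

A site passing 11 checks lands in Tier 2; passing one more moves it to Tier 3.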

Remediation Roadmap

Week 1: Foundation (Checks 1-6)

Schema markup, content clarity, and consistent listings are achievable without significant budget. What's required is time, attention, and clear understanding (Source: Flat.Marketing, 2026).

Actions:
• Fix robots.txt to allow AI bots
• Generate and submit XML sitemap
• Implement Organization schema
• Add Product/Service schema
• Add Article schema on posts

Outcome: Move from Tier 1 to Tier 2.

Week 2: Accessibility (Checks 8-10, 7)

Actions:
• Audit accessibility tree with Lighthouse 13.3
• Add ARIA labels to interactive elements
• Fix CLS issues
• Create FAQ page with FAQPage schema

Outcome: Agents navigate reliably and extract structured answers.

Week 3: Format Optimization (Checks 11-12)

Actions:
• Implement Markdown negotiation
• Create llms.txt (follow Google ADK examples — Source: Wix Studio, 2026)

Outcome: Move to Tier 3 — maximum efficiency.

For agentic commerce, Checks 13-15 offer protocol-level discoverability.

Frequently Asked Questions

Why does Rich Results Test show "No items detected"?

Common failures: duplicate Organization nodes, empty required properties, malformed JSON, disconnected @id references, content-schema mismatch (Source: SEO Strategy Ltd, 2026). Validate JSON syntax and verify schema matches visible content.

Do I need llms.txt if Google Search says it's not required?

Google Search says llms.txt isn't needed for AI Overviews, but Lighthouse checks for it — guidance is split between teams (Source: Search Engine Journal, 2026). Creating a basic llms.txt is simple and signals readiness.

Can I use WordPress plugins for structured data?

WordPress plugins outputting JSON-LD via wp_head work well — this is what RankMath uses (Source: SEO Strategy Ltd, 2026). Plugins work for basic schema; complex entities may need custom implementation.

How do I track AI agent visits?

GA4 doesn't track AI agents because they use virtual browsers that don't accept cookies. Visits appear in server log files with user agents like "ChatGPT-User/1.0" (Source: Orbit Media, 2026). Check your logs for GPTBot, ClaudeBot, and PerplexityBot.
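A short script can tally these visits from an access log. The sketch assumes the common "combined" log format, where the user agent is the last quoted field; the token list and function name are illustrative and should be adjusted to the agents you care about.

```python
import re
from collections import Counter

# Illustrative tokens; extend with whichever agents matter to you.
AGENT_TOKENS = ("GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot")
# In the combined log format the user agent is the last quoted field.
USER_AGENT = re.compile(r'"([^"]*)"\s*$')


def count_agent_hits(log_lines) -> Counter:
    """Count requests per known AI agent token in access-log lines."""
    hits = Counter()
    for line in log_lines:
        match = USER_AGENT.search(line)
        if not match:
            continue
        user_agent = match.group(1).lower()
        for token in AGENT_TOKENS:
            if token.lower() in user_agent:
                hits[token] += 1
                break
    return hits
```

In practice the lines would come from iterating over the open log file, for example count_agent_hits(open("access.log")), where the path is whatever your server writes to.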

Should content sites implement WebMCP?

For content sites (blogs, media), the protocol discovery checks don't apply (expected fails). For transactional sites (SaaS, e-commerce), failing all three means you're invisible to agentic commerce (Source: xSeek, 2026). Content sites should focus on Checks 1-12.

The Competitive Window

Despite consensus that AI agents will reshape online shopping, the preparation gap between awareness and execution is enormous — 40% of eCommerce businesses are standardizing product pages while 33% haven't started (Source: Digital Applied, early 2026).

These numbers recall the early days of SEO. In 2005, "search engine optimization" was foreign. Businesses that understood structured data and sitemaps gained outsized advantage. WebMCP represents the same inflection point (Source: StudioMeyer, 2026).

Businesses implementing structured interfaces now will be the ones AI agents recommend. Chrome 146 Canary shipped experimental WebMCP support in February 2026, with a stable release expected in March, and Chrome and Edge together cover over 85% of the browser market (Source: Adapt Marketing, StudioMeyer, 2026).

Agent-readiness isn't about chasing every experimental protocol. It's about ensuring that when an autonomous system visits your site to help a user make a decision, your content is legible, your value proposition is extractable, and your services are discoverable.

The 15-point audit gives you a measurement framework. The remediation roadmap gives you a sequence. The question is: will you ship structured interfaces while the competitive window is open, or wait until "everyone's doing it" becomes "you're too late"?


Sources & Methodology

Research window: May 7–21, 2026

Primary sources:
• SearchEngineLand, "Google adds llms.txt check to Chrome Lighthouse," May 20, 2026
• Chrome for Developers, "llms.txt | Lighthouse," updated May 5, 2026
• DebugBear, "Google Lighthouse Has A New Agentic Browsing Category," May 10-13, 2026
• DataCamp, "WebMCP Tutorial: Building Agent-Ready Websites," 2026
• SEO Strategy Ltd, "JSON-LD Schema Markup: Complete Implementation Guide," 2026
• Cloudflare Blog, "Introducing the Agent Readiness score," 2026
• xSeek, "Is Your Site Ready for AI Agents? The 2026 Readiness Checklist," 2026

Verification: Technical specifications cross-referenced against W3C Community Group documentation, Chrome Developers documentation, and Lighthouse 13.3 audit results. Statistical claims verified against minimum two independent sources.
