Overview
Weighted score across 5 categories

Score
39/100
scoring v1.1
The Luxfaber Score measures how cleanly AI agents can read and interpret this page. It is a weighted score across five categories: crawl accessibility, structured data, semantic HTML, content clarity, and determinism. The breakdown below shows where the problems concentrate, so you know what to fix first.
Category · Weight · Score
Crawl Accessibility · 25% · 73
Structured Data · 22% · 0
Semantic HTML · 18% · 0
Content Clarity · 17% · 15
Determinism · 18% · 100
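
The report does not state how the five category scores combine into the overall score. The sketch below assumes a plain weighted mean, rounded to the nearest integer; under that assumption the weights and scores in the breakdown reproduce the reported 39. The function name and dictionary keys are illustrative, not part of Luxfaber.

```python
# Category weights (percent) and scores copied from the breakdown in this report.
CATEGORIES = {
    "crawl_accessibility": (25, 73),
    "structured_data": (22, 0),
    "semantic_html": (18, 0),
    "content_clarity": (17, 15),
    "determinism": (18, 100),
}


def overall_score(categories):
    """Weighted mean of category scores (assumed formula, not confirmed by Luxfaber)."""
    total_weight = sum(weight for weight, _ in categories.values())
    weighted_sum = sum(weight * score for weight, score in categories.values())
    return round(weighted_sum / total_weight)


print(overall_score(CATEGORIES))  # 3880 / 100 = 38.8, which rounds to 39
```

Read this way, the two zero-scored categories (40% of the weight combined) account for most of the gap to 100.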

Gates

All gates passed.
  • no-html
    Primary fetch returned HTTP 200 with non-empty body
  • robots-disallow-all
    At least one AI bot is allowed in robots.txt
  • bot-block
    At least one bot UA fetch succeeded without challenge
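
The robots-disallow-all gate can be approximated locally with Python's standard-library robots.txt parser. This is a minimal sketch, not Luxfaber's implementation: the bot list mirrors the seven user agents named in the crawl findings of this report, and the sample robots.txt is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens for the seven AI crawlers this report checks.
AI_BOTS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "CCBot",
    "Google-Extended", "Applebot-Extended", "Bytespider",
]


def allowed_ai_bots(robots_txt, url="https://expressjs.com/"):
    """Return the AI bots that the given robots.txt text allows to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if parser.can_fetch(bot, url)]


# Illustrative robots.txt: everything blocked except GPTBot.
sample = """\
User-agent: *
Disallow: /

User-agent: GPTBot
Allow: /
"""

allowed = allowed_ai_bots(sample)
print(allowed)        # ['GPTBot']
print(bool(allowed))  # the gate passes when at least one bot is allowed
```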

Quick wins

Highest score-gain-per-minute fixes for this page.

+18.0 pts ~1m html.lang-attr

<html> element is missing a lang attribute

Fix: Add lang="en" (or appropriate language code) to the <html> element

<html lang="en">
+22.0 pts ~2m sd.twitter-card.present

No twitter:card meta tag found

Fix: Add <meta name="twitter:card" content="summary_large_image">

<meta name="twitter:card" content="summary_large_image">
+25.0 pts ~3m crawl.canonical.matches

Cannot check canonical match — no canonical tag present

Fix: Add <link rel="canonical" href="..."> matching the final URL

<link rel="canonical" href="https://expressjs.com">
+25.0 pts ~3m crawl.canonical.present

No <link rel="canonical"> found in <head>

Fix: Add <link rel="canonical" href="..."> to the <head>

<link rel="canonical" href="https://expressjs.com">
+22.0 pts ~5m sd.title-meta.quality

Issues: title missing; meta description missing

Fix: Title should be 10–60 chars; meta description should be 50–160 chars

<title>Your page title</title>
<meta name="description" content="One-sentence summary of what this page is — 50–160 chars, written for both humans and LLM agents.">

All findings

Crawl Accessibility (22)
FAIL crawl.ai-plugin-manifest.present ~60m score: 0

/.well-known/ai-plugin.json not found

Fix: Publish /.well-known/ai-plugin.json to declare your ChatGPT/OpenAI plugin metadata

Why: /.well-known/ai-plugin.json was defined by OpenAI for ChatGPT plugins. OpenAI has since retired the plugin program, but some agentic systems may still look for this file to auto-discover capabilities, so treat it as a legacy discovery hint rather than a requirement.

Copy-paste fix (save as /.well-known/ai-plugin.json; JSON does not allow comments, so keep the path out of the file body)
{
  "schema_version": "v1",
  "name_for_human": "Your Product",
  "name_for_model": "your_product",
  "description_for_model": "API for Your Product. Use to answer questions about this product.",
  "api": {
    "type": "openapi",
    "url": "https://expressjs.com/openapi.json"
  }
}
INFO crawl.auth-wall ~0m score: 100

No auth-wall signals detected

Why: Auth-walled pages require login before crawlers can read content. Flagged for awareness — most AI agents can't authenticate and will see only the login form.

PASS crawl.bot-block ~60m score: 100

No bot-blocking signals detected

Why: WAF challenges that don't recognize legitimate AI bots silently lock you out of LLM training/inference.

FAIL crawl.canonical.matches ~3m score: 0

Cannot check canonical match — no canonical tag present

Fix: Add <link rel="canonical" href="..."> matching the final URL

Why: Mismatched canonical sends agents to a different URL than the one they fetched.

Copy-paste fix
<link rel="canonical" href="https://expressjs.com">
FAIL crawl.canonical.present ~3m score: 0

No <link rel="canonical"> found in <head>

Fix: Add <link rel="canonical" href="..."> to the <head>

Why: LLM crawlers use canonical to dedupe and pick the authoritative URL.

Copy-paste fix
<link rel="canonical" href="https://expressjs.com">
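
Both canonical rules can be checked with the standard-library HTML parser. The sketch below is an assumption about how such a check might work, not Luxfaber's code; in particular, the normalization (lowercase scheme and host, ignore a trailing slash) is a guess at what "matching the final URL" means.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def normalize(url):
    # Assumed normalization: case-insensitive scheme/host, trailing slash ignored.
    parts = urlsplit(url)
    return (parts.scheme.lower(), parts.netloc.lower(), parts.path.rstrip("/"))


def check_canonical(html, final_url):
    """Return (present, matches) for the canonical link in html."""
    finder = CanonicalFinder()
    finder.feed(html)
    present = finder.canonical is not None
    matches = present and normalize(finder.canonical) == normalize(final_url)
    return present, matches


page = '<head><link rel="canonical" href="https://expressjs.com"></head>'
print(check_canonical(page, "https://expressjs.com/"))  # (True, True)
```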
INFO crawl.hreflang-alternates ~30m score: 100

No hreflang alternate links found

Why: hreflang tells agents which language/region variant is canonical for each locale, reducing duplicate-content signals across international site variations.

PASS crawl.http-status ~30m score: 100

Primary fetch returned HTTP 200

Why: Non-2xx responses tell agents the page doesn't exist or is broken — score capped at 0.

PASS crawl.js-divergence ~120m score: 100

Static and browser bodies within 0% — no significant JS divergence

Why: When static HTML and browser-rendered HTML diverge heavily, AI crawlers that can't execute JS only see the shell — structured data, headings, and content are invisible.

PASS crawl.llms-txt.present ~30m score: 100

llms.txt is present

Why: /llms.txt is the emerging standard for site-level guidance to LLM agents.

Copy-paste fix (save as https://expressjs.com/llms.txt; format described at https://llmstxt.org)

# Your page title

> One-paragraph site summary aimed at an LLM agent.

## Docs
- [Getting Started](https://expressjs.com/docs)

## Optional
- [Changelog](https://expressjs.com/changelog)
FAIL crawl.mcp-manifest.present ~60m score: 0

/.well-known/mcp.json not found

Fix: Publish /.well-known/mcp.json to declare MCP tool endpoints for AI agent discovery

Why: /.well-known/mcp.json is a proposed discovery endpoint for MCP (Model Context Protocol) tool servers and is not yet part of a finalized standard. AI agents that support this convention can look here before prompting a user to configure a server manually.

Copy-paste fix (save as /.well-known/mcp.json; JSON does not allow comments, so keep the path out of the file body)
{
  "version": "1.0",
  "name": "Your Product",
  "tools": [
    {
      "name": "search",
      "url": "https://expressjs.com/mcp/search"
    }
  ]
}
PASS crawl.redirect-chain ~15m score: 100

Redirect chain length: 0

Why: Long redirect chains burn crawl budget and can lose query/state across hops.

PASS crawl.robots.bot-allow.applebot-extended ~5m score: 100

applebot-extended is allowed in robots.txt

Why: Explicitly allowing AI crawlers in robots.txt removes the ambiguity that defaults to 'block' in some agent stacks.

Copy-paste fix
# in robots.txt
User-agent: Applebot-Extended
Allow: /
PASS crawl.robots.bot-allow.bytespider ~5m score: 100

bytespider is allowed in robots.txt

Why: Explicitly allowing AI crawlers in robots.txt removes the ambiguity that defaults to 'block' in some agent stacks.

Copy-paste fix
# in robots.txt
User-agent: Bytespider
Allow: /
PASS crawl.robots.bot-allow.ccbot ~5m score: 100

ccbot is allowed in robots.txt

Why: Explicitly allowing AI crawlers in robots.txt removes the ambiguity that defaults to 'block' in some agent stacks.

Copy-paste fix
# in robots.txt
User-agent: CCBot
Allow: /
PASS crawl.robots.bot-allow.claudebot ~5m score: 100

claudebot is allowed in robots.txt

Why: Explicitly allowing AI crawlers in robots.txt removes the ambiguity that defaults to 'block' in some agent stacks.

Copy-paste fix
# in robots.txt
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /
PASS crawl.robots.bot-allow.google-extended ~5m score: 100

google-extended is allowed in robots.txt

Why: Explicitly allowing AI crawlers in robots.txt removes the ambiguity that defaults to 'block' in some agent stacks.

Copy-paste fix
# in robots.txt
User-agent: Google-Extended
Allow: /
PASS crawl.robots.bot-allow.gptbot ~5m score: 100

gptbot is allowed in robots.txt

Why: Explicitly allowing AI crawlers in robots.txt removes the ambiguity that defaults to 'block' in some agent stacks.

Copy-paste fix
# in robots.txt
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /
PASS crawl.robots.bot-allow.perplexitybot ~5m score: 100

perplexitybot is allowed in robots.txt

Why: Explicitly allowing AI crawlers in robots.txt removes the ambiguity that defaults to 'block' in some agent stacks.

Copy-paste fix
# in robots.txt
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /
PASS crawl.robots.present ~5m score: 100

robots.txt is present

Why: robots.txt is the first file every crawler checks.

Copy-paste fix
# https://expressjs.com/robots.txt
User-agent: *
Allow: /

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

Sitemap: https://expressjs.com/sitemap.xml
WARN crawl.sitemap.discoverable ~20m score: 50

Sitemap referenced in robots but file returned 4xx — deploy it

Fix: Deploy a valid sitemap.xml at the referenced URL

Why: Without a sitemap, multi-page agents can't enumerate your site reliably.

Copy-paste fix
# robots.txt already references a sitemap; the URL on its Sitemap line must return 200, for example:
Sitemap: https://expressjs.com/sitemap.xml
FAIL crawl.sitemap.parseable ~30m score: 0

Sitemap has 0 parseable URLs or is absent

Fix: Ensure sitemap.xml is valid XML with at least one <loc>

Why: Empty/invalid sitemaps look the same to a crawler as none at all.

Copy-paste fix
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://expressjs.com/</loc></url>
  <!-- add one <url><loc>…</loc></url> per public page -->
</urlset>
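
A sitemap can be tested for parseable URLs with the standard-library XML parser. This is a minimal sketch under the assumption that "parseable" means well-formed XML containing at least one non-empty <loc> in the sitemaps.org namespace; the function name is illustrative.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return every non-empty <loc> value, or [] if the XML is invalid."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return []
    return [
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text and loc.text.strip()
    ]


sample_sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://expressjs.com/</loc></url>
</urlset>"""

print(sitemap_urls(sample_sitemap))  # ['https://expressjs.com/']
print(sitemap_urls(""))              # [] (an empty body is not valid XML)
```

An HTML error page served at the sitemap URL also yields an empty list, which is why a 4xx sitemap and an absent one look the same to a crawler.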
INFO crawl.x-robots-header ~10m score: 100

No X-Robots-Tag or robots meta tag detected

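Why: X-Robots-Tag: noindex tells crawlers (and AI agents) to skip the page, and none is shorthand for noindex, nofollow. noai and noimageai are non-standard directives that some sites use to opt out of AI training. If any of these appear, verify that they are intentional.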
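
As a rough illustration of this check, an X-Robots-Tag header value can be split into directives and compared against known blocking tokens. The sketch is simplified: it ignores the optional user-agent prefix (for example "googlebot: noindex") and the grouping of directives into sets is an assumption for this example, not Luxfaber's rule definition.

```python
# Directives that keep a page out of indexes (none = noindex, nofollow).
INDEX_BLOCKING = {"noindex", "none"}
# Non-standard directives some sites use to opt out of AI training.
AI_OPT_OUT = {"noai", "noimageai"}


def robots_directives(header_value):
    """Split a comma-separated X-Robots-Tag value into lowercase directives."""
    return {part.strip().lower() for part in header_value.split(",") if part.strip()}


def classify(header_value):
    directives = robots_directives(header_value)
    return {
        "blocks_indexing": bool(directives & INDEX_BLOCKING),
        "opts_out_of_ai": bool(directives & AI_OPT_OUT),
    }


print(classify("noindex, noai"))
# {'blocks_indexing': True, 'opts_out_of_ai': True}
print(classify(""))
# {'blocks_indexing': False, 'opts_out_of_ai': False}
```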

Structured Data (6)
FAIL sd.jsonld.count ~15m score: 0

No JSON-LD blocks found

Fix: Add at least one <script type="application/ld+json"> block

Why: JSON-LD is how LLMs reliably extract entity facts (org, products, articles) without parsing prose.

Copy-paste fix
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your page title",
  "url": "https://expressjs.com"
}
</script>
FAIL sd.jsonld.entity-types ~20m score: 0

No valid JSON-LD blocks to check for entity types

Fix: Add Organization, WebSite, or Article/Product/Service JSON-LD

Why: Generic JSON-LD without recognised @type doesn't disambiguate the entity for agents.

Copy-paste fix
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "name": "Your page title",
  "url": "https://expressjs.com"
}
</script>
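
After adding a JSON-LD block, it helps to confirm that it parses and exposes an @type. The following sketch uses only the standard library; the class and function names are hypothetical, and it covers only top-level objects and arrays (it does not walk @graph).

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def jsonld_types(html):
    """Return the @type of each valid top-level JSON-LD object in html."""
    collector = JsonLdCollector()
    collector.feed(html)
    types = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid blocks are skipped, as parsers typically do
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and "@type" in item:
                types.append(item["@type"])
    return types


page = """<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "WebSite",
 "name": "Your page title", "url": "https://expressjs.com"}
</script>"""
print(jsonld_types(page))  # ['WebSite']
```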
INFO sd.jsonld.valid ~10m score: 100

No JSON-LD blocks to validate (skipped)

Why: Invalid JSON-LD silently breaks structured-data parsers — they just skip the block.

FAIL sd.opengraph.complete ~10m score: 0

Open Graph missing: title, description, type, image, url

Fix: Add missing og: meta tags: og:title, og:description, og:type, og:image, og:url

Why: Open Graph drives link previews everywhere (Slack, iMessage, agent UIs that render share cards).

Copy-paste fix
<meta property="og:title" content="Your page title">
<meta property="og:description" content="One-sentence summary of what this page is — 50–160 chars, written for both humans and LLM agents.">
<meta property="og:type" content="website">
<meta property="og:url" content="https://expressjs.com">
<meta property="og:image" content="https://expressjs.com/og-image.png">
FAIL sd.title-meta.quality ~5m score: 0

Issues: title missing; meta description missing

Fix: Title should be 10–60 chars; meta description should be 50–160 chars

Why: <title> + meta description are the snippet agents quote when summarising or citing your page.

Copy-paste fix
<title>Your page title</title>
<meta name="description" content="One-sentence summary of what this page is — 50–160 chars, written for both humans and LLM agents.">
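
The length limits in this rule are easy to check before deploying. A small sketch, using the 10–60 and 50–160 character ranges stated in the fix; the function name and message wording are illustrative.

```python
def title_meta_issues(title, description):
    """Return a list of issues for a page title and meta description."""
    issues = []
    if not title:
        issues.append("title missing")
    elif not 10 <= len(title) <= 60:
        issues.append(f"title is {len(title)} chars, expected 10-60")
    if not description:
        issues.append("meta description missing")
    elif not 50 <= len(description) <= 160:
        issues.append(f"meta description is {len(description)} chars, expected 50-160")
    return issues


# The state this report found: neither tag is present.
print(title_meta_issues(None, None))
# ['title missing', 'meta description missing']
```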
FAIL sd.twitter-card.present ~2m score: 0

No twitter:card meta tag found

Fix: Add <meta name="twitter:card" content="summary_large_image">

Why: twitter:card extends Open Graph with X/Twitter-specific layout hints.

Copy-paste fix
<meta name="twitter:card" content="summary_large_image">
Semantic HTML (5)
FAIL html.heading-hierarchy ~30m score: 0

No heading elements found

Fix: Add an <h1> and a logical heading hierarchy

Why: Headings give agents the document outline. No h1 = no clear topic.

Copy-paste fix
<!-- Document outline -->
<h1>Page topic in one sentence</h1>
<section>
  <h2>Section heading</h2>
  <h3>Sub-section</h3>
</section>
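
What counts as a "logical heading hierarchy" is not defined in this report. The sketch below assumes a common interpretation: exactly one <h1>, and no level skipped when going deeper (for example an <h2> followed directly by an <h4>). The regex-based extraction is a simplification that ignores comments and scripts.

```python
import re


def heading_levels(html):
    """Return heading levels in document order, e.g. [1, 2, 3]."""
    return [int(level) for level in re.findall(r"<h([1-6])\b", html, flags=re.I)]


def heading_issues(levels):
    """Check the assumed rules: one h1, and no skipped levels going deeper."""
    if not levels:
        return ["no headings"]
    issues = []
    if levels.count(1) != 1:
        issues.append(f"expected exactly one h1, found {levels.count(1)}")
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            issues.append(f"h{prev} followed by h{cur}")
    return issues


outline = "<h1>Topic</h1><section><h2>Section</h2><h3>Sub-section</h3></section>"
print(heading_levels(outline))                  # [1, 2, 3]
print(heading_issues(heading_levels(outline)))  # []
```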
INFO html.img-alt-coverage ~30m score: 100

no <img> elements found, rule skipped

Why: Alt text is how agents (and screen readers) caption images. Missing alt = invisible content.

Copy-paste fix
<img src="/diagram.png" alt="Concise description of what the diagram shows">
FAIL html.landmarks ~30m score: 0

No semantic landmark elements found

Fix: Add missing landmark elements: main, nav, header, footer, article

Why: Landmark elements let agents skip nav/footer chrome and zero in on the main content.

Copy-paste fix
<header>
  <nav>…site nav…</nav>
</header>
<main>
  <article>…page content…</article>
</main>
<footer>…site footer…</footer>
FAIL html.lang-attr ~1m score: 0

<html> element is missing a lang attribute

Fix: Add lang="en" (or appropriate language code) to the <html> element

Why: Without lang, agents can't decide whether to translate or which language model to apply.

Copy-paste fix
<html lang="en">
INFO html.link-text-quality ~20m score: 100

no <a> elements found, rule skipped

Why: "click here" / "learn more" tells agents nothing about the destination. Use descriptive link text.

Content Clarity (6)
INFO content.action-surface ~0m score: 100

Action surface: 0 entry points — 0 form(s), 0 button(s), 0 mailto, 0 tel, 0 ARIA button(s)

Why: Forms, CTA buttons, and contact links are conversion entry points. Agent-driven buyers (using AI to evaluate vendors) look for these signals when deciding whether a site is a real business.

FAIL content.boilerplate-ratio ~90m score: 0

Readability extraction returned null — cannot compute boilerplate ratio

Fix: Ensure page has extractable body content

Why: Lots of nav/footer chrome relative to page body dilutes the signal in agent excerpts.

FAIL content.readability-extract ~60m score: 0

Readability extraction returned null — page may be empty or unparseable

Fix: Ensure the page has substantive body text (≥250 chars) that readability can extract

Why: If readability extraction returns null, agents likely can't isolate the article body.

INFO content.readability-signal ~0m score: 100

Mozilla Readability returned null — page is not extractable as an article

Why: Mozilla Readability is the same library Firefox Reader View uses; if it can't extract your content, neither can most third-party article scrapers.

PASS content.script-density ~60m score: 100

1 script/iframe element (1 <script>, 0 cross-origin <iframe>)

Why: Heavy client-side script density is a red flag that content depends on JS to render.

FAIL content.signal-noise ~90m score: 0

Signal-noise ratio: 0.0% (0 text bytes / 561 total bytes)

Fix: Reduce script/style bloat or increase substantive text content

Why: High script/style-to-text ratio buries the actual content agents are trying to extract.
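
The exact definition of "text bytes" is not given in this report. One plausible reading, sketched below, is the UTF-8 size of visible text (everything outside script, style, noscript and template elements, with whitespace collapsed) divided by the total size of the HTML. Treat the class and function as hypothetical.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text that is not inside script/style/noscript/template."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def signal_noise_ratio(html):
    """Visible-text bytes divided by total bytes (assumed definition)."""
    total = len(html.encode("utf-8"))
    if total == 0:
        return 0.0
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())  # collapse whitespace
    return len(text.encode("utf-8")) / total


shell = "<html><head><script>var a = 1;</script></head><body></body></html>"
print(signal_noise_ratio(shell))  # 0.0, a script-only shell has no visible text
```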

Determinism (3)
PASS det.bot-cloaking ~60m score: 100

gptbot and browser fetches returned identical content

Why: Different content to GPTBot vs a browser is the textbook cloaking pattern. Search engines flag this.

PASS det.fetch-stability ~60m score: 100

Two luxfaber fetches returned identical body hash

Why: If the same URL serves different bytes on every fetch, no agent can cache/cite reliably.
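
Fetch stability reduces to comparing a digest of two response bodies. The report does not say which hash Luxfaber uses; SHA-256 is an assumption in this sketch, and the byte strings stand in for two real fetches of the same URL.

```python
import hashlib


def body_hash(body: bytes) -> str:
    """Hex digest of a response body (SHA-256 is an assumed choice)."""
    return hashlib.sha256(body).hexdigest()


def is_stable(first: bytes, second: bytes) -> bool:
    """True when two fetches of the same URL returned identical bytes."""
    return body_hash(first) == body_hash(second)


print(is_stable(b"<html></html>", b"<html></html>"))              # True
print(is_stable(b"<html></html>", b"<html><!-- t=2 --></html>"))  # False
```

Per-request tokens, timestamps, or rotating ad markup in the HTML are typical reasons the second case occurs.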

PASS det.ua-cloaking ~60m score: 100

luxfaber and browser fetches returned identical content

Why: Serving different content to a Luxfaber UA vs a browser UA = SEO/agent cloaking risk.

Metadata

Luxfaber version: 0.2.0
Ruleset: v1.1
Node: v20.19.6
Primary UA: Luxfaber/0.1 (+https://github.com/Stelnyx/LuxFaber)
Fetch timeout: 15000ms
Schema version: 1.0.0
Upgrade · Standard / Enterprise

Want the full picture?

This recon scan covers a single page. Higher tiers add:

  • Standard: multi-page site crawl + per-page health table + priority plan (P1–P5) + 3-week roadmap
  • Enterprise: standard + branded cover (logo, client, accent color) + 1-hour walkthrough with a senior engineer · $499 flat
