How to Implement EntityMap

Use the reference prompt to generate your EntityMap with your preferred LLM. The prompt and its accompanying examples are maintained in the GitHub repo.

An EntityMap for a well-scoped site with 10–30 entities can be drafted in a few hours. Paste a page of your content into your LLM with the prompt and ask it to produce conforming entity objects. The steps below walk through the structure if you prefer to author by hand or want to understand what the prompt produces.

Write the root object

{
  "version": "1.0",
  "schema": "https://entitymap.org/spec/v1.0",
  "publisher": {
    "name": "Your Site Name",
    "url": "https://yoursite.com",
    "sameAs": "https://www.wikidata.org/wiki/Q..."
  },
  "generated": "2026-04-07T00:00:00Z",
  "entities": []
}

The publisher.name value must match the publisher field on every chunk exactly. This is the primary attribution mechanism — treat it as a constant.
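As a rough sketch, the root object can be assembled in code so that the publisher name lives in one constant and generated is stamped at build time. The `build_root` helper and the `PUBLISHER_NAME` constant are hypothetical names introduced for illustration; only the field names and values come from the example above.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Hypothetical constant: define the publisher name once and reuse it everywhere.
PUBLISHER_NAME = "Your Site Name"


def build_root(publisher_url: str, same_as: Optional[str] = None) -> dict:
    """Return a root object shaped like the example above, with no entities yet."""
    publisher = {"name": PUBLISHER_NAME, "url": publisher_url}
    if same_as:
        publisher["sameAs"] = same_as
    return {
        "version": "1.0",
        "schema": "https://entitymap.org/spec/v1.0",
        "publisher": publisher,
        # UTC timestamp in the same format as the example (trailing "Z").
        "generated": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "entities": [],
    }


root = build_root("https://yoursite.com")
print(json.dumps(root, indent=2))
```

Reusing the same constant when you write each chunk's publisher field is one way to keep the two values identical.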

Add entity objects

For each entity, follow this structure:

{
  "entityId": "e_001",
  "@type": "Concept",
  "name": "Your Entity Name",
  "alternateName": "Abbreviation if any",
  "description": "1–3 sentences defining this concept as your site uses it. Be specific to your context.",
  "sameAs": "https://www.wikidata.org/wiki/Q...",
  "relations": [],
  "hasChunks": []
}

Field	Req	Notes
`entityId`	MUST	Stable. Use a simple prefix: e_001, e_002. Never reuse a retired ID.
`@type`	MUST	An EntityMap core type. General concepts use `Concept`; publisher-coined terms use `ProprietaryTerm`. See §4 for the full list.
`name`	MUST	Your site's label — not a generic Wikipedia title.
`description`	MUST	Your definition. SHOULD be extractive from your own content.
`sameAs`	SHOULD	Wikidata URI if one exists. Links your entity to the open knowledge graph.
`alternateName`	MAY	Abbreviation or common variant. Key for disambiguation.
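The MUST rows in the table above can be checked mechanically. The following is a minimal sketch, not the reference validator: `entity_field_errors` is a hypothetical helper, and the duplicate-ID check is an assumption drawn from the requirement that entityId values be stable.

```python
from typing import Dict, List

# MUST fields taken from the table above.
REQUIRED_ENTITY_FIELDS = ("entityId", "@type", "name", "description")


def entity_field_errors(entities: List[dict]) -> List[str]:
    """Report missing MUST fields and duplicate entityId values."""
    errors: List[str] = []
    seen: Dict[str, int] = {}
    for index, entity in enumerate(entities):
        for field in REQUIRED_ENTITY_FIELDS:
            value = entity.get(field)
            if not isinstance(value, str) or not value.strip():
                errors.append(f"entity #{index}: missing or empty '{field}'")
        entity_id = entity.get("entityId")
        if isinstance(entity_id, str):
            # Assumption: an entityId must be unique within one file.
            if entity_id in seen:
                errors.append(
                    f"entity #{index}: entityId '{entity_id}' "
                    f"already used by entity #{seen[entity_id]}"
                )
            else:
                seen[entity_id] = index
    return errors
```

An empty list means every entity carries the four required fields.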

Add chunks

Each entity needs 1–5 evidence chunks. Select your best, most specific passages — not introductory sentences. Each chunk must carry the publisher name.

"hasChunks": [
  {
    "chunkId": "c_001",
    "text": "A 1–5 sentence passage from your content. Max 600 characters. Extractive preferred.",
    "sourceUrl": "https://yoursite.com/page-url",
    "pageTitle": "Title of the source page",
    "publisher": "Your Site Name",
    "retrieved": "2026-03-27T09:00:00Z",
    "relevanceScore": 0.92
  }
]

The publisher field on every chunk MUST exactly match publisher.name in the root object. It is the field that carries your brand attribution through downstream AI aggregation.
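The chunk rules in this step (1–5 chunks per entity, at most 600 characters of text, publisher identical to publisher.name) lend themselves to an automated check. This is a sketch under those stated rules; `chunk_errors` is a hypothetical helper name, not part of the spec.

```python
from typing import List

MAX_CHUNKS_PER_ENTITY = 5  # "Each entity needs 1-5 evidence chunks"
MAX_CHUNK_CHARS = 600      # "Max 600 characters"


def chunk_errors(entity_map: dict) -> List[str]:
    """Check chunk count, text length, and exact publisher match."""
    errors: List[str] = []
    publisher_name = entity_map["publisher"]["name"]
    for entity in entity_map.get("entities", []):
        entity_id = entity.get("entityId", "<no entityId>")
        chunks = entity.get("hasChunks", [])
        if not 1 <= len(chunks) <= MAX_CHUNKS_PER_ENTITY:
            errors.append(f"{entity_id}: has {len(chunks)} chunks, expected 1-5")
        for chunk in chunks:
            chunk_id = chunk.get("chunkId", "<no chunkId>")
            # Exact string comparison: "Acme" does not match "Acme Corp".
            if chunk.get("publisher") != publisher_name:
                errors.append(
                    f"{entity_id}/{chunk_id}: publisher {chunk.get('publisher')!r} "
                    f"does not match {publisher_name!r}"
                )
            if len(chunk.get("text", "")) > MAX_CHUNK_CHARS:
                errors.append(f"{entity_id}/{chunk_id}: text exceeds {MAX_CHUNK_CHARS} characters")
    return errors
```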

Add relations

Relations are optional but valuable. Even a sparse relation graph significantly improves how AI systems traverse your knowledge. Use predicates from the standard vocabulary.

"relations": [
  {
    "predicate": "ENABLES",
    "targetId": "e_004",
    "targetName": "AI Share of Voice"
  },
  {
    "predicate": "INSTANCE_OF",
    "targetUri": "https://www.wikidata.org/wiki/Q1163385",
    "targetName": "Herfindahl-Hirschman Index"
  }
]

For internal targets (entities in your own EntityMap), use targetId. For external concepts, use targetUri pointing to Wikidata or schema.org. targetName is required in both cases.
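The targeting rules above can be sketched as a small check. `relation_errors` is a hypothetical helper, and requiring exactly one of targetId or targetUri per relation is an assumption inferred from the internal/external split described here, not a rule quoted from the spec.

```python
from typing import List


def relation_errors(entity_map: dict) -> List[str]:
    """Check targetName presence and that internal targetId values resolve."""
    errors: List[str] = []
    entities = entity_map.get("entities", [])
    known_ids = {entity.get("entityId") for entity in entities}
    for entity in entities:
        entity_id = entity.get("entityId", "<no entityId>")
        for rel in entity.get("relations", []):
            label = f"{entity_id} {rel.get('predicate', '<no predicate>')}"
            if not rel.get("targetName"):
                errors.append(f"{label}: targetName is required")
            has_id = "targetId" in rel
            has_uri = "targetUri" in rel
            # Assumption: a relation points at exactly one kind of target.
            if has_id == has_uri:
                errors.append(f"{label}: expected exactly one of targetId or targetUri")
            elif has_id and rel["targetId"] not in known_ids:
                errors.append(f"{label}: targetId '{rel['targetId']}' not found in this file")
    return errors
```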

Generate the HTML companion

The entitymap.html file is a crawlable, human-readable rendering of the same data. It MUST NOT be maintained separately from the JSON.

The HTML file must embed per-entity JSON-LD, render relations as internal hyperlinks, and carry data-publisher attributes on every chunk blockquote.
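To illustrate generating the HTML from the JSON rather than maintaining it by hand, here is a rough sketch of rendering one entity. The markup shape (a section per entity, an in-page anchor per entityId) and the `render_entity` name are assumptions made for illustration. The spec's actual JSON-LD shape is not reproduced here, so the sketch serializes the entity object as a stand-in.

```python
import html
import json


def render_entity(entity: dict) -> str:
    """Render one entity as an HTML fragment (illustrative markup only)."""
    esc = html.escape
    # Escape "</" so the JSON cannot close the <script> element early.
    json_ld = json.dumps(entity, ensure_ascii=False).replace("</", "<\\/")
    parts = [
        f'<section id="{esc(entity["entityId"])}">',
        f'<h2>{esc(entity["name"])}</h2>',
        f'<p>{esc(entity["description"])}</p>',
        # Stand-in payload: replace with the JSON-LD shape the spec defines.
        f'<script type="application/ld+json">{json_ld}</script>',
    ]
    for rel in entity.get("relations", []):
        # Internal targets link to an in-page anchor; external ones to their URI.
        href = "#" + rel["targetId"] if "targetId" in rel else rel["targetUri"]
        parts.append(
            f'<p>{esc(rel["predicate"])} '
            f'<a href="{esc(href)}">{esc(rel["targetName"])}</a></p>'
        )
    for chunk in entity.get("hasChunks", []):
        parts.append(
            f'<blockquote data-publisher="{esc(chunk["publisher"])}" '
            f'cite="{esc(chunk["sourceUrl"])}">{esc(chunk["text"])}</blockquote>'
        )
    parts.append("</section>")
    return "\n".join(parts)
```

Running a function like this over every entity in the JSON on each build keeps the HTML a pure derivative of the JSON.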

Option B — Waikay generator

The reference implementation extracts entities, selects evidence chunks, and generates both files automatically. It requires a free account and project at waikay.io/entitymap.

Review and export

Review the extracted entities and evidence. Adjust descriptions, prune low-relevance chunks, and add any missing relations. Export both files, then deploy them and add the discovery hints described below.

Generator output MUST be published with verificationStatus: "generator-draft" unless you have reviewed and approved every entity and relation. The confidence: "declared" designation and the ProprietaryTerm type require explicit human review.

Deploy and discover

Add entitymap.html to sitemap.xml

List entitymap.html in your sitemap with priority: 0.9 and changefreq: weekly. This signals freshness to crawlers and surfaces the file to AI systems that follow sitemaps.

<url>
  <loc>https://yourdomain.com/entitymap.html</loc>
  <priority>0.9</priority>
  <changefreq>weekly</changefreq>
</url>

Link to entitymap.html from your site footer

Add a visible hyperlink to entitymap.html in the footer of your home page — or better, in a sitewide footer so it appears on every page. This is the most reliable discovery mechanism available today, because every AI crawler that follows HTML links will find it without requiring any special standard support.

<footer>
  <a href="https://yourdomain.com/entitymap.html">EntityMap</a>
</footer>

Use entitymap.html as the link target — not the JSON file. The HTML version is designed for crawlers: it renders entity definitions, relations, and attribution as readable text with embedded JSON-LD, so any system that fetches and parses HTML will extract structured, publisher-attributed content.

For the JSON file, use a machine-readable <link> tag in the <head> of every page rather than a visible footer link.

Pre-publish checklist

✓ Valid JSON, parseable without error
✓ version, schema, publisher, generated, entities present at root
✓ Every entity has entityId, @type, name, description, and at least one chunk
✓ publisher field on every chunk matches publisher.name exactly
✓ No entity has more than 5 chunks
✓ All internal targetId values resolve to a valid entityId in this file
✓ All predicates are from the standard vocabulary or a declared custom vocabulary
✓ Tier 3 predicates (IMPROVES, DEGRADES, LEADS_TO, SUITED_FOR, TARGETS, ACHIEVES) have a confidence field
✓ Both files accessible at root without authentication
✓ entitymap.html does not carry a noindex directive
✓ Discovery hints added to robots.txt, HTML <head>, and sitemap.xml
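A few of these checklist items can be automated before publishing. The sketch below covers JSON parsing, the root keys, and the Tier 3 confidence rule. `prepublish_errors` is a hypothetical name, and placing confidence on the relation object itself is an assumption, since the examples in this guide do not show where that field sits.

```python
import json
from typing import List

ROOT_KEYS = ("version", "schema", "publisher", "generated", "entities")
TIER_3_PREDICATES = {
    "IMPROVES", "DEGRADES", "LEADS_TO", "SUITED_FOR", "TARGETS", "ACHIEVES",
}


def prepublish_errors(raw_json: str) -> List[str]:
    """Run a subset of the checklist against the raw file contents."""
    try:
        entity_map = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(entity_map, dict):
        return ["root is not a JSON object"]
    errors = [f"root: missing '{key}'" for key in ROOT_KEYS if key not in entity_map]
    for entity in entity_map.get("entities", []):
        entity_id = entity.get("entityId", "<no entityId>")
        for rel in entity.get("relations", []):
            predicate = rel.get("predicate")
            # Assumption: confidence is a key on the relation object.
            if predicate in TIER_3_PREDICATES and "confidence" not in rel:
                errors.append(f"{entity_id} {predicate}: Tier 3 predicate needs a confidence field")
    return errors
```

The remaining items, such as root accessibility and the noindex directive, depend on your server and are easier to verify against the deployed site.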

Common mistakes

Publisher name mismatch. The most frequent error. "publisher": "Acme Corp" in the root and "publisher": "Acme" on a chunk will fail validation. Copy and paste the value; don't retype it.

Stale generated timestamp. The generated field must update on every rebuild. A file with a timestamp months old signals to consumers that the EntityMap is unmaintained.
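One way to catch a stale timestamp in a build pipeline is to measure the age of the generated value. This is a minimal sketch: the 30-day threshold and both function names are hypothetical, and the parser assumes the timestamp format used in the examples in this guide.

```python
from datetime import datetime, timezone
from typing import Optional

MAX_AGE_DAYS = 30  # Hypothetical threshold; pick one that matches your rebuild cadence.


def generated_age_days(generated: str, now: Optional[datetime] = None) -> float:
    """Return the age in days of a timestamp like '2026-04-07T00:00:00Z'."""
    stamp = datetime.strptime(generated, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (now - stamp).total_seconds() / 86400


def is_stale(generated: str, now: Optional[datetime] = None) -> bool:
    return generated_age_days(generated, now) > MAX_AGE_DAYS
```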

Too many entities, too few chunks. It is better to have 15 well-evidenced entities than 80 with one weak chunk each. Depth signals authority. Breadth without evidence does not.

Generic descriptions. The description field should define the concept as your site uses it — not a Wikipedia summary. "AI Share of Voice is a metric that measures the proportion of AI-generated answers in which a brand appears" is specific. "Share of voice is a marketing metric" is not.

Maintaining JSON and HTML separately. The HTML file must be generated from the JSON. If you edit one manually, they will diverge. Treat the JSON as the source of truth at all times.

Building your own generator? See the spec and JSON schema on GitHub. Conforming implementations will be listed in the implementations registry as they appear.
