Markdown & Front Matter — The Language AI Reads

00 / Why this matters

The format the whole ecosystem converged on.

If you've worked with any modern documentation system — a developer blog built on Hugo, a knowledge base in Obsidian, a project using Claude Code, or an AI agent framework — you've almost certainly encountered the same pattern: a plain-text file with a block of metadata at the top, followed by content written in Markdown.

This isn't a coincidence. Over the past decade, the combination of Markdown (a lightweight formatting syntax) and front matter (structured metadata embedded at the top of the file) has become the de facto standard for documentation across the software ecosystem. GitHub renders Markdown everywhere. Jekyll, Hugo, Docusaurus, and Astro build sites from it. Obsidian stores notes in it. Claude Code's CLAUDE.md is written in Markdown. The Claude Agent SDK's SKILL.md files open with front matter.

⚡ Why Now

The convergence accelerated with AI. Large language models are trained primarily on plain text and Markdown — they understand its structure natively. When you write documentation in Markdown with structured front matter, you're writing in a format that humans can read in any text editor, Git can version-control cleanly, static site generators can render to HTML, and AI systems can parse without any pre-processing. One format, the entire toolchain.

By the end of this guide you'll understand what each part does, why the combination works so well, what front matter fields matter most for AI-readable documentation, and the conventions that make your Markdown actually useful to the tools consuming it.

01 / Markdown

Plain text with just enough structure.

Markdown is a lightweight markup language designed around one idea: a document should be readable as plain text and renderable as rich HTML without changing the source. You've almost certainly read Markdown without knowing it: most GitHub READMEs, countless developer blog posts, and Stack Overflow answers are written in it.

The syntax stays out of your way. Headings use # symbols. Bold uses **double asterisks**. Links are [text](url). Code gets backticks. That's the core of it.

Example — Markdown source vs. what you see

readme.md — source
# Getting Started
This guide covers three things you need to know.
## Installation
Run the following command in your terminal:
```bash
npm install my-tool
```
See the [full docs](https://example.com/docs) for more options.

This renders as a heading, a paragraph, a subheading, a syntax-highlighted code block, and a hyperlink. The source is completely readable even without rendering.

Why plain text wins

The decision to build documentation on plain text rather than binary formats (Word, PDF, rich text) has compounding advantages that become clear when you're working at scale:

✅ Plain text advantages

Version-controlled cleanly — diffs show actual changes
Opens in any editor, on any platform
No proprietary format lock-in
Trivially searchable with standard tools
LLMs trained on it — natively understood
Works offline, no internet required

✗ Binary format problems

Git diffs show only "file changed" — not what
Requires specific software to open
Embedded metadata is opaque to AI
Merge conflicts are destructive
Pre-processing needed before AI can read
Format lock-in, versioning nightmares

→ The Git angle

On Windows and Mac, when you git diff two versions of a Markdown file, you see exactly which words changed. Try that with a .docx file — Git sees it as a binary blob and shows only "Binary files differ." Markdown's plain-text nature is what makes documentation version control actually useful.

02 / Front matter

Metadata at the top of the file.

Front matter is a block of structured metadata placed at the very top of a Markdown file, enclosed between two lines of triple dashes (---). Everything between those dashes is metadata about the document — its title, date, tags, author, skill level, or anything else that helps systems categorise and process it. Everything after the closing --- is the document's actual content.

💡 The Mental Model

Think of front matter as the label on the outside of a manila folder. Before you open the folder to read what's inside, the label tells you the title, the date it was filed, what project it belongs to, and how urgent it is. Front matter does the same thing — it tells any tool consuming the file what it's dealing with before reading the content.

Example — A complete Markdown file with front matter

tutorial.md
---
title: "Git Branches, Commits and PRs"
description: "A complete guide to Git version control"
date: 2026-03-11
author: "Technoobtopia"
skill_level: S1
length: L2
tags:
  - git
  - version-control
  - beginner
platform: "windows, mac"
draft: false
---

# Git Branches, Commits and PRs
Version control is how developers save their work, collaborate without
stepping on each other, and roll back mistakes...

The --- delimiters tell parsers where the metadata ends and the content begins. The metadata is written in YAML — a simple key: value format. The content is standard Markdown.
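To make the role of the delimiters concrete, here is a minimal Python sketch of how a tool might separate the metadata block from the content. The function name `split_front_matter` and the sample document are assumptions for illustration; real tools usually rely on an existing front matter library rather than hand-rolled splitting.

```python
def split_front_matter(text: str) -> tuple[str, str]:
    """Return (metadata_block, body).

    The metadata block is empty if the file does not open with a '---' line
    or if the closing '---' is missing.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    # Look for the closing delimiter after the opening one.
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            metadata = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return metadata, body
    # No closing delimiter: treat the whole file as content.
    return "", text


# Hypothetical sample file, shortened from the tutorial.md example.
doc = """---
title: "Git Branches, Commits and PRs"
draft: false
---
# Git Branches, Commits and PRs
Version control is how developers save their work.
"""

metadata, body = split_front_matter(doc)
print(metadata)
print(body)
```

Everything before the closing delimiter goes to a YAML parser, and everything after it goes to a Markdown renderer or straight into an AI tool's context.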

What is YAML?

The metadata block is written in YAML (pronounced "yam-ul") — a human-readable data format designed to be as close to plain English as possible. The rules are minimal:

🔑

key: value

The basic building block. Key on the left, colon, space, value on the right. title: "My Doc"

📋

Lists

A list of values uses a dash on a new line, indented two spaces. tags: then - item on the next line.

✅

Booleans

true / false without quotes. Used for flags like draft: false or published: true.

📅

Dates

ISO 8601 format without quotes: date: 2026-03-11. Parsers recognise this as a date automatically.
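The four rules above can be sketched as a tiny parser. This is a deliberately simplified illustration, not a YAML implementation: it handles only flat `key: value` pairs, one level of dash lists, unquoted booleans, and unquoted ISO dates, and it ignores comments and nesting. The helper names `parse_scalar` and `parse_simple_front_matter` are hypothetical; in practice you would use a real YAML library.

```python
from datetime import date


def parse_scalar(raw: str):
    """Convert one YAML-style scalar to a Python value (simplified)."""
    value = raw.strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        return value[1:-1]                 # quoted: always a string
    if value in ("true", "false"):
        return value == "true"             # unquoted boolean
    try:
        return date.fromisoformat(value)   # unquoted YYYY-MM-DD
    except ValueError:
        return value                       # anything else stays a string


def parse_simple_front_matter(block: str) -> dict:
    """Parse flat key: value pairs and single-level dash lists."""
    data = {}
    current_list = None
    for line in block.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and current_list is not None:
            current_list.append(parse_scalar(stripped[2:]))
            continue
        key, _, rest = line.partition(":")
        if rest.strip():
            data[key.strip()] = parse_scalar(rest)
            current_list = None
        else:
            # A key with no value starts a list on the following lines.
            current_list = []
            data[key.strip()] = current_list
    return data


block = 'title: "My Doc"\ndate: 2026-03-11\ndraft: false\ntags:\n  - git\n  - version-control\n'
meta = parse_simple_front_matter(block)
print(meta)
```

Note how quoting changes the type: `date: 2026-03-11` becomes a date object, while `date: "2026-03-11"` would stay a plain string.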

⚠ The one gotcha — indentation

YAML uses spaces, not tabs, for indentation. A tab character in the indentation will break your front matter: most parsers reject the block with an error, and some tools then drop the metadata without making the failure obvious. Use two spaces for list items and nested keys. Most code editors (VS Code, etc.) can be set to insert spaces when you press Tab.
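A quick way to catch this before a parser does is to scan the front matter for tabs in the leading whitespace. The sketch below is one possible check; the function name `find_tab_indents` is an assumption for illustration.

```python
def find_tab_indents(front_matter: str) -> list[int]:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(front_matter.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad


# Line 2 is indented with a tab, line 3 with two spaces.
sample = "tags:\n\t- git\n  - version-control\n"
print(find_tab_indents(sample))  # [2]
```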

03 / AI & documentation

Why the AI ecosystem landed here.

The adoption of Markdown + front matter as the standard for AI-adjacent documentation wasn't a committee decision — it emerged from the practical reality of how language models and agentic tools process information.

"An AI that has to parse a Word document, strip formatting, guess what's a heading vs. body text, and extract metadata from prose — is already behind. A Markdown file with front matter gives it structure for free."

— A pattern visible across every major AI tooling framework, 2023–2026

Five reasons this format dominates

1 — LLMs are trained on it

A large share of the public text LLMs learn from is plain text or lightweight markup: GitHub READMEs and issues, developer blogs, Stack Overflow posts. Models pick up Markdown's structure during pre-training, so when one encounters a # heading or a ``` code fence, it recognises the hierarchy without any special handling.

2 — Front matter removes ambiguity

Natural language is ambiguous. "This document is about machine learning for beginners published in March" lives in prose and requires the AI to parse it. Front matter puts that information in a structured, machine-queryable form: topic: machine-learning, skill_level: S1, date: 2026-03. No parsing required — just key lookup.

3 — RAG pipelines use it for filtering

Retrieval-Augmented Generation (RAG) systems retrieve relevant documents before asking an AI to answer a question. Front matter fields become filterable metadata: "find all S1 tutorials tagged git published after 2025." Without structured metadata, this requires the model to read every document to determine relevance.
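The example query can be sketched as a plain metadata filter that runs before any model reads a document. The records below are hypothetical parsed front matter, and `filter_documents` is an illustrative name; a production RAG system would typically push the same conditions into its search index or vector store rather than filter in memory.

```python
from datetime import date

# Hypothetical front matter, already parsed into dictionaries.
documents = [
    {"title": "Git Basics", "skill_level": "S1", "tags": ["git"], "date": date(2026, 3, 11)},
    {"title": "Advanced Rebasing", "skill_level": "S3", "tags": ["git"], "date": date(2025, 6, 2)},
    {"title": "Intro to YAML", "skill_level": "S1", "tags": ["yaml"], "date": date(2025, 9, 20)},
    {"title": "Old Git Primer", "skill_level": "S1", "tags": ["git"], "date": date(2024, 1, 15)},
]


def filter_documents(docs, skill_level, tag, published_after):
    """Keep documents matching the skill level and tag, newer than a date."""
    return [
        d for d in docs
        if d.get("skill_level") == skill_level
        and tag in d.get("tags", [])
        and d.get("date", date.min) > published_after
    ]


# "Find all S1 tutorials tagged git published after 2025."
matches = filter_documents(documents, "S1", "git", date(2025, 12, 31))
print([d["title"] for d in matches])  # ['Git Basics']
```

Only the documents that survive this filter need to be embedded, ranked, or passed to the model.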

4 — Agent frameworks expect it

Claude Code looks for CLAUDE.md in a project directory and reads it as context. The Claude Agent SDK's SKILL.md files use front matter to declare a skill's name and a description that tells the agent when to use it. GitHub Copilot reads .github/copilot-instructions.md. These aren't coincidences: agent tools consistently reach for Markdown, often with front matter, as their configuration and context format.

5 — The whole toolchain speaks it

Markdown + front matter works simultaneously as: a human-readable file in any editor, a version-controlled document in Git, a page in a static site (Jekyll, Hugo, Docusaurus, Astro), a note in Obsidian, and context for an AI tool. No conversion needed between uses.

The ecosystem at a glance

Claude Code · CLAUDE.md
Agent SDK · SKILL.md
GitHub READMEs
GitHub Pages (Jekyll)
Hugo
Docusaurus
Astro
Obsidian
Notion (export)
MkDocs
GitLab Docs
Copilot Instructions

⚡ The CLAUDE.md Example

When you run Claude Code in a project directory, one of the first things it does is look for a CLAUDE.md file. If found, it reads that file and uses it as persistent context for the entire session: your project's conventions, tech stack, and preferences. That file is plain Markdown, and a front matter block (if present) is read along with the rest of the text, so it can still state the project type, relevant tools, and what Claude should and shouldn't do. The format fits this job because it requires zero pre-processing and works directly as AI input.

04 / Front matter fields

The fields that actually matter.

There's no universal standard for front matter fields — different systems define their own. But a set of fields has converged across most documentation systems, and knowing them lets you write front matter that works across multiple tools simultaneously.

Universal fields

These are recognised by most static site generators and documentation tools, and they are the fields AI tooling is most likely to look for:

title

string

The document's display title. Always include this. Used as the page title, document heading, search index entry, and AI context. Should be specific and descriptive.

description

string

A 1–2 sentence summary. Used in SEO meta tags, search result snippets, and — critically — as the field AI tools use to decide whether this document is relevant to a query. Write it for a machine, not just a human.

date

date (YYYY-MM-DD)

Publication or last-updated date. Use ISO 8601 format without quotes: 2026-03-11. Essential for sorting, filtering by recency in RAG systems, and signalling AI tools about the currency of the content.

AI-specific fields

These fields are increasingly common in documentation written primarily for AI consumption or for systems like Technoobtopia that use structured skill/length metadata:

skill_level

enum: S1 | S2 | S3

The intended audience's knowledge level. Allows AI systems to surface the right version of a topic for a given user, and signals the depth and assumed knowledge of the document.

length

enum: L1 | L2

The scope of the tutorial. L1 = single concept, single page. L2 = comprehensive, multi-section. Allows tools to retrieve the right level of depth for a user's context and time constraints.

platform

string

Which platforms the content covers: "windows", "mac", or "windows, mac". Enables platform-targeted retrieval — useful when a user on Windows should be served Windows-specific content.

related

list of strings

Slugs or titles of related documents. Helps AI systems build knowledge graphs and suggest next steps. Mirrors how a human editor would cross-reference content.

prerequisites

list of strings

What the reader should already know or have completed. AI tutoring systems use this to sequence learning paths — don't show an S3 Git tutorial to someone who hasn't covered S1 basics.
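One way a tool could turn `prerequisites` into a learning path is a topological sort, which orders documents so that each one appears after everything it depends on. The sketch below uses Python's standard `graphlib` module; the slugs are hypothetical, and this is an illustration of the idea rather than how any particular tutoring system works.

```python
from graphlib import TopologicalSorter

# Hypothetical slugs mapped to the prerequisites listed in their front matter.
prerequisites = {
    "git-basics": [],
    "git-branches": ["git-basics"],
    "git-rebase": ["git-branches"],
}

# static_order() yields each document after all of its prerequisites.
learning_path = list(TopologicalSorter(prerequisites).static_order())
print(learning_path)  # ['git-basics', 'git-branches', 'git-rebase']
```

If two documents list each other as prerequisites, `static_order()` raises `graphlib.CycleError`, which is a useful validation check in its own right.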

Example — A real-world CLAUDE.md front matter block

CLAUDE.md
---
project: "technoobtopia"
description: "Educational platform for AI-era beginners. Tutorials, quizzes, and docs built to the S1/S2/S3 × L1/L2 spec."
stack:
  - Astro 5
  - HTML/CSS/JS (no framework in tutorial files)
  - GitHub Pages / Netlify
conventions: "See Philosophy_Style_Guide.md"
do_not:
  - Add external CSS frameworks to tutorial files
  - Use jQuery
---

# Project Context for Claude
...

Claude Code reads this file at session start and uses the structured metadata to understand the project before it reads a single line of source code. The description, stack, and do_not fields give it immediate, unambiguous context.

05 / Markdown conventions

Writing Markdown AI can actually use.

Not all Markdown is equally useful to AI tools. A document that renders beautifully in a browser can still be ambiguous or poorly structured from a machine's perspective. These conventions are the difference between Markdown that works and Markdown that works everywhere.

Use one H1 per file — and make it match the title field

Your document should have exactly one # H1 heading, and it should match (or closely reflect) the title in your front matter. AI tools and static site generators use the H1 as the canonical document title when the front matter title is absent. Multiple H1s signal structural ambiguity.
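This rule is easy to check automatically. The sketch below counts H1 headings while skipping fenced code blocks (where a line starting with # is usually a comment, not a heading) and compares the result with the front matter title. The function names are assumptions for illustration, and only #-style headings are considered.

```python
FENCE = "`" * 3  # a code fence marker, built this way to keep the example readable


def find_h1_headings(markdown: str) -> list[str]:
    """Return the text of every H1 heading outside fenced code blocks."""
    headings = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if not in_fence and line.startswith("# "):
            headings.append(line[2:].strip())
    return headings


def check_h1(markdown: str, title: str) -> list[str]:
    """Return a list of problems; an empty list means the rule is satisfied."""
    h1s = find_h1_headings(markdown)
    if len(h1s) != 1:
        return [f"expected exactly one H1, found {len(h1s)}"]
    if h1s[0] != title:
        return [f"H1 {h1s[0]!r} does not match title {title!r}"]
    return []


body = "\n".join([
    "# Getting Started",
    "Intro text.",
    FENCE + "bash",
    "# a shell comment, not a heading",
    FENCE,
])
print(check_h1(body, "Getting Started"))  # []
print(check_h1(body, "Setup"))
```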

Don't skip heading levels

Go # → ## → ### in order. Never jump from # to ###. Heading hierarchy is how AI systems understand document structure and how screen readers navigate. A skipped level breaks both. Think of headings as an outline — you wouldn't jump from chapter to sub-subsection.
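A skipped level can be detected by walking the headings in order and flagging any heading that is more than one level deeper than the one before it. The following is a minimal sketch under that definition; `find_skipped_levels` is a hypothetical name, and fenced code blocks are skipped so that shell comments are not mistaken for headings.

```python
import re

FENCE = "`" * 3
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def find_skipped_levels(markdown: str) -> list[str]:
    """Report headings that jump more than one level below the previous one."""
    problems = []
    previous = 0  # 0 means "no heading seen yet"
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        match = None if in_fence else HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        if level > previous + 1:
            problems.append(f"{match.group(2)!r} jumps from H{previous} to H{level}")
        previous = level
    return problems


sample = "# Guide\n### Details\n## Setup\n### Steps\n"
print(find_skipped_levels(sample))  # ["'Details' jumps from H1 to H3"]
```

Moving back up the hierarchy (from ### to ##) is fine; only downward jumps of more than one level are reported.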

Always specify the language on code fences

A code block opened with ```bash or ```python is far more useful than a bare ```. The language specifier enables syntax highlighting in renderers, signals to AI tools what kind of code this is, and often enables platform-specific handling (e.g., a Windows-aware tool might add a PowerShell note to a bash block).

```bash
# ✅ tells the tool: this is shell script
git commit -m "add front matter"
```

```
# ✗ ambiguous: shell? Python? YAML?
git commit -m "add front matter"
```

Write descriptive link text — never "click here"

[click here](https://docs.example.com) is useless to a screen reader and gives an AI tool no information about what the link points to. [front matter documentation](https://docs.example.com) tells both humans and machines exactly what they'll find. AI systems use link text to build knowledge graphs between documents.
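Vague link text can be flagged with a small linter. The sketch below uses a regular expression that matches inline links but skips images; the `VAGUE` phrase list and the function name are assumptions, and the pattern does not cover reference-style links or URLs containing parentheses.

```python
import re

# [text](url), but not ![alt](src): the lookbehind rejects a leading "!".
LINK = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)]+)\)")

# Hypothetical list of phrases that say nothing about the destination.
VAGUE = {"click here", "here", "link", "this", "read more"}


def find_vague_links(markdown: str) -> list[tuple[str, str]]:
    """Return (text, url) pairs whose link text is on the vague list."""
    return [
        (text, url)
        for text, url in LINK.findall(markdown)
        if text.strip().lower() in VAGUE
    ]


sample = (
    "See [click here](https://docs.example.com) or the "
    "[front matter documentation](https://docs.example.com). "
    "![here](image.png)"
)
print(find_vague_links(sample))  # [('click here', 'https://docs.example.com')]
```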

Use consistent tag formatting across all files

Tags are only useful for filtering if they're consistent. Decide on a convention — lowercase hyphenated (version-control, getting-started) — and apply it everywhere. Mixed conventions (Version Control, versioncontrol, version-control) mean three tags that should be one, and a RAG filter that misses two-thirds of relevant documents.
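A normalisation step can enforce the lowercase hyphenated convention mechanically. The following sketch is one possible implementation; `normalize_tag` is an illustrative name, and the exact character rules are an assumption you would adapt to your own style guide.

```python
import re


def normalize_tag(tag: str) -> str:
    """Lowercase a tag and join its words with single hyphens."""
    tag = tag.strip().lower()
    tag = re.sub(r"[\s_]+", "-", tag)      # spaces and underscores become hyphens
    tag = re.sub(r"[^a-z0-9-]", "", tag)   # drop anything else
    return re.sub(r"-{2,}", "-", tag).strip("-")


print(normalize_tag("Version Control"))   # version-control
print(normalize_tag("version_control"))   # version-control
print(normalize_tag("versioncontrol"))    # versioncontrol
```

The last case shows the limit of automation: once the word boundary is gone, a script cannot restore it, so run-together tags still need a manual fix or an explicit alias table.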

Platform-specific content: use labels, not assumptions

When commands differ between Mac and Windows, label each explicitly rather than assuming a platform. A Markdown block with a clear label (Mac: / Windows:) is unambiguous to both a human reader and an AI tool trying to serve platform-appropriate responses. Many documentation systems also support custom callout syntax (::: note[Windows]) for this purpose.

Keep front matter fields consistent across a collection

If five tutorials use skill_level and one uses difficulty, the odd one out is invisible to any filter or AI query looking for skill_level. Consistency of field names across a collection is what turns a pile of Markdown files into a queryable knowledge base. Define your fields once (in a style guide!) and apply them everywhere.
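An odd field name can be found by counting how often each key appears across the collection and reporting keys that only a minority of files use. The collection below is hypothetical parsed front matter, and `find_rare_fields` with its 50% threshold is an assumption for illustration, not a standard tool.

```python
from collections import Counter

# Hypothetical collection: filename -> parsed front matter.
collection = {
    "git-basics.md": {"title": "Git Basics", "skill_level": "S1"},
    "git-branches.md": {"title": "Git Branches", "skill_level": "S2"},
    "yaml-intro.md": {"title": "Intro to YAML", "difficulty": "S1"},
}


def find_rare_fields(collection: dict, threshold: float = 0.5) -> dict:
    """Map each file to the fields that fewer than `threshold` of files use."""
    counts = Counter(key for meta in collection.values() for key in meta)
    total = len(collection)
    rare = {}
    for filename, meta in collection.items():
        odd = sorted(k for k in meta if counts[k] / total < threshold)
        if odd:
            rare[filename] = odd
    return rare


print(find_rare_fields(collection))  # {'yaml-intro.md': ['difficulty']}
```

A stricter alternative is to validate every file against a fixed list of allowed field names taken from your style guide.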

→ VS Code tip (Windows & Mac)

Install the Prettier extension and the YAML extension in VS Code. With format-on-save enabled, Prettier will auto-format your Markdown each time you save. The YAML extension validates YAML syntax and highlights errors before they cause problems downstream. Both work identically on Windows and Mac.

06 / Cheatsheet

Quick reference.

Everything in one place — copy the front matter template, then refer to the syntax and conventions as needed.

Front matter template

Copy & adapt this
---
title: "Your document title"
description: "1-2 sentences. Write for machines. Be specific."
date: 2026-03-11          # ISO 8601, no quotes
author: "Your name or org"
tags:                     # lowercase, hyphenated
  - topic-one
  - topic-two
skill_level: S1           # S1 | S2 | S3
length: L2                # L1 | L2
platform: "windows, mac"
draft: false              # true while in progress
related:
  - "related-doc-slug"
---

Markdown syntax

# H1
## H2
### H3

Heading hierarchy — never skip levels

**bold** *italic* ~~strike~~

Inline text emphasis

`inline code`

Inline code — command names, filenames, keys

```bash
code here
```

Code block — always specify the language

[link text](https://url.com)

Link — text must describe the destination

![alt text](image.png)

Image — alt text is required for AI + accessibility

- item
1. item

Unordered list / Ordered list

> blockquote text

Pull quote or callout

--- (three dashes on own line)

Horizontal rule / section break

YAML rules

key: value

Basic key-value pair — space after colon is required

key: "string with spaces"

Strings with spaces or special chars need quotes

key: true / false

Booleans — no quotes

date: 2026-03-11

Dates in ISO 8601 — no quotes

list:
- item one
- item two

List — two-space indent, dash + space before each item

Use spaces, NOT tabs

Tabs in indentation break YAML, sometimes without an obvious error. Always use spaces.

💡 The core mental model

Front matter is the label on the folder. Markdown is the contents inside. The triple-dash delimiters are the folder cover itself. Together they create a document that humans can read, Git can track, websites can render, and AI can query — without any conversion step in between. Write front matter for machines. Write Markdown for humans. Both benefit from the same file.

What comes next

MDX — Markdown Meets Components

JSON Schema & Validation

Environment Variables & Secrets
