You know the syntax. Now learn why it behaves the way it does — including the type-coercion quirks, multiline string variants, and production patterns that catch every intermediate developer eventually.
YAML has three building blocks — scalars (single values), sequences (lists), and mappings (key-value pairs) — and supports two presentation styles (block and flow). The parser infers types from unquoted values, which means NO, yes, 2024-01-15, and 042 can all surprise you unless you understand the resolution rules.
The big gotchas: YAML 1.1 treats yes/no/on/off as booleans (the Norway Problem). Dates auto-parse. Tabs are illegal for indentation. Duplicate keys violate the spec, yet parsers disagree on whether to reject them or keep the last value. Multiline strings come in two forms: literal | (preserves newlines) and folded > (collapses single newlines to spaces).
In production: Astro uses js-yaml (YAML 1.2) with optional Zod schema validation. Jekyll uses Ruby's Psych (YAML 1.1). Hugo defaults to TOML. When portability and explicitness matter more than terseness, reach for TOML.
Front matter isn't just Markdown decoration. It's the data layer that drives every modern static site generator, content pipeline, and AI workflow. Astro, Hugo, Jekyll, and Obsidian all parse it. GitHub Actions workflows, Docker Compose files, and Kubernetes manifests share its grammar. Understanding YAML at the parser level — not just the syntax level — means you stop guessing and start reasoning.
The edge cases in this guide aren't rare. They're the specific situations that show up when your content pipeline moves to production, when a country code becomes false, when a version number silently drops its leading zero. You've probably already hit at least one.
YAML is a serialization language with three fundamental node types and two presentation styles. Once you have this mental model, every quirky behavior has a logical explanation.
Scalar: a string, number, boolean, null, or date. Everything that isn't a container is a scalar.
Sequence: a list of nodes. Each item can itself be a scalar, sequence, or mapping.
Mapping: an unordered set of key-value pairs. Keys are almost always scalars. Values can be anything.
Every YAML node can be written in one of two styles. Block style uses indentation and newlines — it's what you write in front matter. Flow style uses inline JSON-like syntax. Both are legal YAML; they represent the same data.
Block style:

    ---
    title: My Post
    tags:
      - astro
      - yaml
    author:
      name: Alice
      role: editor
    ---

Flow style:

    ---
    title: My Post
    tags: [astro, yaml]
    author: {name: Alice, role: editor}
    ---

YAML is a superset of JSON. Any valid JSON document is also valid YAML. Flow style is essentially JSON without the mandatory quoting. This is why your front matter parser can usually handle both.
When the parser encounters an unquoted scalar, it runs it through a type resolution sequence. The first type whose pattern matches wins:

    # The parser tries each of these in sequence:
    null       ← matches: null, ~, or empty value
    boolean    ← matches: true, false (YAML 1.2)
                 also: yes, no, on, off (YAML 1.1 only)
    timestamp  ← matches: 2024-01-15, 2024-01-15T10:00:00Z
    integer    ← matches: 42, 0xFF, 0o77
    float      ← matches: 3.14, .inf, -.inf, .nan
    string     ← everything else falls through to string

This is exactly why quoting matters. A value that looks like a string to you might pattern-match a type higher in this list. The parser doesn't read your mind — it reads the value.
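The first-match-wins behaviour can be sketched in a few lines of TypeScript. This is an illustration only: the function name `resolvePlainScalar` and the regular expressions are assumptions made for this example, they are deliberately simplified, and they are not taken from js-yaml, Psych, or any other real parser.

```typescript
// Sketch of implicit type resolution for an unquoted (plain) scalar.
// Patterns are simplified illustrations, not a complete YAML schema.
type YamlVersion = "1.1" | "1.2";
type ResolvedType = "null" | "boolean" | "timestamp" | "integer" | "float" | "string";

function resolvePlainScalar(text: string, version: YamlVersion): ResolvedType {
  // Rules are tried top to bottom; the first pattern that matches wins.
  const rules: Array<[ResolvedType, RegExp]> = [
    ["null", /^(null|Null|NULL|~|)$/],
    ["boolean", version === "1.1"
      ? /^(y|Y|yes|Yes|YES|n|N|no|No|NO|true|True|TRUE|false|False|FALSE|on|On|ON|off|Off|OFF)$/
      : /^(true|True|TRUE|false|False|FALSE)$/],
    ["timestamp", /^\d{4}-\d{2}-\d{2}([Tt ][\d:.]+(Z|[-+]\d{2}(:\d{2})?)?)?$/],
    ["integer", version === "1.1"
      ? /^[-+]?(0x[0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*)$/
      : /^([-+]?[0-9]+|0x[0-9a-fA-F]+|0o[0-7]+)$/],
    ["float", /^([-+]?(\.[0-9]+|[0-9]+\.[0-9]*)([eE][-+]?[0-9]+)?|[-+]?\.(inf|Inf|INF)|\.(nan|NaN|NAN))$/],
  ];
  for (const [type, pattern] of rules) {
    if (pattern.test(text)) return type;
  }
  return "string"; // nothing matched: the value falls through to string
}
```

Calling `resolvePlainScalar("NO", "1.1")` yields `"boolean"`, while the same input under `"1.2"` yields `"string"`, which is the Norway Problem in miniature.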
A front matter parser first strips the --- delimiters, then hands the content to a YAML parser. The YAML parser has no knowledge of Markdown. It sees a standalone document. This means the indentation-sensitive rules, type resolution, and all YAML restrictions apply in full — including the tab-indent prohibition.
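The delimiter-stripping step is simple enough to sketch. The helper below, `splitFrontMatter`, is a hypothetical name for this example and not the API of any particular tool; it assumes the opening `---` is the very first line and that the first later line consisting of `---` closes the block.

```typescript
interface FrontMatterSplit {
  yaml: string; // raw YAML text between the delimiters
  body: string; // everything after the closing delimiter
}

// Split a Markdown source into its YAML front matter and its body.
// Returns null when the file has no complete front matter block.
function splitFrontMatter(source: string): FrontMatterSplit | null {
  const lines = source.split(/\r?\n/);
  const isDelimiter = (line: string): boolean => /^---\s*$/.test(line);
  if (lines.length === 0 || !isDelimiter(lines[0])) return null;
  for (let i = 1; i < lines.length; i++) {
    if (isDelimiter(lines[i])) {
      return {
        yaml: lines.slice(1, i).join("\n"),
        body: lines.slice(i + 1).join("\n"),
      };
    }
  }
  return null; // opening delimiter without a closing one
}
```

The `yaml` string is what the YAML parser receives. Nothing about the surrounding Markdown reaches it, which is why every YAML rule applies unchanged.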
The basic types — string, number, boolean, null — all behave intuitively most of the time. Then they don't. Here's exactly where the surprises live, and why they exist.
Plain (unquoted): safe for simple prose. Dangerous when the value could match a type higher in the resolution list, or contains special characters (:, #, [, {).
Single-quoted: forces string type. No escape sequences. To include a literal single quote, double it: ''. Use when you don't need escape sequences and want guaranteed string parsing.
Double-quoted: forces string type and supports escape sequences: \n, \t, \\, \", \uXXXX for Unicode. The most powerful option.
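If you generate front matter from code, the two quoted styles map onto two small escaping routines. The helper names `toSingleQuoted` and `toDoubleQuoted` are made up for this sketch, and it covers only the escapes mentioned above; it assumes single-line input for the single-quoted form.

```typescript
// Single-quoted style: the only escape is a doubled single quote.
// Intended for single-line values.
function toSingleQuoted(value: string): string {
  return "'" + value.replace(/'/g, "''") + "'";
}

// Double-quoted style: backslash escapes. Backslashes are escaped first
// so that the escapes added afterwards are not escaped again.
function toDoubleQuoted(value: string): string {
  const escaped = value
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\t/g, "\\t");
  return '"' + escaped + '"';
}
```

For a value such as `it's`, the single-quoted form becomes `'it''s'`. Values containing line breaks or tabs are better served by the double-quoted form.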
This is one of the most notorious YAML gotchas. In YAML 1.1, the following values all parse as booleans: yes, no, true, false, on, off, y, n, and their capitalised variants. The ISO 3166-1 country code for Norway is NO. A config file with country: NO silently becomes country: false.

    # YAML 1.1: used by Ruby/Psych (Jekyll), PyYAML
    country: NO      # parses as false (!!)
    enabled: yes     # parses as true
    mode: on         # parses as true
    debug: off       # parses as false

    # YAML 1.2: used by js-yaml (Astro), Go's yaml.v3
    country: NO      # parses as string "NO" ✓
    enabled: yes     # parses as string "yes" (not boolean!)
    mode: true       # parses as true ✓

    # Safe: always quote if the value could be ambiguous
    country: 'NO'    # string in any YAML version

Jekyll uses Ruby's Psych, which implements YAML 1.1. Astro uses js-yaml (v4+), which implements YAML 1.2. The same front matter file can produce different values depending on which tool reads it. When content travels between tools — export from Jekyll, import to Astro — this bites.
When a description or body text needs line breaks, YAML offers two distinct block scalar styles with different semantics. Choosing the wrong one is a common source of subtle rendering bugs.
Literal style (|):

    description: |
      This is line one.
      This is line two.
      This is line three.

    # Result (newlines preserved):
    # "This is line one.\nThis is line two.\nThis is line three.\n"

Folded style (>):

    description: >
      This is line one.
      This is line two.
      This is line three.

    # Result (newlines → spaces):
    # "This is line one. This is line two. This is line three.\n"

Both styles also support chomping indicators — characters that control the trailing newline behaviour:

    a: |     # clip (default): single trailing newline
      text
    b: |-    # strip: no trailing newline
      text
    c: |+    # keep: all trailing newlines preserved
      text

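The semantics of the two styles and the three chomping modes can be modelled in TypeScript. This is a simplified sketch: the names `literal`, `folded`, and `chomp` are assumptions for this example, the input is the block's content lines with indentation already removed, and more-indented lines inside a folded block (which YAML treats specially) are not handled.

```typescript
type Chomp = "clip" | "strip" | "keep";

// Apply a chomping mode to content that ends in one or more newlines.
function chomp(content: string, mode: Chomp): string {
  const withoutTrailing = content.replace(/\n+$/, "");
  if (mode === "strip") return withoutTrailing;
  if (mode === "keep") return content;
  return withoutTrailing === "" ? "" : withoutTrailing + "\n"; // clip
}

// Literal style (|): every line break is kept as written.
function literal(lines: string[], mode: Chomp = "clip"): string {
  return chomp(lines.join("\n") + "\n", mode);
}

// Folded style (>): a single line break becomes a space,
// while each blank line becomes a newline.
function folded(lines: string[], mode: Chomp = "clip"): string {
  let out = "";
  for (let i = 0; i < lines.length; i++) {
    if (lines[i] === "") {
      out += "\n";
      continue;
    }
    const previous = i > 0 ? lines[i - 1] : "";
    if (out !== "" && previous !== "") out += " ";
    out += lines[i];
  }
  return chomp(out + "\n", mode);
}
```

With the three-line description used earlier, `literal` keeps two interior newlines and `folded` produces a single line, each ending in one newline under the default clip mode.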

    # Integers
    count: 42        # integer
    hex: 0xFF        # 255
    octal: 0o77      # 63 (YAML 1.2 syntax)
    old_oct: 077     # 63 in YAML 1.1, decimal 77 in YAML 1.2!

    # Floats
    ratio: 3.14
    big: .inf        # positive infinity
    small: -.inf     # negative infinity
    undef: .nan      # Not a Number

    # Dates: YAML parses these automatically
    published: 2024-01-15            # Date object, not a string!
    updated: 2024-01-15T10:30:00Z    # full ISO 8601 timestamp
    safe_date: "2024-01-15"          # force string with quotes

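The integer rules are where the two YAML versions diverge most quietly. The sketch below uses a hypothetical `parseYamlInteger` helper to show the difference; it handles signs uniformly for brevity and ignores YAML 1.1 extras such as underscores and binary literals.

```typescript
// Resolve an unquoted scalar as an integer under YAML 1.1 or 1.2 rules.
// Returns null when the text would not be treated as an integer.
function parseYamlInteger(text: string, version: "1.1" | "1.2"): number | null {
  const match = /^([-+]?)(.+)$/.exec(text);
  if (match === null) return null;
  const sign = match[1] === "-" ? -1 : 1;
  const digits = match[2];
  if (/^0x[0-9a-fA-F]+$/.test(digits)) return sign * parseInt(digits.slice(2), 16);
  // 0o77 is the YAML 1.2 octal form; a 1.1 parser sees a plain string.
  if (version === "1.2" && /^0o[0-7]+$/.test(digits)) return sign * parseInt(digits.slice(2), 8);
  // A bare leading zero means octal in YAML 1.1 only.
  if (version === "1.1" && /^0[0-7]+$/.test(digits)) return sign * parseInt(digits, 8);
  if (version === "1.1" && /^0[0-9]+$/.test(digits)) return null; // e.g. 089
  if (/^[0-9]+$/.test(digits)) return sign * parseInt(digits, 10);
  return null;
}
```

Here `parseYamlInteger("077", "1.1")` gives 63 and `parseYamlInteger("077", "1.2")` gives 77. Neither preserves the leading zero, so a zero-padded identifier needs quotes in both versions.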
Astro's content collections accept pubDate as a z.date() in Zod schema, which means the automatic date parsing actually helps you. But if you're reading the front matter raw and expecting a string, you'll get a Date object instead. Define your Zod schema explicitly — then both the type and the behaviour are under your control.
Real-world front matter is rarely flat. Once you need to model relationships — an author with a name and a role, a list of links each with a URL and label — you need to understand how YAML's nesting rules actually work.
YAML uses indentation to express structure, much as Python does for code blocks, and the rules are strict: a nested node must be indented more than its parent, and sibling entries must line up at exactly the same column. Two spaces per level is the convention; four works too. The spec does not require the same step at every level, but pick one and keep it consistent within a document. Misalign siblings and the parser will either reject the file or attach a key to the wrong parent.

    ---
    title: Deep Dive

    # Nested mapping: author is an object
    author:
      name: Alice Chen
      role: senior editor
      social:
        github: alicechen
        twitter: alicec

    # Sequence of scalars: simple list
    tags:
      - yaml
      - front-matter
      - intermediate

    # Sequence of mappings: list of objects
    links:
      - title: Documentation
        url: https://yaml.org
      - title: YAML Spec
        url: https://yaml.org/spec/1.2
    ---

YAML explicitly prohibits tab characters for indentation. The spec says so unambiguously, and every major parser enforces it. Editors that auto-convert tabs to spaces mask this. Editors that don't will produce files that fail to parse, and the cause is hard to spot because tabs and spaces look identical on screen. If your YAML parser errors with "mapping values are not allowed here" or similar, the first thing to check is tab characters.
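Since tabs are invisible in most editors, a quick scan can save time. The function below is a minimal sketch with a made-up name, `findTabIndentedLines`. It flags any line whose leading whitespace contains a tab, which means it can also flag tab characters that legitimately start the content of a block scalar.

```typescript
// Return the 1-based line numbers whose leading whitespace contains a tab.
function findTabIndentedLines(source: string): number[] {
  const hits: number[] = [];
  const lines = source.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    // Zero or more spaces followed by a tab at the start of the line.
    if (/^ *\t/.test(lines[i])) hits.push(i + 1);
  }
  return hits;
}
```

Run it over the YAML portion of a file before parsing and report the returned line numbers alongside the parser's own error message.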
When the same data appears in multiple places, YAML gives you a way to define it once and reference it elsewhere. This is less common in front matter than in full YAML documents, but you'll see it in GitHub Actions workflows, Docker Compose files, and complex Hugo configurations.

    # & defines an anchor: names this node for reuse
    defaults: &post_defaults
      layout: post
      draft: false
      author: Alice

    # * is an alias: inserts the anchored value here
    published_post:
      <<: *post_defaults    # merge key: inherits all defaults
      title: Overrides title, keeps everything else
    # Result: published_post has layout, draft, author, AND title

    # Without merge key: direct scalar alias
    name: &the_name Alice
    display_name: *the_name    # also "Alice"

Docker Compose files use anchors together with the merge key (<<) to share service configuration, and CI configuration uses the same trick to share steps between jobs. GitHub Actions did not accept anchors for a long time, so check what your runner supports before relying on them there. If you're only writing Markdown front matter, you'll rarely need this. But when you move into CI/CD or Docker, understanding anchors means you can read and modify these files without guessing.
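Conceptually, the merge key is a shallow object merge in which explicitly written keys win over inherited ones. The sketch below models that on already-parsed data. `applyMergeKey` is a hypothetical helper for illustration, it handles a single merged mapping rather than a list of them, and real parsers perform this step internally.

```typescript
type Mapping = Record<string, unknown>;

// Resolve a "<<" merge key: copy the merged mapping first,
// then let the node's own keys override it.
function applyMergeKey(node: Mapping): Mapping {
  const result: Mapping = {};
  const merged = node["<<"];
  if (typeof merged === "object" && merged !== null && !Array.isArray(merged)) {
    const base = merged as Mapping;
    for (const key of Object.keys(base)) result[key] = base[key];
  }
  for (const key of Object.keys(node)) {
    if (key !== "<<") result[key] = node[key]; // explicit keys win
  }
  return result;
}
```

Applied to a post that merges the defaults above and sets its own `title`, the result carries `layout`, `draft`, and `author` from the defaults plus the post's own `title`.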
The YAML specification requires mapping keys to be unique, so a document with duplicate keys is invalid. In practice, parsers disagree: some reject the document, while others silently keep the last value. Linters will flag it; parsers may not. Never rely on key ordering or overriding behaviour.

    ---
    title: First title
    draft: true
    title: Second title    # which one wins?
    ---
    # js-yaml (Astro): throws "duplicated mapping key" by default
    # Ruby Psych (Jekyll): "Second title" (last wins, silently)
    # StrictYAML: throws an error
    # yamllint: reports an error ✓

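Because parsers disagree, a pre-parse check is a cheap safeguard. The following is a rough sketch, not a substitute for yamllint: `findDuplicateTopLevelKeys` is an invented name, and it only inspects simple unquoted keys that start in the first column.

```typescript
// Report top-level keys that appear more than once in a YAML document.
function findDuplicateTopLevelKeys(yamlText: string): string[] {
  const seen: Record<string, boolean> = {};
  const duplicates: string[] = [];
  for (const line of yamlText.split(/\r?\n/)) {
    // A simple key at column 0, followed by a colon and a space or end of line.
    const match = /^([A-Za-z0-9_-]+)\s*:(\s|$)/.exec(line);
    if (match === null) continue;
    const key = match[1];
    if (seen[key] === true && duplicates.indexOf(key) === -1) duplicates.push(key);
    seen[key] = true;
  }
  return duplicates;
}
```

For the front matter above, the function returns a list containing only `title`.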
The same front matter block can produce different results depending on the parser your tool uses. Knowing which parser each major tool uses — and what choices it makes — prevents a class of production bugs.
| Tool | Parser | YAML Version | Key behaviour |
|---|---|---|---|
| Astro | js-yaml v4 | 1.2 | Only true/false are booleans. Optional Zod schema coercion on top. |
| Jekyll | Ruby Psych | 1.1 | Yes/no/on/off are booleans. Dates auto-parse. Country code gotcha is live here. |
| Hugo | Go yaml.v3 | 1.2 | YAML supported but TOML (+++) is the preferred default in Hugo projects. |
| Eleventy | js-yaml | 1.2 | Same behaviour as Astro's YAML parsing layer. |
| GitHub Actions | Go yaml.v3 | 1.2 | Full YAML document (no front matter delimiters). Same grammar, different context. |
Astro's content collections let you define a Zod schema that validates and coerces your front matter. This is the production-grade approach — instead of trusting implicit YAML type resolution, you declare exactly what types you expect:

    // Define the schema for your blog collection
    import { defineCollection, z } from 'astro:content';

    const blog = defineCollection({
      schema: z.object({
        title: z.string(),
        // Expects a Date: an unquoted 2024-01-15 already arrives as one.
        // Use z.coerce.date() to also accept quoted date strings.
        pubDate: z.date(),
        draft: z.boolean().default(false),
        tags: z.array(z.string()).optional(),
        author: z.object({
          name: z.string(),
          role: z.string().optional()
        }).optional(),
      })
    });

    export const collections = { blog };

With a Zod schema, Astro validates your front matter at build time and gives you TypeScript types in your page components. Instead of frontmatter.pubDate being string | Date | null | undefined, it's typed as Date. You shift from runtime guessing to compile-time certainty.
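To see what a schema layer does for you, it can help to write the checks by hand once. The sketch below is plain TypeScript standing in for Zod, not Zod's API: `parseBlogFrontMatter` and `BlogFrontMatter` are names invented for this example, the `author` field is omitted for brevity, and unlike `z.date()` it also accepts a quoted date string.

```typescript
interface BlogFrontMatter {
  title: string;
  pubDate: Date;
  draft: boolean;
  tags?: string[];
}

// Validate parsed front matter and return a fully typed object, or throw.
function parseBlogFrontMatter(data: Record<string, unknown>): BlogFrontMatter {
  const title = data["title"];
  if (typeof title !== "string") throw new Error("title must be a string");

  // Accept a Date (unquoted YAML date) or a string (quoted date).
  const rawDate = data["pubDate"];
  const pubDate = rawDate instanceof Date
    ? rawDate
    : new Date(typeof rawDate === "string" ? rawDate : NaN);
  if (isNaN(pubDate.getTime())) throw new Error("pubDate must be a valid date");

  let draft = false; // default when the key is absent
  const rawDraft = data["draft"];
  if (typeof rawDraft === "boolean") draft = rawDraft;
  else if (rawDraft !== undefined) throw new Error("draft must be a boolean");

  let tags: string[] | undefined;
  const rawTags = data["tags"];
  if (Array.isArray(rawTags)) {
    tags = [];
    for (const tag of rawTags) {
      if (typeof tag !== "string") throw new Error("tags must be strings");
      tags.push(tag);
    }
  } else if (rawTags !== undefined) {
    throw new Error("tags must be a list");
  }

  return { title, pubDate, draft, tags };
}
```

A schema library replaces all of this with a declaration, but the effect is the same: a value such as `draft: "NO"` fails loudly at build time instead of flowing into a template as the wrong type.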
| Format | Delimiter | Type system | Best for |
|---|---|---|---|
| YAML | --- | Implicit (inferred) | Human-authored content. Expressive, but edge cases require care. |
| TOML | +++ | Explicit (declared) | Config files where correctness matters more than terseness. No implicit typing. |
| JSON | {...} | Explicit (declared) | Machine-generated content. Strict spec, no comments, no ambiguity. |
YAML:

    ---
    title: My Post
    date: 2024-01-15
    draft: false
    tags:
      - yaml
      - tutorial
    ---

TOML:

    +++
    title = "My Post"
    date = 2024-01-15T00:00:00Z
    draft = false
    tags = ["yaml", "tutorial"]
    +++

TOML's key difference: all strings must be quoted, all booleans are exactly true/false, and dates have a prescribed format. There is no implicit typing — what you write is unambiguously what you get. The tradeoff is verbosity. YAML wins on brevity; TOML wins on predictability.
A condensed reference of everything in this guide — data types, multiline string variants, gotchas, and tool behaviour at a glance.
In YAML 1.1 (Ruby/Psych, PyYAML), yes, no, on, off, y, n and their case variants are booleans. Country code NO becomes false. Fix: quote any value that could be misread.
YAML explicitly forbids tabs for indentation. Only spaces are legal. This is enforced by all conforming parsers. Configure your editor to use spaces in YAML files.
An unquoted 2024-01-15 is a Date object in most parsers, not a string. If you're reading the value and expecting a string, you'll get a Date. Fix: quote it, or use a Zod schema to control coercion.
Duplicate keys are invalid per the spec, but many parsers silently keep the last value instead of raising an error. Linters (yamllint) will catch this. Don't rely on key ordering or shadowing behaviour.
YAML 1.1: 077 is octal 63. YAML 1.2: 077 is the decimal integer 77. Use 0o77 when you mean octal under a YAML 1.2 parser (a 1.1 parser reads it as a string), and quote the value when the leading zero matters, as in a zero-padded ID.
The characters :, #, [, ], {, }, |, >, !, &, * have special meaning in YAML. If any appear in your value at the start or after a space, quote the whole value.
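The quoting rule above can be turned into a conservative check for generated front matter. This is a sketch with an invented name, `needsQuoting`; it errs on the side of quoting too often, and it does not test for values that resemble booleans, numbers, or dates.

```typescript
// Characters with special meaning at the start of a value or after a space.
const SPECIAL_AFTER_BOUNDARY = /(^|\s)[:#\[\]{}|>!&*]/;

// Decide whether a string should be quoted before being written as a YAML value.
function needsQuoting(value: string): boolean {
  // Empty values would read as null; edge whitespace would be trimmed.
  if (value === "" || value !== value.trim()) return true;
  if (SPECIAL_AFTER_BOUNDARY.test(value)) return true;
  // A colon followed by a space (or ending the value) starts a mapping.
  return /:(\s|$)/.test(value);
}
```

A title such as `Note: read this` or `issue #42` is flagged, while `C# guide` and `https://yaml.org` pass, because in those the special character neither starts the value nor follows a space.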
yamllint — CLI linter, catches tabs, duplicate keys, trailing spaces, and more. yaml.online-parser.appspot.com — paste YAML, see the parsed structure instantly. Astro's built-in type checking — run astro check to validate front matter against your Zod schemas.