Enrichment vs. chunking

Published 2026-04-21 · 5 min read

Chunking is a RAG pipeline stage. File enrichment is a file property. The two work at different layers of the stack, and they compose well together.

The question comes up almost every time someone new hears about file enrichment: does it replace chunking? No. They solve different problems at different layers, and a well-built system usually does both.

What chunking is, and why it exists

Chunking is a RAG pipeline stage. When you want retrieval-augmented generation to find relevant content inside a large document, you split the document into smaller pieces — chunks — and embed each piece as a vector. At query time, you embed the user's question, find the nearest chunk vectors, and stuff the matching chunks into the LLM's context window.
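The flow above can be sketched in a few lines. This is a toy illustration, not any framework's implementation: `embed` is a hypothetical stand-in that counts words instead of calling a real embedding model, and the names `chunk_by_words`, `cosine`, and `retrieve` are made up for this example.

```python
import math
import re
from collections import Counter


def chunk_by_words(text: str, size: int = 40) -> list[str]:
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a lowercase word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks whose vectors are closest to the question vector."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


document = (
    "Invoices are due within thirty days of receipt. "
    "Late payments incur a two percent monthly fee. "
    "Refunds are issued to the original payment method. "
    "Support is available by email on weekdays."
)
question = "What is the fee for late payments?"

chunks = chunk_by_words(document, size=8)    # ingest: split the document
top = retrieve(question, chunks, k=1)        # query: find the nearest chunk
prompt = "Context:\n" + "\n".join(top) + "\n\nQuestion: " + question
```

A production pipeline would swap `embed` for a real model and the sorted list for a vector index, but the shape of the flow stays the same: split, embed, rank, assemble the prompt.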

Every RAG framework has its own chunking strategy: fixed token counts, paragraph boundaries, semantic similarity, hierarchical trees. None is perfect. Chunking is an intrinsically lossy operation: you turn a document into a bag of fragments, and some meaning is lost at every boundary.
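To make the boundary problem concrete, here is a small sketch comparing two of the strategies named above. The function names and the sample text are invented for illustration, and word counts stand in for token counts.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Fixed-count strategy: cut every `size` words, ignoring sentence structure."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def paragraph_chunks(text: str) -> list[str]:
    """Paragraph strategy: cut at blank lines."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


text = (
    "The warranty covers parts for two years.\n\n"
    "It does not cover water damage."
)

fixed = fixed_size_chunks(text, size=5)
paras = paragraph_chunks(text)
```

With a size of five words, the fixed-count split separates "does not" from "cover water damage", so the third chunk on its own reads as the opposite of what the document says. The paragraph split keeps each sentence whole, but the second chunk still opens with "It", whose referent now lives in a different chunk.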

What enrichment is

Enrichment is a file-level operation. You run it once per file; the result — extracted text, document structure, summary, signed checksum — lives inside the file's own metadata. Any downstream tool reading the file gets the result for free.
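As a rough mental model of "run once, carry the result with the file", the sketch below builds an enrichment record and signs it. Everything here is an assumption made for illustration: the key names, the `build_record` and `verify_record` helpers, and the use of HMAC-SHA256 as the signing scheme are not taken from LLMind's actual format.

```python
import hashlib
import hmac
import json


def _canonical(payload: dict) -> bytes:
    """Serialize deterministically so equal content always signs the same way."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_record(text: str, structure: dict, summary: str, key: bytes) -> dict:
    """Hypothetical enrichment record: content fields plus a signature over them."""
    payload = {"text": text, "structure": structure, "summary": summary}
    signature = hmac.new(key, _canonical(payload), hashlib.sha256).hexdigest()
    return {**payload, "signature": signature}


def verify_record(record: dict, key: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    payload = {k: v for k, v in record.items() if k != "signature"}
    expected = hmac.new(key, _canonical(payload), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record.get("signature", ""))


key = b"demo-key"  # placeholder secret for the example only
record = build_record("Hello", {"headings": ["Intro"]}, "A greeting.", key)
tampered = {**record, "text": "Hello!"}
```

Here `verify_record(record, key)` succeeds and `verify_record(tampered, key)` fails, which is the tamper-evident property in miniature: any edit to the stored content invalidates the signature.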

See "What is file enrichment?" for the full explanation.

Different layers, different jobs

The cleanest mental model:

Enrichment runs at the file / content layer. It answers “what does this file contain?” — once.
Chunking runs at the RAG pipeline layer. It answers “how should I split this content for retrieval?” — per use case.

Enrichment solves the parsing and understanding problem once, at the source. Chunking solves the retrieval problem inside each pipeline: it splits content at index time so that relevant passages can be found at query time. They don't compete.

Enrichment makes chunking easier, not obsolete

When a RAG pipeline encounters a LLMind-enriched PDF, it skips re-parsing. The llmind:text field gives clean extracted text; llmind:structure gives the document's heading and table layout as JSON. The pipeline can still chunk that content for its vector index — and it can chunk it more intelligently because it has the document structure up front.

Put differently: enrichment removes the OCR and parsing stages from your RAG ingest. It does not remove chunking, embedding, or retrieval.
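Here is one way a pipeline might chunk along document structure instead of at arbitrary offsets. The schema is an assumption: this sketch pretends the structure JSON carries a flat `headings` list of title strings, which may not match what llmind:structure actually contains, and `chunk_by_headings` is a hypothetical helper.

```python
import json


def chunk_by_headings(text: str, structure_json: str) -> list[dict]:
    """Start a new chunk at every line that matches a known heading."""
    headings = set(json.loads(structure_json)["headings"])  # assumed schema
    chunks = []
    current = {"heading": None, "lines": []}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line in headings:
            if current["lines"]:
                chunks.append(current)
            current = {"heading": line, "lines": []}
        elif line:
            current["lines"].append(line)
    if current["lines"]:
        chunks.append(current)
    return [{"heading": c["heading"], "text": " ".join(c["lines"])} for c in chunks]


extracted_text = (
    "Returns\n"
    "Items may be returned within 30 days.\n"
    "Refunds take five days.\n"
    "Shipping\n"
    "Orders ship in two business days."
)
structure_json = '{"headings": ["Returns", "Shipping"]}'

sections = chunk_by_headings(extracted_text, structure_json)
```

Each chunk keeps its heading alongside its text, so the embedding step can include the section title as context. Sections with no body text are dropped in this sketch.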

When to use each (and when to use both)

Use chunking when

You're building a RAG pipeline that needs fast semantic retrieval
You have a large corpus and need relevance ranking across many documents
Queries don't map to entire documents — users want passages, not files

Use enrichment when

Files need to be read by multiple AI tools (Claude, ChatGPT, Cursor, NotebookLM, MCP servers)
Large PDFs, scanned documents, or audio transcripts re-parse slowly on every load
You need tamper-evident provenance on document content and semantic metadata
You want files to be natively readable by AI tools without a retrieval layer at all

Use both when

Your RAG pipeline ingests from multiple upstream sources and needs consistent, cached parsing
You want to reduce pipeline cost and latency by removing parse-and-OCR from ingest
Downstream tools outside the RAG pipeline also need to read the files

The practical difference

A RAG system without enrichment re-parses and re-OCRs every file on every ingest. A RAG system with enrichment reads the cached layer and skips straight to chunking and embedding. Retrieval works the same way; ingest is cheaper and faster, and its results are reproducible because every run starts from the same extracted text.
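In code, the difference comes down to one branch at the start of ingest. The sketch below is hypothetical: `text_for_ingest` and `slow_parse` are illustrative names, and how a pipeline actually reads the cached text out of an enriched file is not shown.

```python
from typing import Callable, Optional


def text_for_ingest(
    enriched_text: Optional[str],
    raw: bytes,
    parse: Callable[[bytes], str],
) -> str:
    """Prefer the cached enrichment text; fall back to parsing the raw bytes."""
    if enriched_text is not None:
        return enriched_text
    return parse(raw)


parse_calls = []


def slow_parse(raw: bytes) -> str:
    """Stand-in for an expensive parse/OCR step; records each call for the demo."""
    parse_calls.append(raw)
    return raw.decode("utf-8")


from_cache = text_for_ingest("cached text", b"ignored", slow_parse)
from_parse = text_for_ingest(None, b"parsed text", slow_parse)
```

Only the second call reaches `slow_parse`. Everything after this point (chunking, embedding, indexing) is identical in both cases.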

Try it

pipx install 'llmind-cli[all]'
llmind enrich myfile.pdf
