Enrichment vs parsing

Published 2026-04-22 · 5 min read

Parsing and enrichment are often mentioned together, but they're different layers. Parsing is the extraction step: take a PDF, produce structured text. Enrichment is what happens after: take the structured text and persist it somewhere consumers can read. LLMind lives at the enrichment layer. LlamaParse, Docling, Reducto, Textract, Mistral OCR, and Azure Document Intelligence all live at the parsing layer. They're complementary, not competitive.

What parsing does

Parsers convert opaque file bytes into structured data. For PDFs, that means text, tables, figure captions, section hierarchies. For images, OCR plus layout analysis. For audio, transcription. Each parser has its strengths—LlamaParse handles academic papers well, Docling is strong on structured documents, Textract handles tables cleanly, Mistral OCR is affordable for vision tasks.

Picking the right parser is a content-specific decision. A financial PDF with dense tables might favor Textract or Docling. An academic research paper might benefit from LlamaParse's understanding of citations and footnotes. A scanned document with unusual layouts might need Azure Document Intelligence's layout analysis.

The parser's job is to extract. Run the extraction once, get structured output. That output is ephemeral—it lives in memory, gets logged, maybe dumped to a scratch directory. When the next pipeline runs on the same file, the parser runs again, generating the same output from the same bytes. Expensive work, repeated.

What enrichment does

Enrichment takes parser output and puts it somewhere durable. The traditional answer was a sidecar JSON file (the parser output next to the original), an ops database (indexed by path or hash), or a scratch directory full of extracted-content files. All three decay.

Files get mirrored to a new system—the sidecar doesn't follow. Files get renamed—the database lookup breaks. Files get copied to a colleague—the scratch directory is left behind. The parser output drifts away from the file.
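The drift is easy to reproduce. The sketch below uses only standard shell tools and hypothetical file names (report.pdf, report.pdf.json, index.tsv); it stands in for any sidecar or path-keyed store, not for a specific product.

```shell
#!/bin/sh
# Sketch: sidecar files and path-keyed lookups lose track of a file that moves.
# All file names here are hypothetical placeholders.
work=$(mktemp -d)
mkdir "$work/inbox" "$work/archive"

printf 'fake pdf bytes\n' > "$work/inbox/report.pdf"
printf '{"text":"parsed output"}\n' > "$work/inbox/report.pdf.json"   # sidecar file
printf '%s\t%s\n' "$work/inbox/report.pdf" 'parsed output' > "$work/index.tsv"   # path-keyed index

# Someone moves and renames the file without knowing about the sidecar or the index.
mv "$work/inbox/report.pdf" "$work/archive/q3-report.pdf"
moved="$work/archive/q3-report.pdf"

# The sidecar stayed behind in the old directory.
if [ -e "$moved.json" ]; then sidecar_status=found; else sidecar_status=missing; fi

# The index still holds the old path, so a lookup by the new path returns nothing.
index_hit=$(awk -F '\t' -v p="$moved" '$1 == p { print $2 }' "$work/index.tsv")
if [ -n "$index_hit" ]; then index_status=found; else index_status=missing; fi

echo "sidecar: $sidecar_status"
echo "index lookup: $index_status"
rm -rf "$work"
```

Both checks report missing: the parsed output still exists, but nothing connects it to the file anymore.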

LLMind's answer: write parser output into the file's own XMP packet. Not a separate file, not a database—the file's native metadata container. Idempotent, portable, signed. The file carries its own interpretation with it, wherever it goes.

Why "different layer" matters

Users sometimes ask whether LLMind "competes with" LlamaParse or Textract. The honest answer is no—those tools extract; LLMind persists. You're not choosing between them. A sensible pipeline runs a parser first, then runs LLMind on the output.

Think of it like a data warehouse. Your ETL tool (parser) extracts data. A semantic layer (Looker, dbt, Cube) sits on top, turning raw rows into agreed-upon metrics. The semantic layer doesn't replace ETL—it depends on it. Downstream tools consume the semantic layer, not the raw tables.

Same idea with files. Your parser (LlamaParse, Docling, Textract) extracts structure. LLMind enrichment sits on top, turning parsed output into portable, signed, in-file metadata. Downstream tools consume the enriched file, not the ephemeral parser output.

The cost argument

Parsing is expensive: GPU time if you self-host, API cost if you outsource. Re-parsing the same file across pipelines wastes that cost. A 200-page PDF might take eight seconds to parse the first time, and cost $0.10 via LlamaParse, $0.50 via a cloud Document AI service, or significant compute if you run your own OCR.

Enrichment saves that cost. Parse once, cache the result in the file, read it forever. The second pipeline that runs on the same file doesn't re-parse—it reads the cached enrichment layer. All downstream consumers get the benefit.
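As a rough worked example, the arithmetic looks like this. The per-parse cost is the illustrative LlamaParse figure quoted above; the pipeline count is a hypothetical, and the sketch treats reading the cached layer as free.

```shell
#!/bin/sh
# Sketch: parse cost for one file, with and without a cached layer.
cost_per_parse=0.10   # illustrative per-file cost in USD (figure quoted in the text)
pipelines=5           # hypothetical number of pipelines that read the same file

# Without a cache, every pipeline pays for its own parse.
reparse_cost=$(awk -v c="$cost_per_parse" -v n="$pipelines" 'BEGIN { printf "%.2f", c * n }')
# With a cache, only the first pipeline pays.
cached_cost=$(awk -v c="$cost_per_parse" 'BEGIN { printf "%.2f", c }')
saved=$(awk -v a="$reparse_cost" -v b="$cached_cost" 'BEGIN { printf "%.2f", a - b }')

echo "re-parse in every pipeline: \$$reparse_cost"
echo "parse once, read cache:     \$$cached_cost"
echo "saved per file:             \$$saved"
```

The saving scales with both the number of files and the number of pipelines that touch each one.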

Concrete example

Parse with LlamaParse (or Docling, Reducto, Textract, etc.):

# Assumes the parser emits JSON on stdout with .text and .tables fields
parsed=$(llama-parse paper.pdf)
extracted_text=$(printf '%s\n' "$parsed" | jq -r '.text')
extracted_tables=$(printf '%s\n' "$parsed" | jq -r '.tables')

Enrich: cache the parsed result in the file's XMP packet:

llmind enrich \
  --layer parsed \
  --from-text "$extracted_text" \
  --from-tables "$extracted_tables" \
  paper.pdf

Next pipeline reads the cached layer instead of re-parsing:

parsed_text=$(llmind inspect paper.pdf --layer parsed --text)
parsed_tables=$(llmind inspect paper.pdf --layer parsed --tables)
echo "$parsed_text"

The file travels—to a colleague, to a cloud service, to an archive. The enrichment layer travels with it. No database lookup, no sidecar file, no re-parsing. Every consumer reads the same cached result.
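In practice a pipeline would likely wrap the two paths in one cache-first helper. The following is a minimal sketch, not a documented LLMind pattern: the function names are hypothetical, and it assumes that `llmind inspect` exits non-zero or prints nothing when the parsed layer is absent, which may not match the real behavior.

```shell
#!/bin/sh
# Sketch: read the cached layer if present, otherwise parse once and enrich.
# Assumption: `llmind inspect` fails or prints nothing when the layer is absent.

# Thin wrappers around the commands shown above, so they can be swapped or stubbed.
read_cached_text() {
  llmind inspect "$1" --layer parsed --text
}

parse_and_enrich() {
  # Assumes the parser emits JSON with .text and .tables fields.
  _parsed=$(llama-parse "$1") || return 1
  llmind enrich \
    --layer parsed \
    --from-text "$(printf '%s\n' "$_parsed" | jq -r '.text')" \
    --from-tables "$(printf '%s\n' "$_parsed" | jq -r '.tables')" \
    "$1"
}

# Print the parsed text for a file, running the parser only on a cache miss.
parsed_text_for() {
  if _text=$(read_cached_text "$1" 2>/dev/null) && [ -n "$_text" ]; then
    printf '%s\n' "$_text"
  else
    parse_and_enrich "$1" && read_cached_text "$1"
  fi
}
```

Calling `parsed_text_for paper.pdf` from any pipeline then pays the parsing cost at most once per file, as long as the enriched file is the one passed along.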

FAQ

Does LLMind replace LlamaParse or Textract?

No. LlamaParse, Docling, Reducto, Textract, and similar tools live at the parsing layer—they extract structure from raw files. LLMind lives at the enrichment layer—it caches the parsed result inside the file for reuse. A sensible pipeline runs a parser first, then runs LLMind on the output, so every downstream consumer reads the cached layer instead of re-parsing.

Why not store the parsed result in an ops database?

Databases are fragile when keyed by file path, filename, or even file hash. When files move (copied, renamed, re-uploaded, mirrored), the metadata in the database drifts away from the file. In-file metadata (embedded XMP) is self-contained: it travels with the file, survives copying and renaming, and works offline. The file is the container; the metadata is part of the file.
