---
title: "File enrichment glossary | LLMind"
description: "Definitions for file enrichment, XMP metadata, C2PA, MCP, and every term that matters when you work with LLM-ready files."
url: https://llmind.org/glossary/
source_format: html
---
# File enrichment glossary

Definitions for the terms that matter when you work with LLM-ready files — file enrichment, XMP, C2PA, MCP, tamper-evident metadata, and the rest.

-   [**LLM-ready file**](https://llmind.org/glossary/llm-ready-file/) — A file whose metadata carries a structured, AI-readable semantic layer so language models can consume it without re-parsing.
-   [**File enrichment engine**](https://llmind.org/glossary/file-enrichment-engine/) — Software that writes structured, signed semantic metadata into the file's own XMP packet.
-   [**Self-describing file**](https://llmind.org/glossary/self-describing-file/) — A file whose embedded metadata includes the information needed for AI tools to understand its content without external lookups.
-   [**Semantic layer for files**](https://llmind.org/glossary/semantic-layer-files/) — A structured, AI-readable metadata stratum inside the file itself — analogous to a BI semantic layer, applied to files.
-   [**LRFS**](https://llmind.org/glossary/lrfs/) — The LLM-Ready File Specification — LLMind's open spec for in-file semantic metadata and signing.
-   [**C2PA**](https://llmind.org/glossary/c2pa/) — Coalition for Content Provenance and Authenticity — an open standard for cryptographically signed provenance metadata in images, video, audio, and documents.
-   [**Content Credentials**](https://llmind.org/glossary/content-credentials/) — The user-facing name for C2PA-based signed provenance metadata, promoted by Adobe and the Content Authenticity Initiative.
-   [**Tamper-evident metadata**](https://llmind.org/glossary/tamper-evident-metadata/) — File metadata that can be cryptographically verified as unmodified since signing.
-   [**Signed semantic metadata**](https://llmind.org/glossary/signed-semantic-metadata/) — Metadata describing a file's meaning, cryptographically signed so consumers can verify integrity.
-   [**Provenance**](https://llmind.org/glossary/provenance/) — The verifiable origin and modification history of a file or its content.
-   [**HMAC-SHA256**](https://llmind.org/glossary/hmac-sha256/) — A keyed message-authentication code built on SHA-256 — LLMind's default signing primitive for LRFS layers.
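-   [**File checksum**](https://llmind.org/glossary/file-checksum/) — A fixed-length digest (typically SHA-256) computed from a file's byte contents; distinct files produce distinct digests with overwhelming probability, so it serves as a practical identifier.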
-   [**XMP**](https://llmind.org/glossary/xmp/) — Extensible Metadata Platform — Adobe's standard (also published as ISO 16684-1) for embedded, structured file metadata, serialized as RDF/XML.
-   [**EXIF**](https://llmind.org/glossary/exif/) — Exchangeable Image File Format — a long-established, widely used standard for image metadata, typically for camera-origin data.
-   [**IPTC**](https://llmind.org/glossary/iptc/) — International Press Telecommunications Council — metadata standard for news photography and editorial imagery.
-   [**Dublin Core**](https://llmind.org/glossary/dublin-core/) — A minimal 15-element metadata vocabulary for describing any resource — often reused as an XMP schema.
-   [**XMP namespace**](https://llmind.org/glossary/xmp-namespace/) — A URI that identifies the vocabulary of XMP properties a file uses — LLMind's is https://llmind.org/ns/1.0/.
-   [**Sidecar file**](https://llmind.org/glossary/sidecar-file/) — A separate file (often .xmp) that stores metadata for a primary file, rather than embedding metadata inside it.
-   [**Embedded metadata**](https://llmind.org/glossary/embedded-metadata/) — Metadata stored inside the file itself — e.g., in the XMP packet — rather than in a separate sidecar or database.
-   [**MCP**](https://llmind.org/glossary/mcp/) — Model Context Protocol — Anthropic's open standard for connecting AI agents to tools, files, and data sources.
-   [**RAG**](https://llmind.org/glossary/rag/) — Retrieval-Augmented Generation — a pattern where an LLM retrieves relevant chunks before generating a response.
-   [**Vector database**](https://llmind.org/glossary/vector-database/) — A database optimized for nearest-neighbor search over high-dimensional embedding vectors — common in RAG stacks.
-   [**Chunking**](https://llmind.org/glossary/chunking/) — Splitting a document into smaller passages for embedding and retrieval in a RAG pipeline.
-   [**Embedding**](https://llmind.org/glossary/embedding/) — A dense vector representation of text (or other data) where semantic similarity corresponds to vector proximity.
-   [**Context window**](https://llmind.org/glossary/context-window/) — The maximum amount of text an LLM can read at once, measured in tokens.
-   [**AI agent**](https://llmind.org/glossary/ai-agent/) — An LLM-powered program that can plan, call tools, and act in a loop to accomplish a goal.
-   [**AI Overview**](https://llmind.org/glossary/ai-overview/) — Google's generative answer feature that synthesizes responses from multiple sources and cites them.
-   [**OCR**](https://llmind.org/glossary/ocr/) — Optical Character Recognition — extracting text from an image or scanned page.
-   [**IDP**](https://llmind.org/glossary/idp/) — Intelligent Document Processing — AI-assisted extraction of structured data from documents, combining OCR with document understanding.
-   [**DAM**](https://llmind.org/glossary/dam/) — Digital Asset Management — software that stores, organizes, and distributes media files across an organization.
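
Several of the terms above (file checksum, HMAC-SHA256, signed semantic metadata, tamper-evident metadata) fit together in one small workflow. The following is a minimal sketch of that idea using only Python's standard library, not the LRFS format itself: the field names `checksum` and `summary`, the demo key, and the choice of sorted-key JSON as the canonical serialization are all assumptions made for illustration. A real LRFS layer lives in the file's XMP packet and follows the spec.

```python
import hashlib
import hmac
import json


def file_checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the file bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def sign_layer(layer: dict, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the metadata layer.

    Assumption: the layer is serialized as compact JSON with sorted keys
    so that signer and verifier hash exactly the same bytes.
    """
    payload = json.dumps(layer, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_layer(layer: dict, signature: str, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign_layer(layer, key), signature)


# Hypothetical inputs for the demonstration.
key = b"demo-key-not-for-production"
content = b"example file bytes"

# Hypothetical metadata layer: a checksum of the content plus one semantic field.
layer = {"checksum": file_checksum(content), "summary": "Example document"}
signature = sign_layer(layer, key)

print(verify_layer(layer, signature, key))     # True: layer unchanged since signing

tampered = {**layer, "summary": "Edited"}
print(verify_layer(tampered, signature, key))  # False: modification is detected
```

Because HMAC is a keyed construction, only a party holding the same secret key can produce or verify the tag. That is what makes the metadata tamper-evident for key holders, and it also means HMAC alone does not give third parties a way to verify origin the way a public-key signature would.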

## Explore more

-   [Learn](https://llmind.org/learn/)
-   [Spec](https://llmind.org/spec/)
