---
title: "LLMind benchmarks | LLMind"
description: "Reference measurements for LLMind file enrichment — file-size overhead, signing throughput, read-time comparison vs. re-parsing with LlamaParse, Docling, Textract, Mistral OCR."
url: https://llmind.org/benchmarks/
source_format: html
---
# LLMind benchmarks

Reference measurements for LLMind file enrichment: file-size overhead, signing throughput, and read-time comparison against re-parsing with LlamaParse, Docling, Textract, and Mistral OCR.

**Measurements pending.** The tables below describe what Sprint 3 measures against the reference corpus at [huggingface.co/datasets/llmind/reference-enriched-pdfs-v1](https://huggingface.co/datasets/llmind/reference-enriched-pdfs-v1). Cells show `—` until the measurement protocol runs. The methodology is committed at [docs/benchmarks/sprint-3-methodology.md](https://github.com/dmitryrollins/llmind-site/blob/main/docs/benchmarks/sprint-3-methodology.md) and the raw CSV at [docs/benchmarks/sprint-3-data.csv](https://github.com/dmitryrollins/llmind-site/blob/main/docs/benchmarks/sprint-3-data.csv). Third parties can rerun and publish alternative numbers against the same corpus.

## Test corpus

100 public-domain PDFs at [huggingface.co/datasets/llmind/reference-enriched-pdfs-v1](https://huggingface.co/datasets/llmind/reference-enriched-pdfs-v1): 40 small (<500KB), 40 medium (500KB–5MB), 20 large (5–50MB). Content mix: technical papers, scanned facsimiles, government reports, synthetic research PDFs. Licensed for redistribution (CC0, public domain, or US government works).

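For anyone re-deriving the size split from the corpus, the bucketing can be sketched as below. This is an illustration only: the function name `size_bucket`, the use of decimal units (1 KB = 1000 bytes), and the handling of files that sit exactly on a boundary are assumptions, not something the methodology on this page specifies.

```python
# Hypothetical helper: assign a PDF to one of the three corpus size buckets.
# Assumptions (not from the methodology): decimal units, and a file exactly
# at 500 KB or 5 MB falls into the bucket whose range starts or ends there
# as coded below.
KB = 1000
MB = 1000 * KB


def size_bucket(num_bytes: int) -> str:
    """Return 'small', 'medium', 'large', or 'out-of-range' for a file size."""
    if num_bytes < 500 * KB:
        return "small"
    if num_bytes <= 5 * MB:
        return "medium"
    if num_bytes <= 50 * MB:
        return "large"
    return "out-of-range"
```

Calling `size_bucket(os.path.getsize(path))` on each corpus file would be one way to check the 40/40/20 split.
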
## File-size overhead

The XMP semantic layer LLMind writes adds bytes to the file. This table measures how many — as a percentage of the original, and in absolute bytes.

| Measurement | Value | Notes |
| --- | --- | --- |
| avg-overhead-percent | — | Average XMP payload size as % of original file bytes across the 100-PDF corpus. |
| median-overhead-percent | — | Median (less sensitive to huge files). |
| p95-overhead-percent | — | 95th percentile — worst case for small PDFs. |
| avg-absolute-bytes | — | Average bytes added per file (for intuition — expect single-digit KB). |

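The four rows above reduce to simple summary statistics over per-file sizes. The sketch below shows one way to compute them. It assumes the XMP payload size can be approximated as enriched size minus original size, and it uses a nearest-rank 95th percentile; both choices, along with the function name `overhead_summary`, are illustrative assumptions rather than the committed protocol.

```python
import math
import statistics


def overhead_summary(pairs):
    """Summarize overhead for a list of (original_bytes, enriched_bytes) pairs.

    Assumption: bytes added = enriched_bytes - original_bytes.
    """
    added = [enriched - original for original, enriched in pairs]
    percents = [100.0 * extra / original
                for extra, (original, _) in zip(added, pairs)]

    # Nearest-rank percentile: smallest value with at least 95% of the
    # samples at or below it.
    ordered = sorted(percents)
    rank = max(1, math.ceil(0.95 * len(ordered)))

    return {
        "avg-overhead-percent": statistics.fmean(percents),
        "median-overhead-percent": statistics.median(percents),
        "p95-overhead-percent": ordered[rank - 1],
        "avg-absolute-bytes": statistics.fmean(added),
    }
```

With made-up inputs where every file gains the same 4,000 bytes, the percentage is far higher for a 100 KB file than for a 10 MB file, which is why the percentile row is dominated by small PDFs while the absolute-bytes row stays flat.
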
## Signing throughput

How fast LLMind can sign the semantic layer on commodity hardware. Measured in isolation (pure crypto; no OCR / parse in the hot path).

| Algorithm | Throughput | Notes |
| --- | --- | --- |
| hmac-sha256 | — | HMAC-SHA256 signing of the semantic layer (default algorithm). |
| ed25519 | — | ed25519 signing (optional; for public-key verification). |
| file-checksum-sha256 | — | SHA-256 file-content checksum (excludes XMP packet). |

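A rough picture of what "measured in isolation" means for the HMAC-SHA256 row: time only the signing call, with the payload already in memory. The sketch below uses Python's standard `hmac` and `hashlib` modules. The function name, payload size, key length, and iteration count are illustrative assumptions, and this is not LLMind's own signing code.

```python
import hashlib
import hmac
import os
import time


def hmac_sha256_throughput(payload: bytes, key: bytes,
                           iterations: int = 10_000) -> float:
    """Return HMAC-SHA256 signatures per second over an in-memory payload."""
    start = time.perf_counter()
    for _ in range(iterations):
        # Pure crypto in the timed loop: no file I/O, OCR, or parsing.
        hmac.new(key, payload, hashlib.sha256).digest()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


# Illustrative inputs: a 4 KB stand-in for a semantic layer and a random key.
sample_payload = os.urandom(4096)
sample_key = os.urandom(32)
signatures_per_second = hmac_sha256_throughput(sample_payload, sample_key,
                                               iterations=1_000)
```

The result depends heavily on CPU generation and interpreter overhead, so numbers from a sketch like this are only comparable across runs on the same machine.
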
## Read-time comparison

The core value proposition: reading the cached LRFS semantic layer from XMP is expected to be orders of magnitude faster than re-parsing the same PDF. Times are in milliseconds; lower is better.

| Operation | Time per file | Notes |
| --- | --- | --- |
| llmind-cached-read | — | Time for a consumer to parse the XMP packet and extract the layer. |
| llamaparse-reparse | — | Time to re-parse the PDF with LlamaParse (cloud API call; includes network). |
| docling-reparse | — | Time to re-parse with Docling (local). |
| textract-reparse | — | Time to re-parse with AWS Textract (cloud API call; includes network). |
| mistral-ocr-reparse | — | Time to re-parse with Mistral OCR (cloud API call). |
| speedup-factor-vs-llamaparse | — | Speedup from reading cached LRFS vs. re-running LlamaParse. |
| speedup-factor-vs-textract | — | Speedup from reading cached LRFS vs. re-running AWS Textract. |

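To make the cached-read and speedup rows concrete, the sketch below locates an XMP packet by its standard `<?xpacket begin=` and `<?xpacket end=` processing instructions, times the lookup, and computes a speedup ratio. It assumes the packet is stored uncompressed and that the first packet in the file is the relevant one. The helper names are hypothetical, and a real consumer would go on to parse the packet's RDF/XML according to the LRFS specification.

```python
import time
from typing import Callable, Optional

XPACKET_BEGIN = b"<?xpacket begin="
XPACKET_END = b"<?xpacket end="


def extract_xmp_packet(data: bytes) -> Optional[bytes]:
    """Return the first XMP packet found in the raw bytes, or None."""
    start = data.find(XPACKET_BEGIN)
    if start == -1:
        return None
    end_marker = data.find(XPACKET_END, start)
    if end_marker == -1:
        return None
    close = data.find(b"?>", end_marker)
    if close == -1:
        return None
    return data[start:close + 2]


def best_time_ms(fn: Callable, *args, repeats: int = 5) -> float:
    """Return the fastest of several timed calls, in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, (time.perf_counter() - t0) * 1000.0)
    return best


def speedup_factor(baseline_ms: float, cached_ms: float) -> float:
    """Ratio of re-parse time to cached-read time."""
    return baseline_ms / cached_ms
```

For example, `best_time_ms(extract_xmp_packet, pdf_bytes)` would give a cached-read figure to divide into a re-parse time measured the same way.
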
## What's not measured

Some adjacent questions are deliberately out of scope. [The methodology document](https://github.com/dmitryrollins/llmind-site/blob/main/docs/benchmarks/sprint-3-methodology.md) explains each exclusion in detail. In short:

-   **End-to-end RAG answer quality.** Too many variables (LLM, chunker, prompt) to attribute signal to LLMind alone.
-   **Vector-DB retrieval latency.** LLMind isn't a vector DB; a head-to-head is category-confused. See [comparisons](https://llmind.org/compare/).
-   **Concurrent-workload performance.** Sprint 3 is single-threaded reference measurement; multi-process is a Sprint 4+ investigation.

## Reproducibility

The corpus, the methodology, and the raw CSV are all committed to git. Third-party reviewers can rerun each step and publish alternative numbers. Results vary with hardware generation (especially for HMAC-SHA256 throughput), network conditions (for cloud-API baselines), and file selection.

The Sprint 3 CSV stays frozen as the reference point. Future re-runs land at `docs/benchmarks/sprint-4-data.csv`, etc.

[Source code on GitHub](https://github.com/dmitryrollins/LLMind) · [LRFS v1.0 specification](https://llmind.org/spec/lrfs-v1.0/) · [Learn about file enrichment](https://llmind.org/learn/)