Signed image metadata for the AI era

Published 2026-04-22 · 8 min read

Images routinely lose their metadata when they pass through a platform. Captions, location, author, editing history, AI-generated flags: most of it is gone by the time a consumer sees the photo. LLMind writes a signed, tamper-evident semantic layer into the image's own XMP packet. When a downstream consumer's AI tool opens the file, it can read a structured description, a signed origin and, crucially, cryptographic proof that the metadata hasn't been altered.

The problem: images lose meaning the moment they leave the camera

A photographer shoots an image with captions, location data, and creator metadata embedded in the XMP packet. Once that image passes through a social media platform, a content delivery network, or even a simple email forward, that metadata is often stripped. The consumer who downloads the image sees only pixels: no caption, no location, no indication of edits, no AI-generated flag. If that image is then uploaded to ChatGPT or another AI tool, the tool has to caption it from scratch, spending compute and context window on information the file once carried.

Even when photographers embed Content Credentials (C2PA manifests) to sign origin and editing history, downstream tools often strip or ignore them. The bigger problem is that AI tools analyzing an image generally don't read its existing metadata; they treat the image as opaque pixels. The missing piece is a signed, portable metadata layer that survives platform stripping (where possible) and that AI tools can read directly instead of re-analyzing the pixels. Metadata that says “this image shows X, it was edited by Y, and I've cryptographically vouched for that.”

How LLMind signs semantic and provenance data in XMP

LLMind writes structured data into the file's XMP packet under the https://llmind.org/ns/1.0/ namespace. The payload includes:

Description — roughly 100 words of scene understanding (what objects, people, text, and context are visible)
Entities — extracted names, places, organizations, and key nouns the image contains
Transcription — if the image has burned-in text, captions, or overlaid content, that text is extracted and stored
Structural summary — composition, layout, notable visual features
Editing history (optional) — whether the image has been resized, compressed, or edited
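
To make the storage location concrete, here is a minimal Python sketch that checks whether a file's XMP packet mentions the LLMind namespace. It is not the LLMind reader. It does a naive byte scan for the x:xmpmeta wrapper that XMP packets commonly use, ignores Extended XMP split across JPEG segments, and does not parse or verify any fields. The function names are hypothetical; only the namespace URI comes from the text above.

```python
from typing import Optional

# Namespace URI quoted in the article; everything else here is illustrative.
LLMIND_NS = b"https://llmind.org/ns/1.0/"


def extract_xmp_packet(data: bytes) -> Optional[bytes]:
    """Return the first <x:xmpmeta>...</x:xmpmeta> span in raw file bytes, or None."""
    start = data.find(b"<x:xmpmeta")
    if start == -1:
        return None
    end_tag = b"</x:xmpmeta>"
    end = data.find(end_tag, start)
    if end == -1:
        return None
    return data[start:end + len(end_tag)]


def has_llmind_layer(data: bytes) -> bool:
    """True if the file's XMP packet mentions the LLMind namespace URI."""
    packet = extract_xmp_packet(data)
    return packet is not None and LLMIND_NS in packet
```

Reading a file with open(path, "rb").read() and passing the bytes to has_llmind_layer is enough to tell whether a semantic layer is present at all. Whether that layer can be trusted is a separate question that only signature verification answers.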

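Every layer is cryptographically signed. LLMind uses HMAC-SHA256 for layer-by-layer signatures or Ed25519 for public-key signing. The practical difference is who can verify: an HMAC check requires the same shared secret that produced the signature, while an Ed25519 signature can be checked by anyone holding the signer's public key. The entire file also carries a SHA-256 checksum. If anyone tampers with any part of the metadata (changes a caption, strips the entities, modifies the description), verification fails. The file can still be used, but the failed check signals that the semantic layer has been altered.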
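
The tamper-evidence idea can be sketched with Python's standard library. This is an illustration of HMAC-SHA256 over a single layer, not LLMind's actual wire format: the canonical JSON serialization, the field names, and the demo key are all assumptions made for the example.

```python
import hashlib
import hmac
import json


def canonical_bytes(layer: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash identical bytes."""
    text = json.dumps(layer, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sign_layer(layer: dict, key: bytes) -> str:
    """Return the hex HMAC-SHA256 tag for one metadata layer."""
    return hmac.new(key, canonical_bytes(layer), hashlib.sha256).hexdigest()


def verify_layer(layer: dict, signature: str, key: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_layer(layer, key), signature)


key = b"demo-shared-secret"  # hypothetical key, for illustration only
layer = {
    "description": "A red bicycle leaning against a brick wall.",
    "entities": ["bicycle", "brick wall"],
}
sig = sign_layer(layer, key)

print(verify_layer(layer, sig, key))     # True
tampered = dict(layer, description="A blue car.")
print(verify_layer(tampered, sig, key))  # False
```

Changing a single field changes the serialized bytes, so the recomputed tag no longer matches the stored one. The tampered layer is still readable; it just no longer verifies.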

This works alongside C2PA Content Credentials, and the two don't conflict. A C2PA manifest is normally embedded in its own JUMBF container, with XMP at most carrying a pointer to it, while LLMind writes to its own namespace in the XMP packet. You can ship an image with Content Credentials signing the origin (who shot it, who edited it, editing history) and LLMind signing the semantic meaning (what the image shows, transcription, entities). A downstream AI tool or archival system can verify both layers independently.

Working code

The LLMind CLI handles this. Install it, then enrich an image once:

pipx install 'llmind-cli[all]'

# Enrich an image with signed semantic metadata
llmind enrich --sign photo.jpg

# The command writes the semantic layer directly into the image file.
# The image is still a normal JPEG; the metadata is embedded in XMP.

Any downstream system that can run the CLI (a custom vision pipeline, a file archival system, or a wrapper in front of an AI tool such as Claude or ChatGPT) can verify the semantic layer:

# Verify the signature on a received image
llmind verify photo.jpg

# Output: ✓ Signature valid. Semantic layer untouched.
# Or: ✗ Signature invalid. Metadata has been altered.

If verification fails, the semantic layer is still readable — but the tool knows not to trust it. This matters for workflows where authenticity is important: archival systems, newsroom asset management, or any AI pipeline that needs to know whether the metadata it's reading came from the original source or has been tampered with downstream.
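
A pipeline can gate on that result before using the embedded description. The sketch below wraps the verify command shown above and fails closed. It assumes the CLI exits with status 0 when the signature is valid and non-zero otherwise, which is a common convention but is not documented in this article, so check the actual behavior before relying on it. The function names are hypothetical.

```python
import subprocess
import sys
from typing import Sequence


def is_verified(path: str, command: Sequence[str] = ("llmind", "verify")) -> bool:
    """Run the verify command on `path` and report whether it succeeded.

    ASSUMPTION: exit status 0 means "signature valid". Any other status,
    or a missing executable, is treated as unverified (fail closed).
    """
    try:
        result = subprocess.run(
            [*command, path], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return False  # verify command not installed
    return result.returncode == 0


def trust_label(path: str, command: Sequence[str] = ("llmind", "verify")) -> str:
    """Label embedded metadata so later pipeline stages know how to treat it."""
    return "trusted" if is_verified(path, command) else "untrusted"


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "photo.jpg"
    print(target, trust_label(target))
```

An archival or newsroom pipeline could store the label next to the asset and fall back to re-captioning only when the label is "untrusted".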

Pair with Content Credentials for full coverage

If your workflow needs both provenance (who made it, how has it changed) and AI-readable semantics (what it shows), run both. Content Credentials signs origin and editing history; LLMind signs the semantic layer. The two systems are complementary. A news organization can publish photos with Content Credentials for origin metadata (so consumers know the photo came from a trusted photographer and hasn't been heavily edited) and LLMind for semantic metadata (so newsroom AI tools can caption, tag, and search the image without re-parsing).

For broadcast or publication pipelines, this pair gives you end-to-end coverage: trust in origin, and AI-readable semantics that travel with the file. See how LLMind and Content Credentials differ for a detailed breakdown of what each signs and when to use them.
