LRFS payload format

LRFS v1.0 · Published 2026-04-22

This chapter specifies how an LLM-Ready File Specification (LRFS) payload is structured, serialized, and canonicalized. It is normative — implementers conforming to LRFS v1 MUST follow the rules in this chapter.

1. Host packet

The LRFS payload is carried inside an XMP packet per ISO 16684-1. The packet must be placed in a location specific to each host file format:

Implementers MUST reference the XMP specification (ISO 16684-1) and the format-specific appendices for precise byte-level placement rules. This chapter does not duplicate XMP packet serialization — it assumes the packet is correctly embedded per the standard.

2. Namespace binding

The LRFS namespace URI is https://llmind.org/ns/1.0/. Implementers MUST bind the prefix llmind to this URI in the XMP RDF document. Example:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:llmind="https://llmind.org/ns/1.0/">
  <!-- LRFS properties follow -->
</rdf:RDF>

Other namespace prefixes (e.g., dc for Dublin Core, xmp for XMP, xmpRights) MAY coexist in the same RDF document.
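
As an illustration of the binding rule, the following is a minimal sketch (not part of the normative text) of how a reader might check that the prefix llmind is bound to the LRFS namespace URI. It uses only the Python standard library; the function name llmind_prefix_is_bound and the sample input are assumptions made for this example.

```python
import io
import xml.etree.ElementTree as ET

LRFS_NS = "https://llmind.org/ns/1.0/"

def llmind_prefix_is_bound(rdf_xml: bytes) -> bool:
    """Return True if the first 'llmind' prefix declaration maps to LRFS_NS."""
    # "start-ns" events yield (prefix, uri) pairs for each xmlns declaration.
    for _event, (prefix, uri) in ET.iterparse(io.BytesIO(rdf_xml), events=("start-ns",)):
        if prefix == "llmind":
            return uri == LRFS_NS
    return False

sample = b"""<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:llmind="https://llmind.org/ns/1.0/">
</rdf:RDF>"""

print(llmind_prefix_is_bound(sample))  # True
```

Other prefixes such as dc or xmp are simply skipped by the loop, so they can coexist in the same document.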

3. Layer model

An LRFS payload contains zero or more named layers. A layer is an RDF property whose subject is the file resource (typically the unnamed root resource in XMP). The following layers are defined in LRFS v1.0:

Each layer is optional. An LRFS payload with zero layers is valid but conveys no semantic information. Implementations MUST gracefully handle missing layers.
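
One way to handle missing layers gracefully is to collect whatever llmind:* properties are present and treat absence as an ordinary case. The sketch below assumes layers appear as child elements of an rdf:Description element; the layer name exampleLayer and the function collect_layers are hypothetical and are not defined by this chapter.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
LRFS_NS = "https://llmind.org/ns/1.0/"

def collect_layers(rdf_xml: str) -> dict:
    """Map each llmind:* property name on an rdf:Description to its element."""
    root = ET.fromstring(rdf_xml)
    prefix = f"{{{LRFS_NS}}}"  # ElementTree writes qualified tags as {uri}local
    layers = {}
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        for prop in description:
            if prop.tag.startswith(prefix):
                layers[prop.tag[len(prefix):]] = prop
    return layers

with_layer = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:llmind="{LRFS_NS}">
  <rdf:Description rdf:about="">
    <llmind:exampleLayer>hypothetical value</llmind:exampleLayer>
  </rdf:Description>
</rdf:RDF>"""
empty = f'<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:llmind="{LRFS_NS}"></rdf:RDF>'

print(sorted(collect_layers(with_layer)))          # ['exampleLayer']
print(collect_layers(empty))                       # {}
print(collect_layers(empty).get("exampleLayer"))   # None
```

A payload with zero layers yields an empty mapping rather than an error, which matches the rule that such a payload is valid.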

4. Canonicalization for signing

Before signing, each layer MUST be canonicalized so that signatures can be verified reproducibly. The canonicalization algorithm is the RDF canonicalization algorithm defined by W3C RDF Dataset Canonicalization. RFC 8785 (JCS) is NOT used; LRFS uses RDF-specific canonicalization so that semantically equivalent RDF/XML serializations produce the same canonical form.

The process:

  1. Extract the RDF triples for the layer from the XMP RDF graph.
  2. Apply RDF Dataset Canonicalization (https://www.w3.org/TR/rdf-canon/) to these triples.
  3. Serialize the canonical triples as N-Quads with UTF-8 encoding and \n line endings.
  4. The resulting byte string is the canonical form for that layer.

Critical: Implementers MUST NOT sign the serialized RDF/XML representation directly. Signing must use the canonical N-Quads form. RDF/XML can serialize the same triples in different textual forms, leading to different byte strings and signature mismatches.
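
The sketch below illustrates why the canonical form, not the RDF/XML text, is the signing input. It is deliberately simplified: it covers only triples without blank nodes, for which the canonical output reduces to de-duplicated N-Quads lines sorted in code point order. It also assumes each term is already written in canonical N-Quads syntax. Blank nodes require the full RDF Dataset Canonicalization algorithm, which is not shown. The function name, the subject IRI, and the property names are hypothetical.

```python
import hashlib

def canonical_nquads(triples) -> bytes:
    """Canonical bytes for triples that contain no blank nodes.

    Each triple is (subject, predicate, object), with every term already
    written in N-Quads syntax, e.g. '<urn:example:file>' or '"0.5"'.
    """
    lines = set()  # a set drops duplicate statements
    for s, p, o in triples:
        if s.startswith("_:") or o.startswith("_:"):
            raise ValueError("blank nodes need the full canonicalization algorithm")
        lines.add(f"{s} {p} {o} .\n")
    # Python compares str values by code point, which gives the required order.
    return "".join(sorted(lines)).encode("utf-8")

a = [
    ("<urn:example:file>", "<https://llmind.org/ns/1.0/exampleLayer>", '"first"'),
    ("<urn:example:file>", "<https://llmind.org/ns/1.0/otherLayer>", '"second"'),
]
b = list(reversed(a))  # same triples, different serialization order

digest_a = hashlib.sha256(canonical_nquads(a)).hexdigest()
digest_b = hashlib.sha256(canonical_nquads(b)).hexdigest()
print(digest_a == digest_b)  # True
```

The SHA-256 digest here only stands in for whatever signature scheme is applied to the canonical bytes; this chapter does not name one.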

5. Numeric precision

Floating-point values in confidence scores, temporal offsets, and similar numeric properties MUST be represented with at most 6 decimal places in the canonical N-Quads form. Rounding to more or fewer digits produces a different canonical byte string, and signature verification fails. Round using round-to-nearest-even: for example, 0.123456789 becomes 0.123457.
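
A minimal sketch of the rounding rule follows, using Python's decimal module so that ties are resolved on the decimal digits rather than on a binary floating-point approximation. The function name round_for_canonical_form is hypothetical. How trailing zeros are written in the canonical form is not stated in this section, so the sketch returns a Decimal and leaves final formatting to the caller.

```python
from decimal import Decimal, ROUND_HALF_EVEN

SIX_PLACES = Decimal("0.000001")

def round_for_canonical_form(value: str) -> Decimal:
    """Round a decimal string to 6 places using round-to-nearest-even."""
    return Decimal(value).quantize(SIX_PLACES, rounding=ROUND_HALF_EVEN)

print(round_for_canonical_form("0.123456789"))  # 0.123457
print(round_for_canonical_form("0.1234565"))    # 0.123456 (tie, 6 is even)
print(round_for_canonical_form("0.1234575"))    # 0.123458 (tie, 7 rounds up to even 8)
```

Passing the value as a string avoids surprises from floats such as 0.1234565, whose binary representation is not exactly halfway between two candidates.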

6. Character encoding

UTF-8 encoding is mandatory throughout. XMP packets specify their own encoding in the XML declaration. LRFS requires encoding="UTF-8"; implementations MUST reject payloads that declare or use any other encoding (e.g., UTF-16, ISO-8859-1).
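
The check can be sketched as follows. This is an illustration under stated assumptions: a payload with no XML declaration is treated as UTF-8, and the declared encoding name is compared case-insensitively. Neither point is specified in this section. The function name is_utf8_payload is hypothetical.

```python
import re

# Matches an XML declaration at the start of the packet and captures its encoding.
_DECLARATION = re.compile(rb'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')

def is_utf8_payload(packet: bytes) -> bool:
    """Return False if the packet declares or uses an encoding other than UTF-8."""
    if packet.startswith((b"\xff\xfe", b"\xfe\xff")):  # UTF-16 byte order marks
        return False
    # Skip an optional UTF-8 byte order mark before looking for the declaration.
    body = packet[3:] if packet.startswith(b"\xef\xbb\xbf") else packet
    match = _DECLARATION.match(body)
    if match and match.group(1).upper() != b"UTF-8":
        return False
    try:
        body.decode("utf-8")  # the bytes themselves must be valid UTF-8
    except UnicodeDecodeError:
        return False
    return True

print(is_utf8_payload(b'<?xml version="1.0" encoding="UTF-8"?><a/>'))       # True
print(is_utf8_payload(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))  # False
```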

7. Version compatibility

Within the v1.x version family, new optional layers MAY be added in minor versions (e.g., v1.1, v1.2). Readers that implement an older v1.x version MUST ignore unknown llmind:* properties and MUST NOT fail validation because of them.

Any change that invalidates existing payloads or breaks existing signatures requires a new major version with a new namespace URI (e.g., https://llmind.org/ns/2.0/). The v1.0 namespace will remain stable forever and will never be reused.
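
The two rules above can be sketched as a reader that ignores unknown properties and derives the major version from the namespace URI. The URI pattern is an assumption generalized from the two URIs quoted in this section. The set KNOWN_LAYERS, the layer names, and both function names are hypothetical.

```python
import re

# Assumed pattern, generalized from https://llmind.org/ns/1.0/ and .../ns/2.0/
NAMESPACE_PATTERN = re.compile(r"^https://llmind\.org/ns/(\d+)\.(\d+)/$")

KNOWN_LAYERS = {"exampleLayer"}  # hypothetical: the layers this reader understands

def major_version(namespace_uri: str):
    """Return the major version encoded in an LRFS namespace URI, or None."""
    match = NAMESPACE_PATTERN.match(namespace_uri)
    return int(match.group(1)) if match else None

def split_known_unknown(property_names):
    """Separate recognized layers from unknown ones, which are ignored, not rejected."""
    known = [name for name in property_names if name in KNOWN_LAYERS]
    ignored = [name for name in property_names if name not in KNOWN_LAYERS]
    return known, ignored

print(major_version("https://llmind.org/ns/1.0/"))           # 1
print(split_known_unknown(["exampleLayer", "futureLayer"]))  # (['exampleLayer'], ['futureLayer'])
```

A property such as the hypothetical futureLayer ends up in the ignored list and does not cause validation to fail.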
