LRFS conformance test vectors

Public test fixtures for LRFS v1.0 conformance: JSON files that third-party LRFS readers, writers, and verifiers can validate against.

Every LRFS-conformant implementation — whether written in Python, Rust, Go, Node, or another language — should pass these test vectors. Each vector is a self-contained JSON file with inputs and expected fields. Inputs are what the implementation receives; expected values are what the implementation must produce.

The reference generator (scripts/generate-spec-vectors.ts in this repo) uses Node's built-in crypto module to compute the expected HMAC-SHA256, ed25519, and SHA-256 values. The same inputs should produce byte-identical output in any conformant language implementation.

Summary

  • Specification: LRFS v1.0
  • Namespace: https://llmind.org/ns/1.0/
  • Total vectors: 15
  • Last generated:
  • Index JSON: /spec/test-vectors/index.json (machine-readable list of all vectors)

Canonicalization

Reference: LRFS v1.0 §3.3

  • 001-single-text-layer Single layer containing the ASCII string "hello world". The canonical input is layer-name + 0x00 + value + 0x00.
  • 002-multiple-layers-alphabetical Three layers serialized in alphabetical order by layer name. Verifies that writers sort consistently regardless of insertion order.
  • 003-json-structure-layer Layer containing JSON-as-string. The JSON itself must be serialized with sorted keys and no whitespace before canonicalization (LRFS v1.0 §3.3 rule 1).
  • 004-utf8-multibyte Layers containing multibyte UTF-8 content (French accents, Japanese). Verifies that canonicalization is byte-based (not code-point-based) and handles all Unicode consistently.
  • 005-signature-layer-excluded The llmind:signature layer MUST be excluded from the canonical input when computing a signature (LRFS v1.0 §3.3 rule 2). This vector verifies writers correctly drop it during canonicalization.
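The rules exercised by these five vectors can be sketched in a few lines of TypeScript. This is a minimal illustration, not the reference implementation: the function names canonicalize and stableStringify are hypothetical, the layers are assumed to arrive as a plain name-to-string map, and "alphabetical" is assumed to mean byte-wise ordering of the UTF-8 layer names. Check LRFS v1.0 §3.3 for the normative wording.

```typescript
import { Buffer } from "node:buffer";

const SIGNATURE_LAYER = "llmind:signature";

// Serialize JSON with recursively sorted object keys and no whitespace.
// Intended for plain JSON data (no undefined values, functions, or cycles).
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// Emit name + 0x00 + value + 0x00 for every layer except the signature layer.
function canonicalize(layers: Record<string, string>): Buffer {
  const NUL = Buffer.from([0x00]);
  const names = Object.keys(layers)
    .filter((name) => name !== SIGNATURE_LAYER)
    // Assumption: sort by the UTF-8 bytes of the name, not by locale rules.
    .sort((a, b) => Buffer.compare(Buffer.from(a, "utf8"), Buffer.from(b, "utf8")));
  const parts: Buffer[] = [];
  for (const name of names) {
    parts.push(Buffer.from(name, "utf8"), NUL, Buffer.from(layers[name], "utf8"), NUL);
  }
  return Buffer.concat(parts);
}
```

A JSON-valued layer would be passed through stableStringify first, and the resulting string stored as the layer value before calling canonicalize.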

HMAC-SHA256 signing

Reference: LRFS v1.0 §4.1

  • 001-basic HMAC-SHA256 signature over the canonical input of a single-layer payload. Fixed ASCII key; reproducible output.
  • 002-multi-layer HMAC-SHA256 signature over a three-layer payload after canonicalization. Verifies that the signature is stable across writer implementations as long as canonicalization is correct.
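As a rough sketch of what these two vectors exercise, the snippet below signs and verifies a canonical byte string with Node's built-in crypto module, the same module the reference generator uses. The function names are hypothetical, and how the key and signature are encoded inside the fixture files is not shown here; consult the vector JSON for that.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";
import { Buffer } from "node:buffer";

// Compute the raw 32-byte HMAC-SHA256 tag over the canonical input.
function signHmacSha256(key: Buffer, canonical: Buffer): Buffer {
  return createHmac("sha256", key).update(canonical).digest();
}

// Recompute the tag and compare in constant time.
// The length check comes first because timingSafeEqual throws on unequal lengths.
function verifyHmacSha256(key: Buffer, canonical: Buffer, signature: Buffer): boolean {
  const expected = signHmacSha256(key, canonical);
  return signature.length === expected.length && timingSafeEqual(signature, expected);
}
```

Because HMAC is deterministic, two writers that agree on the canonical bytes and the key will always agree on the tag, which is why a mismatch on 002-multi-layer usually points at canonicalization rather than at the HMAC call.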

ed25519 signing

Reference: LRFS v1.0 §4.2

  • 001-basic ed25519 signature over the canonical input of a single-layer payload. Uses RFC 8032 test vector #1 secret key for reproducibility; any Ed25519 implementation should produce the same signature.
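One way to reproduce this vector with Node's crypto module is sketched below. Node does not accept a raw 32-byte Ed25519 seed directly, so the sketch wraps it in the fixed PKCS#8 DER header first. The helper names are hypothetical; the seed shown is the secret key from RFC 8032 §7.1 TEST 1, and the message to sign should come from the vector's canonical input.

```typescript
import { createPrivateKey, createPublicKey, sign, verify, KeyObject } from "node:crypto";
import { Buffer } from "node:buffer";

// Fixed PKCS#8 DER header for an Ed25519 private key; the 32-byte seed follows it.
const PKCS8_ED25519_PREFIX = Buffer.from("302e020100300506032b657004220420", "hex");

// RFC 8032 §7.1 TEST 1 secret key (seed).
const RFC8032_TEST1_SEED = "9d61b19deffd5a60ba844af492ec2cc44449c5697b326919703bac031cae7f60";

function privateKeyFromSeed(seedHex: string): KeyObject {
  const seed = Buffer.from(seedHex, "hex");
  if (seed.length !== 32) throw new Error("Ed25519 seed must be 32 bytes");
  return createPrivateKey({
    key: Buffer.concat([PKCS8_ED25519_PREFIX, seed]),
    format: "der",
    type: "pkcs8",
  });
}

// Ed25519 hashes internally, so the digest algorithm argument must be null.
function signEd25519(privateKey: KeyObject, canonical: Buffer): Buffer {
  return sign(null, canonical, privateKey);
}

// Derive the public key from the private key and check the 64-byte signature.
function verifyEd25519(privateKey: KeyObject, canonical: Buffer, signature: Buffer): boolean {
  return verify(null, canonical, createPublicKey(privateKey), signature);
}
```

Ed25519 signing is deterministic, so signEd25519(privateKeyFromSeed(RFC8032_TEST1_SEED), canonical) should match the vector's expected signature whenever the canonical bytes match.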

File checksum

Reference: LRFS v1.0 §4.3

  • 001-pdf-prefix SHA-256 of a minimal PDF file prefix (10 bytes). Demonstrates how the file-checksum layer is computed over the file bytes excluding the XMP packet.
  • 002-nist-abc SHA-256 of the 3-byte ASCII string "abc". Standard NIST FIPS 180-4 test vector; any conformant SHA-256 implementation produces the expected hex digest.
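The checksum vectors can be approximated with the sketch below. sha256Hex covers 002-nist-abc directly. checksumExcludingRange is a hypothetical helper that assumes the caller already knows the XMP packet's byte range within the file; how that range is located, and exactly which bytes LRFS v1.0 §4.3 excludes, is not specified here.

```typescript
import { createHash } from "node:crypto";
import { Buffer } from "node:buffer";

// Lowercase hex SHA-256 digest of a byte string.
function sha256Hex(bytes: Buffer): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Hypothetical helper: hash the file as if bytes [start, end) were removed.
function checksumExcludingRange(file: Buffer, start: number, end: number): string {
  if (start < 0 || end < start || end > file.length) {
    throw new RangeError("excluded range must lie within the file");
  }
  return createHash("sha256")
    .update(file.subarray(0, start))
    .update(file.subarray(end))
    .digest("hex");
}

console.log(sha256Hex(Buffer.from("abc", "ascii")));
// ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```

Feeding the two surviving slices into a single hash avoids copying the file into a new buffer without the packet.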

Failure modes

Reference: LRFS v1.0 §5 and §7

  • 001-tampered-signature Signature bytes do not match the canonical input under the declared key and algorithm. Any conformant verifier MUST reject this payload.
  • 002-malformed-xmp XMP packet is syntactically invalid XML — truncated before the closing RDF element. Conformant readers MUST surface a structured parse error, not crash or silently return partial data.
  • 003-signature-algorithm-mismatch Declared signing algorithm is incompatible with the provided signature byte length. Conformant readers MUST reject the payload before attempting verification.
  • 004-empty-payload XMP packet is well-formed and contains the LRFS namespace, but declares zero layers and no signature. This is distinct from a file with no LRFS metadata at all. Conformant readers MUST treat this as a file with no semantic layer (not a parse error, not a verification failure) and return an empty layer set without crashing.
  • 005-unknown-namespace-version XMP uses a future-version namespace (v2.0) unknown to a v1.0 reader. Per LRFS v1.0 §7 version compatibility, v1.0 readers MUST ignore unknown major-version namespaces — not crash, not misinterpret the layers as v1.0, not return an error.
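Two of these cases (003 and 005) are cheap checks a reader can run before any cryptography. The sketch below shows one possible shape for them. The algorithm identifiers, the result type, and the assumption that a future namespace follows the same URI pattern as https://llmind.org/ns/1.0/ are all illustrative, not taken from the specification. The byte lengths are properties of the algorithms themselves: an HMAC-SHA256 tag is 32 bytes and an Ed25519 signature is 64 bytes.

```typescript
// Hypothetical algorithm identifiers mapped to their fixed signature lengths.
const SIGNATURE_LENGTH = new Map<string, number>([
  ["hmac-sha256", 32],
  ["ed25519", 64],
]);

type PreCheck =
  | { ok: true }
  | { ok: false; code: "unknown-algorithm" | "algorithm-mismatch"; message: string };

// Reject a signature whose length cannot belong to the declared algorithm.
function checkSignatureLength(algorithm: string, signature: Uint8Array): PreCheck {
  const expected = SIGNATURE_LENGTH.get(algorithm);
  if (expected === undefined) {
    return { ok: false, code: "unknown-algorithm", message: `unsupported algorithm: ${algorithm}` };
  }
  if (signature.length !== expected) {
    return {
      ok: false,
      code: "algorithm-mismatch",
      message: `${algorithm} expects ${expected} bytes, got ${signature.length}`,
    };
  }
  return { ok: true };
}

// Extract the major version from a namespace URI, or null if it does not match.
// Assumption: future namespaces keep the https://llmind.org/ns/<major>.<minor>/ shape.
function namespaceMajor(ns: string): number | null {
  const m = /^https:\/\/llmind\.org\/ns\/(\d+)\.\d+\/$/.exec(ns);
  return m ? Number(m[1]) : null;
}

// A v1.0 reader only interprets layers from a major-version-1 namespace.
function shouldReadAsV1(ns: string): boolean {
  return namespaceMajor(ns) === 1;
}
```

Returning a tagged result instead of throwing keeps the "reject", "ignore", and "empty layer set" outcomes distinguishable to callers, which is the distinction vectors 002 through 005 are probing.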

How to use these vectors

A conformance test for a new LRFS implementation typically looks like:

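// Sketch for Node/TypeScript; adapt to your language
import assert from 'node:assert/strict';
import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';
import { canonicalize } from './my-implementation'; // placeholder for the code under test

const dir = '/spec/test-vectors/canonicalization/'; // adjust to your local copy
for (const name of readdirSync(dir).filter((f) => f.endsWith('.json'))) {
  const v = JSON.parse(readFileSync(join(dir, name), 'utf8'));
  const canonical = Buffer.from(canonicalize(v.inputs.layers));
  const expected = Buffer.from(v.expected.canonical_bytes_hex, 'hex');
  assert.deepEqual(canonical, expected,
    `Vector ${v.id} failed: canonicalization mismatch`);
}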

The index.json file lists every vector with its category, description, and spec section. An implementation's CI can walk the index, load each JSON fixture, and run the corresponding assertion.
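An index-driven runner along those lines might look like the sketch below. The entry fields (id, category, path) and the idea of one runner function per category are assumptions made for illustration; the actual schema of index.json should be read from the file itself. The loader is injected so the same function works against the filesystem or an HTTP fetch.

```typescript
// Assumed shape of one index entry; the real index.json may use other field names.
interface IndexEntry {
  id: string;
  category: string;
  path: string;
}

// A runner receives the parsed vector and throws if the assertion fails.
type Runner = (vector: unknown) => void;

interface Result {
  id: string;
  status: "pass" | "fail" | "skipped";
  detail?: string;
}

// Run every vector through the runner registered for its category.
function runIndex(
  entries: IndexEntry[],
  runners: Record<string, Runner>,
  load: (path: string) => unknown,
): Result[] {
  return entries.map((entry): Result => {
    const runner = runners[entry.category];
    if (!runner) {
      return { id: entry.id, status: "skipped", detail: `no runner for ${entry.category}` };
    }
    try {
      runner(load(entry.path));
      return { id: entry.id, status: "pass" };
    } catch (err) {
      return { id: entry.id, status: "fail", detail: err instanceof Error ? err.message : String(err) };
    }
  });
}
```

In CI, load could be something like (p) => JSON.parse(readFileSync(p, "utf8")), and the job would fail if any result has status "fail". Reporting skipped categories separately makes it visible which conformance level the run actually covers.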

Claiming conformance

LRFS v1.0 defines three conformance levels (see the conformance chapter): L1 reader, L2 writer, L3 full. An implementation claims a level by passing every vector in the relevant category:

  • L1 reader: must parse each canonicalization vector and verify each signing and checksum vector (HMAC-SHA256, ed25519, file-checksum).
  • L2 writer: must produce byte-identical canonical output for each canonicalization vector and produce identical signatures and checksums for each HMAC-SHA256, ed25519, and file-checksum vector given the stated inputs.
  • L3 full: must pass every vector as both reader and writer, and additionally round-trip LRFS packets across the file-format conversions the implementation supports.

When publishing an LRFS implementation, include a CONFORMANCE.md file that states which level the implementation claims and links to its CI run against this vector set.