ChatGPT with documents — and every other AI tool in your workflow

Published 2026-04-22 · 8 min read

Your research workflow spans a dozen AI tools. ChatGPT for quick lookups. Claude for longer reasoning. NotebookLM for source-grounded research. Perplexity for citation-heavy search. Each tool wants you to upload your files again, and each tool rebuilds its own representation from scratch. LLMind writes a signed semantic layer into each file, so every AI tool you hand the file to receives the same layer: ChatGPT with documents, Claude with documents, every tool you use.

The multi-tool research workflow

A typical researcher keeps files scattered across folders. A folder of research papers. A folder of market reports. A folder of meeting notes. A folder of interview transcripts (audio, sometimes just notes). Another folder of competitor analysis docs. They bounce between different AI tools depending on the task.

ChatGPT for quick factual lookups. Claude for long-form analysis and reasoning across multiple sources. NotebookLM for building a source-grounded research notebook. Perplexity for citation-heavy search and fact-checking. Sometimes Cursor for diving into technical documentation. Sometimes a custom Python script that feeds documents to an API. Each tool is better at different things. Each tool wants the files.

The inefficiency: each tool wants you to upload the same files separately. ChatGPT doesn’t remember your uploads from last week. Claude has a different session each time. NotebookLM creates its own internal representation of your corpus. Perplexity indexes them in its own way. Every time you switch tools, you re-upload. Every tool rebuilds the same understanding: what does this paper say? What are its key claims? What entities and sources does it mention? This work is expensive and quota-intensive, and it is repeated in every system.

What gets lost each time you re-upload

Each AI tool builds an internal representation of your files. Some tools extract summaries, key entities, and section breakdowns. Better tools cross-reference documents and build relationship graphs. All of this understanding lives inside that tool’s session or index. It’s powerful and specific to that tool’s model and algorithms.

But it’s ephemeral. Close the ChatGPT conversation and it’s gone. Start a new NotebookLM project next week with the same papers and NotebookLM rebuilds everything from scratch. Your prompts can carry some context across sessions (a good prompt can summarize your papers for a new tool), but the computational work is still redundant. You’re spending Claude quota to have it re-read papers you fed it a week ago.

The ideal case: the understanding of what your documents contain (structure, key points, entities, summaries, transcriptions) is stored once, in a standard format, inside the files themselves. Every AI tool you use reads the same layer. You switch tools without rebuilding that understanding, and every tool gets the same structured foundation. The tool can choose to re-analyze (ChatGPT might do that anyway), but it has the metadata available.

LLMind-enriched corpus for ChatGPT, Claude, NotebookLM, Perplexity

Enrich a directory of research files once with llmind enrich ~/research/. Every file now carries a signed semantic layer inside its XMP packet: an AI-generated description of the paper’s content, extracted entities (authors, organizations, key concepts), a structural summary (sections, key claims), and a transcription (for audio files, or OCR text for scanned documents).
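To see what "inside its XMP packet" means in practice, here is a minimal Python sketch that scans a file's raw bytes for the standard XMP packet wrapper and reads one field from it. It is not LLMind's own reader. The function names are hypothetical, the article does not specify which XMP properties LLMind writes, and dc:description (a standard Dublin Core property) is only an assumed place where a description might live. The sketch also assumes the packet is stored uncompressed and UTF-8 encoded.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

# Standard XMP packet wrapper: <?xpacket begin=...?> body <?xpacket end=...?>
XPACKET_RE = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)

# Namespaces for RDF and Dublin Core, as used in XMP.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_xmp(data: bytes) -> Optional[str]:
    """Return the body of the first XMP packet in raw file bytes, or None."""
    match = XPACKET_RE.search(data)
    if match is None:
        return None
    # Assumption: the packet is UTF-8; undecodable bytes are replaced.
    return match.group(1).decode("utf-8", errors="replace").strip()


def read_description(xmp: str) -> Optional[str]:
    """Return the first dc:description entry, if the packet has one.

    Assumption: which properties LLMind actually writes is not known here;
    dc:description is used purely for illustration.
    """
    root = ET.fromstring(xmp)
    node = root.find(".//dc:description/rdf:Alt/rdf:li", NS)
    return node.text if node is not None else None
```

Usage would look like `xmp = extract_xmp(open(path, "rb").read())`, followed by `read_description(xmp)` when a packet is found. A byte scan like this only finds packets that are stored uncompressed, so a full XMP library is the safer choice for real work.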

Drop the enriched files into any AI tool. Which tools actively read the XMP layer today? Honest answer: it varies. Some tools are starting to parse XMP for metadata. Others re-analyze completely. But every tool gets the file plus its embedded layer. As tools increasingly read metadata natively, the compounding benefit grows. You’re not betting on a single vendor; you’re hedging your entire research corpus with a portable, standard metadata layer.

For researchers who work with PDFs, markdown files, audio recordings, and scanned documents, the enrichment happens once. The semantic layer (description, entities, transcription) is computed once and then available to every downstream tool. As you reuse the same files across tools and projects over weeks and months, the amortized cost per use approaches zero.

Workflow: llmind enrich ~/research/ and point your AI tool at it

The pattern is simple:

# Enrich your research corpus once
llmind enrich ~/research/

# Enrich audio files (transcription is embedded)
llmind enrich ~/interviews/ --transcribe

# Enrich scanned PDFs (OCR text is embedded)
llmind enrich ~/scans/ --ocr

# Now drop any file into ChatGPT, Claude, NotebookLM, or Perplexity
# — the semantic layer travels with the file

# Or point Cursor at the folder for technical research:
# Settings → Add custom docs → ~/research/
# Whether Cursor uses the embedded metadata depends on its XMP support

The enrichment is one-time. After that, your research files are metadata-rich and portable. You can reference them across sessions, tools, and workflows. A paper you enriched six months ago is still enriched when you pull it into a new research project. The semantic layer doesn’t expire; it travels with the file.

When you still benefit even if tools don’t parse XMP

Not every AI tool natively parses XMP today. ChatGPT, Claude, and NotebookLM re-analyze the files they receive. The metadata is still there, and you can use it right away by including the semantic summary in your prompts: “This paper (metadata: describes X, key entities Y, claims Z) is attached. Analyze it in context of...” The larger point is that the metadata is already in place when tools add native support. You’re not locked into a single vendor or a proprietary indexing scheme. Your files are self-contained and portable.
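The manual prompt pattern can be scripted. The sketch below is one way to do it, assuming you have already copied the description, entities, and claims out of a file's embedded layer by some means. SemanticSummary and build_prompt are hypothetical names for illustration, not part of LLMind.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SemanticSummary:
    """Hypothetical container for fields copied from a file's embedded layer."""
    description: str
    entities: List[str] = field(default_factory=list)
    claims: List[str] = field(default_factory=list)


def build_prompt(filename: str, summary: SemanticSummary, task: str) -> str:
    """Prefix a task with the file's semantic summary, in the article's format."""
    parts = [f"describes {summary.description}"]
    # Optional fields are skipped when empty so the prompt stays short.
    if summary.entities:
        parts.append("key entities " + ", ".join(summary.entities))
    if summary.claims:
        parts.append("claims " + ", ".join(summary.claims))
    metadata = "; ".join(parts)
    return f"This paper, {filename} (metadata: {metadata}) is attached. {task}"
```

Because the summary travels as plain text, this works with any tool that accepts a prompt, whether or not that tool parses XMP itself.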
