Business case: a research intelligence console for our corpus
Executive summary
Commission Syntheos to turn our specialist corpus into a knowledge-graph research console modeled on the one Syntheos built for the Andrew W. Marshall Foundation. Researchers ask questions in a chat panel, get answers grounded in numbered citations to specific archive passages, and see a clear visual distinction between verified archive material and AI-generated inference.
The problem
We hold a corpus that matters (documents, oral histories, records, a specialist archive), and almost nobody reads it in depth. New researchers do not know what the archive contains well enough to form a useful question. We are letting institutional memory atrophy because our access tools are stuck in 1995.
Proposed engagement
An 8-to-14-week engagement to ingest the corpus, build a typed knowledge graph, and deploy a research console. The console combines hybrid retrieval (vector, full-text, and graph expansion), streamed LLM answers with numbered citations, a right-hand panel of evidence cards synced to scroll position, and a saved-sessions feature so researchers can return to an investigation later.
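To make "hybrid retrieval" concrete, here is a minimal sketch of how ranked results from vector and full-text search might be merged and then widened through the graph. Everything in it is an assumption for illustration: the proposal does not say how Syntheos fuses results, so reciprocal rank fusion is swapped in as one common technique, and the function names, passage IDs, and the neighbors map are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of passage IDs into one ranking.

    Assumed fusion technique (not confirmed by the proposal): each list
    contributes 1 / (k + rank) to the score of every passage it contains.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


def expand_via_graph(seed_ids, neighbors, limit=2):
    """Append up to `limit` graph neighbors of each seed passage, seeds first."""
    expanded = list(seed_ids)
    seen = set(seed_ids)
    for pid in seed_ids:
        for neighbor in neighbors.get(pid, [])[:limit]:
            if neighbor not in seen:
                seen.add(neighbor)
                expanded.append(neighbor)
    return expanded


# Hypothetical results from the two search back ends.
vector_hits = ["p3", "p1", "p7"]
fulltext_hits = ["p1", "p9", "p3"]
# Hypothetical graph edges between passages (for example, shared entities).
neighbors = {"p1": ["p4"], "p3": ["p1", "p8"]}

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
candidates = expand_via_graph(fused[:2], neighbors)
print(fused)       # ['p1', 'p3', 'p9', 'p7']
print(candidates)  # ['p1', 'p3', 'p4', 'p8']
```

Passages that appear in both lists rise to the top, and graph expansion then pulls in related passages that neither search returned directly. That is the behavior we would want to see demonstrated during scoping.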
What we get
- A deployed research console running on our hosting
- The ingestion pipeline that builds the graph from source documents
- A knowledge graph with typed entities, relationships, and community clusters
- Verified vs inferred node labeling across the entire graph
- Documentation for adding new material as the corpus grows
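As a rough illustration of what "typed entities" and "verified vs inferred labeling" could look like as data, the sketch below models a small graph in which every node and edge carries a provenance label. The class names, fields, and sample records are hypothetical and are not taken from the Syntheos implementation. The rule that a verified edge must cite a passage is likewise an assumption about how the labeling might be enforced.

```python
from dataclasses import dataclass, field
from enum import Enum


class Provenance(Enum):
    VERIFIED = "verified"  # stated directly in an archive passage
    INFERRED = "inferred"  # proposed by the model, not found in the archive


@dataclass(frozen=True)
class Entity:
    entity_id: str
    entity_type: str  # for example "person", "organization", "document"
    name: str
    provenance: Provenance


@dataclass(frozen=True)
class Relationship:
    source_id: str
    target_id: str
    relation_type: str
    provenance: Provenance
    passage_ids: tuple = ()  # archive passages that support the edge


@dataclass
class KnowledgeGraph:
    entities: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_entity(self, entity):
        self.entities[entity.entity_id] = entity

    def add_relationship(self, rel):
        # Assumed rule: a verified edge must cite at least one passage.
        if rel.provenance is Provenance.VERIFIED and not rel.passage_ids:
            raise ValueError("verified relationships need a supporting passage")
        self.relationships.append(rel)

    def by_provenance(self, provenance):
        return [r for r in self.relationships if r.provenance is provenance]


# Hypothetical sample data.
graph = KnowledgeGraph()
graph.add_entity(Entity("e1", "person", "Researcher A", Provenance.VERIFIED))
graph.add_entity(Entity("e2", "document", "Memo 12", Provenance.VERIFIED))
graph.add_relationship(
    Relationship("e1", "e2", "authored", Provenance.VERIFIED, ("passage-0042",))
)
graph.add_relationship(
    Relationship("e1", "e2", "influenced_by", Provenance.INFERRED)
)
```

Keeping the label on every node and edge, rather than in a separate report, is what would let the console render verified and inferred material differently wherever it appears.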
Risks and mitigations
The main risk is noisy extraction on the first pass. Mitigation: the ingestion pipeline runs an iterative gap-fill with community detection and deduplication, and Syntheos budgets SME review time for the first pass of entity and relationship extraction. A second risk is over-confidence in AI inference. Mitigation: the console labels every inferred node and every inferred citation distinctly, so researchers always know what the archive actually says.
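For a sense of what the deduplication step guards against, the sketch below groups entity mentions that differ only in case, punctuation, or spacing. It is a deliberately simple stand-in under stated assumptions: the proposal does not describe the matching rules the pipeline uses, and the function names and sample mentions here are hypothetical. A production pipeline would likely need fuzzier matching plus the SME review described above.

```python
import re
from collections import defaultdict


def normalize(name):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())


def deduplicate(mentions):
    """Group (name, entity_type) mentions that share a normalized name and type."""
    groups = defaultdict(list)
    for name, entity_type in mentions:
        groups[(normalize(name), entity_type)].append(name)
    return dict(groups)


# Hypothetical mentions from a noisy first extraction pass.
mentions = [
    ("Records Committee", "organization"),
    ("records committee.", "organization"),
    ("Records  Committee", "organization"),
    ("Records Committee", "document"),  # same name, different type: kept apart
]
merged = deduplicate(mentions)
print(len(merged))  # 2
```

Keying on the entity type as well as the name keeps an organization and a document with the same title from being merged by mistake.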
Success metrics
- Researchers find material via the console that they could not find via traditional search
- Chat answers cite specific passages that the researcher can verify
- Verified vs inferred labeling is accurate on a sampled review
- A new researcher can build a productive investigation thread in under an hour
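The labeling-accuracy metric above can be measured with a simple sampled review: draw a random sample of nodes, have a subject-matter expert label each one independently, and compute the share that match the console. The sketch below shows that arithmetic. The function names, sample size, and labels are hypothetical, and the acceptable accuracy threshold is left for the scoping conversation.

```python
import random


def sample_for_review(node_ids, sample_size, seed=0):
    """Draw a reproducible random sample of node IDs for expert review."""
    rng = random.Random(seed)
    pool = sorted(node_ids)
    return rng.sample(pool, min(sample_size, len(pool)))


def labeling_accuracy(console_labels, reviewer_labels):
    """Share of reviewed nodes where the console label matches the reviewer's."""
    reviewed = [n for n in reviewer_labels if n in console_labels]
    if not reviewed:
        raise ValueError("no reviewed nodes overlap with console labels")
    matches = sum(console_labels[n] == reviewer_labels[n] for n in reviewed)
    return matches / len(reviewed)


# Hypothetical labels: what the console shows vs. what the reviewer decided.
console_labels = {
    "n1": "verified",
    "n2": "inferred",
    "n3": "verified",
    "n4": "inferred",
}
reviewer_labels = {
    "n1": "verified",
    "n2": "inferred",
    "n3": "inferred",  # reviewer disagrees with the console here
    "n4": "inferred",
}

to_review = sample_for_review(console_labels, sample_size=3)
accuracy = labeling_accuracy(console_labels, reviewer_labels)
print(accuracy)  # 0.75
```

Fixing the random seed makes the sample reproducible, so a later review can re-check the same nodes after the graph is updated.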
Investment and timeline
The engagement runs 8 to 14 weeks, depending on corpus size and the quality of existing metadata. Costs are comparable to a medium-sized consulting engagement. The console is an asset that delivers value every time a researcher opens it.
Recommended next step
A walkthrough of the Andrew W. Marshall Foundation console with Syntheos, followed by a scoping conversation about our corpus.