Business case · Research Intelligence Tools

Business case: a research intelligence console for our corpus

Reference engagement: Andrew W. Marshall Foundation

Executive summary

Commission Syntheos to turn our specialist corpus into a knowledge-graph research console modeled on the one Syntheos built for the Andrew W. Marshall Foundation. Researchers ask questions in a chat panel, get answers grounded in numbered citations to specific archive passages, and see a clear visual distinction between verified archive material and AI-generated inference.

The problem

We hold a corpus that matters (documents, oral histories, records, a specialist archive), and almost nobody reads it in depth. New researchers don't know what questions to ask, because they don't know the archive's contents well enough to form one. We are letting institutional memory atrophy because the access tools are stuck in 1995.

Proposed engagement

An 8-to-14-week engagement to ingest the corpus, build a typed knowledge graph, and deploy a research console. The console combines hybrid retrieval (vector, full-text, and graph expansion), streamed LLM answers with numbered citations, a right-hand panel of evidence cards synced to scroll position, and saved sessions so researchers can return to an investigation later.
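To make "hybrid retrieval" concrete, here is a minimal sketch of one common way to merge results from several retrievers: reciprocal rank fusion. This is an illustration under stated assumptions, not a description of how Syntheos implements the console. The function name, the passage IDs, and the constant k=60 are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of passage IDs into a single ranking.

    Each passage scores 1 / (k + rank) for every list it appears in,
    so passages found by more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


# Hypothetical results from the three retrieval modes named above.
vector_hits = ["p3", "p1", "p7"]
fulltext_hits = ["p1", "p4", "p3"]
graph_hits = ["p1", "p9"]

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits, graph_hits])
print(fused)
```

A passage that all three retrievers surface ("p1" here) ranks first, which is the behavior we want to ask about during scoping: how the console weighs agreement between retrieval modes.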

What we get

  • A deployed research console running on our hosting
  • The ingestion pipeline that builds the graph from source documents
  • A knowledge graph with typed entities, relationships, and community clusters
  • Verified vs inferred node labeling across the entire graph
  • Documentation for adding new material as the corpus grows
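
As a rough picture of what "typed" and "verified vs inferred" could mean in practice, the sketch below models graph nodes and relationships that carry their supporting archive passages. It assumes a simple rule (an item is verified only if at least one passage backs it). The class names, field names, and sample values are hypothetical and are not taken from the Syntheos data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Provenance(Enum):
    VERIFIED = "verified"  # backed by at least one archive passage
    INFERRED = "inferred"  # proposed by the model, no supporting passage


def provenance_of(source_passages: Tuple[str, ...]) -> Provenance:
    """Assumed rule: any supporting passage makes an item verified."""
    return Provenance.VERIFIED if source_passages else Provenance.INFERRED


@dataclass(frozen=True)
class Node:
    node_id: str
    entity_type: str  # e.g. "person", "organization", "document"
    name: str
    source_passages: Tuple[str, ...] = ()

    @property
    def provenance(self) -> Provenance:
        return provenance_of(self.source_passages)


@dataclass(frozen=True)
class Edge:
    source_id: str
    target_id: str
    relation: str  # e.g. "authored", "member_of"
    source_passages: Tuple[str, ...] = ()

    @property
    def provenance(self) -> Provenance:
        return provenance_of(self.source_passages)


# Placeholder data, for illustration only.
author = Node("n1", "person", "Example Author", ("passage-12",))
memo = Node("n2", "document", "Example Memo", ("passage-12", "passage-40"))
guess = Edge("n1", "n2", "influenced_by")  # no passage, so inferred

print(author.provenance.value, guess.provenance.value)
```

Deriving the label from the evidence, rather than storing it separately, is one way to keep the two from drifting apart as new material is added.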

Risks and mitigations

The main risk is noisy extraction on the first pass. Mitigation: the ingestion pipeline runs an iterative gap-fill with community detection and deduplication, and Syntheos budgets SME review time for the first pass of entity and relationship extraction. A second risk is over-confidence in AI inference. Mitigation: the console labels every inferred node and every inferred citation distinctly, so researchers always know what the archive actually says.

Success metrics

  • Researchers find material via the console that they could not find via traditional search
  • Chat answers cite specific passages that the researcher can verify
  • Verified vs inferred labeling is accurate on a sampled review
  • A new researcher can build a productive investigation thread in under an hour

Investment and timeline

The engagement runs 8 to 14 weeks, depending on corpus size and the quality of existing metadata. Costs are comparable to a medium-sized consulting engagement. The console is a lasting asset that delivers value every time a researcher opens it.

Recommended next step

A walkthrough of the Andrew W. Marshall Foundation console with Syntheos, followed by a scoping conversation about our corpus.

Syntheos · syntheos.io