Business case: a research intelligence console for our corpus
Executive summary
For roughly the cost of one consulting engagement of comparable scope, Syntheos turns our specialist corpus into a knowledge-graph research console our team owns from then on. New researchers reach the depth of investigation that previously required years of corpus familiarity. The institutional memory we hold today survives the people who carry it. Every claim in the console is labeled as verified archive material or generated inference, so a researcher always knows whether they are reading the archive or reading the model. The platform was built with the Andrew W. Marshall Foundation around Andrew Marshall's net assessment tradition and has been generalized to handle other specialist corpora.
The problem
We hold a corpus that matters. Documents, oral histories, records, a specialist archive built over decades. Almost nobody reads it in depth, because the access tools are 1995-era. A new researcher can't form a useful question because the archive's contents are opaque from the outside. Fluency takes years of working alongside people who lived through the period, and those people are leaving, retiring, or busy elsewhere. Institutional memory does not transfer through PDFs.
Proposed engagement
Eight to fourteen weeks. Syntheos ingests the corpus, builds the knowledge graph, deploys the console, and trains our researchers on the first wave of investigations. Our subject matter experts validate the entity schema and name the relationships that matter for our domain. By the end of the engagement the console runs on our infrastructure, our researchers have used it for real work, and the documentation for adding new material is in our team's hands.
What we get
- Researchers asking the corpus questions and getting answers in seconds. The depth of investigation that previously required years of corpus familiarity is available to a new researcher in their first week.
- Institutional memory that survives the people who carry it. The knowledge graph is the durable form of what currently lives in a few people's heads.
- Every claim in the console labeled as verified or inferred. A researcher always knows whether they're reading the archive or reading the model.
- Independence from the vendor. The pipeline, the graph, the console, and the source material run on our infrastructure under our license. We own the system.
- A corpus that grows. Documentation for ingesting new material is part of the engagement, so adding to the archive is a routine operation our team performs without Syntheos.
Risks and mitigations
- What if Syntheos disappears? The console, the pipeline, the knowledge graph, and the source material all run on our infrastructure under our license. The system continues to operate without Syntheos personnel.
- What if a researcher cites an inferred claim as if it were verified? Every claim in the chat carries a verified-or-inferred tag, every node and every edge in the graph is similarly labeled, and inferred content renders distinctly from verified content. The console refuses to hide the distinction.
- What if our corpus contains sensitive material? The console can be deployed in air-gapped or access-controlled environments. Source material does not leave the boundary we set.
- What if our researchers do not adopt it? Adoption is a real risk for any new tool, and we treat it as a scoping question rather than an afterthought. The first wave of investigations is designed with our team so the console launches on use cases that matter, and a researcher cohort is trained during the engagement.
- What if we want to leave? License terms allow our team to fork and operate the console independently. We own the graph, the documentation, and the data outright.
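For technical reviewers, the verified-or-inferred labeling described in the second risk above could be modeled roughly as follows. This is a minimal illustrative sketch under our own assumptions, not Syntheos's actual schema: the names Provenance, Claim, and render_claim are hypothetical, and the example strings are placeholders.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Provenance(Enum):
    VERIFIED = "verified"  # taken directly from archive material
    INFERRED = "inferred"  # generated by the model


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record; field names are assumptions."""
    text: str
    provenance: Provenance
    source_passage: Optional[str] = None  # pointer into the archive, if any

    def __post_init__(self):
        # A verified claim must cite a passage a researcher can check.
        if self.provenance is Provenance.VERIFIED and not self.source_passage:
            raise ValueError("verified claims require a source passage")


def render_claim(claim: Claim) -> str:
    """Render a claim so that its provenance label is always visible."""
    label = f"[{claim.provenance.value.upper()}]"
    if claim.provenance is Provenance.VERIFIED:
        return f"{label} {claim.text} (source: {claim.source_passage})"
    return f"{label} {claim.text}"


# Placeholder usage: one verified claim, one inferred claim.
verified = Claim("Example statement.", Provenance.VERIFIED, "doc-001, p. 4")
inferred = Claim("Example inference.", Provenance.INFERRED)
print(render_claim(verified))
print(render_claim(inferred))
```

The point of the sketch is that the label lives on the record itself and the renderer has no code path that omits it, which is one way a console could refuse to hide the distinction.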
Success metrics
- A new researcher builds a productive investigation thread in their first week.
- Researchers find archive material via the console that traditional search did not surface.
- Chat answers reliably cite passages a researcher can verify against the source on the spot.
- Twelve months in, our team is adding new corpus material without Syntheos in the loop, and the graph is growing.
Investment and timeline
Eight to fourteen weeks for the first deployment. Cost varies with corpus size, the complexity of the entity schema, and the metadata quality of the source material, so a single public range would be misleading. A scoping conversation produces a fixed first-year number before any contract is signed. For most engagements that first-year investment lands close to the cost of one medium consulting engagement. Annual hosting and support is a small percentage of the first-year cost. We can self-host or use Syntheos-managed infrastructure depending on the sensitivity of the source material.
Recommended next step
Send Syntheos fifty documents from our corpus. Within two weeks they'll run them through the ingestion pipeline, build a small graph, and walk us through what the console looks like on our material in a two-hour session. There is no fee. By the end of that walkthrough, we'll know whether the system handles our corpus the way we need it to.