CogStack
CogStack is an NHS developed platform composed of a range of modular components, enabling deep search of entire hospital records and use of advanced language AI to extract concepts found in unstructured data (free text records, letters, and summaries) (Figure 1).
The Cogstack platform as a whole includes the following:
- Data pipelines to centralise and lake clinical data including structured data i.e. observations, results, and unstructured data i.e. clinical narratives such as clinic letters, discharge and admission summaries and radiology reports also varying formats e.g. binary word docs, PDFs, images.
- Data held in a database to enable search and visualisation of millions of distinct data points in near-real-time.
- Language AI models process clinical text to standardised clinical terminologies (SNOMED-CT) to create interoperable clinical data. This allows discovery of patient phenotypes and characteristics that were previously only accessible by having humans read through millions of clinical notes.
- ‘Deep phenotyping’ using rich concepts and biomarkers that are not accessible from any other data source, with capability to support clinical research and trial recruitment.
The full platform is fully deployed in three London NHS Trusts, in process of deployment in others. As part of the SDE programme, existing CogStack pipelines will be used to enrich data held within Trusts for analytics and research (including via FLIP with integration to multi-modal data). In addition, integration of CogStack within GP vendor systems and the London Data Service will enable the systematic extraction of concepts from the unstructured primary care record, for the first time in the NHS.