OHDSI — Observational Health Data Sciences and Informatics

Overview

OHDSI (pronounced “Odyssey”) is an international, open-science collaborative established in 2014 that generates evidence from observational health data through transparent, reproducible, multi-database analytics. It maintains the OMOP CDM standard, develops the open-source tools that run on it, and coordinates a global network of data sources. As of 2024 that network comprised 300+ data sources covering nearly 1 billion patient records across 30+ countries, enabling large-scale pharmacoepidemiological and outcomes research. Where OMOP CDM is the data standard, OHDSI is the community and network that makes it operationally useful.

What OHDSI Produces

Open-Source Tools

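OHDSI develops and maintains open-source tools for observational research on OMOP CDM data. The core is HADES (Health Analytics Data-to-Evidence Suite), a collection of R packages; standalone applications such as ATLAS (a web application) and White Rabbit and Rabbit in a Hat (ETL design tools) complement it: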

| Tool | Purpose |
|---|---|
| ATLAS | Web-based cohort definition, characterisation, incidence rates, population-level estimation; no-code interface |
| ACHILLES | Automated data quality and characterisation; generates 170+ statistics on CDM contents |
| CohortDiagnostics | Diagnose phenotype algorithms across databases |
| FeatureExtraction | Extract covariates for patient-level prediction and estimation |
| PatientLevelPrediction | Train and validate ML models for clinical outcomes |
| CohortMethod | Population-level effect estimation (comparative cohort studies) |
| SelfControlledCaseSeries | Self-controlled case series analysis |
| Cyclops | Large-scale regression engine (L1/L2 regularised) |
| DataQualityDashboard | Systematic data quality assessment against the Kahn framework |
| White Rabbit | Source data profiling for ETL design |
| Rabbit in a Hat | Visual ETL mapping tool |
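
To make one of the listed outputs concrete, the following is a minimal Python sketch of an incidence-rate calculation of the kind ATLAS reports: events divided by person-time at risk. It is not ATLAS code. The function name, the record layout, and the toy dates are all assumptions made for illustration, and time at risk is simply censored at the first event.

```python
from datetime import date

def incidence_rate_per_1000py(records):
    """Return (events, person_days, rate per 1,000 person-years).

    `records` is a hypothetical layout: one tuple per person of
    (time_at_risk_start, time_at_risk_end, first_event_date_or_None).
    """
    events = 0
    person_days = 0
    for start, end, event_date in records:
        if event_date is not None and start <= event_date <= end:
            # Person contributes time only up to their first event.
            events += 1
            person_days += (event_date - start).days
        else:
            # No event inside the window: the full window counts as time at risk.
            person_days += (end - start).days
    person_years = person_days / 365.25
    return events, person_days, 1000 * events / person_years

# Toy data, invented for this example.
toy_cohort = [
    (date(2020, 1, 1), date(2021, 1, 1), None),
    (date(2020, 1, 1), date(2021, 1, 1), date(2020, 7, 1)),
    (date(2019, 1, 1), date(2020, 1, 1), None),
]

events, person_days, rate = incidence_rate_per_1000py(toy_cohort)
print(f"{events} event(s) over {person_days} person-days: {rate:.1f} per 1,000 person-years")
```

Real incidence analyses add details this sketch omits, such as washout periods, stratification by age and sex, and handling of multiple observation periods per person.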

Network Studies

OHDSI coordinates multi-database network studies: analytical studies in which identical code runs at every participating database, producing site-level results that are then meta-analysed centrally. The OHDSI COVID-19 Studies (2020) characterised 34,000+ hospitalised patients across 34 databases in 13 countries within 3 weeks of the pandemic. LEGEND-T2DM compared the safety of second-line type 2 diabetes treatments across 11 databases and 1.5 million patients. OHDSI also runs neurological disease network studies covering the incidence and treatment patterns of dementia, epilepsy, and Parkinson’s disease.
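
The central pooling step can be sketched as follows. This is a textbook DerSimonian-Laird random-effects meta-analysis in Python, shown only to illustrate how site-level estimates are combined without sharing patient-level data; it is not necessarily the method any particular OHDSI study uses. The function name and the site estimates are hypothetical.

```python
import math

def pool_random_effects(log_estimates, standard_errors):
    """DerSimonian-Laird pooling of per-site log effect estimates.

    Returns (pooled_log_estimate, pooled_standard_error, tau_squared).
    """
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    weights = [1.0 / se ** 2 for se in standard_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_estimates)) / sum_w

    # Cochran's Q measures between-site heterogeneity.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_estimates))
    df = len(log_estimates) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w

    # Between-site variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each site's variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, log_estimates)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau2

# Hypothetical hazard ratios and standard errors from three sites.
site_log_hr = [math.log(0.85), math.log(0.92), math.log(0.78)]
site_se = [0.10, 0.15, 0.20]

pooled, se, tau2 = pool_random_effects(site_log_hr, site_se)
low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"Pooled HR {math.exp(pooled):.2f} (95% CI {low:.2f} to {high:.2f}), tau^2 = {tau2:.4f}")
```

Each site only has to return an effect estimate and its standard error, which is what lets the same analysis run behind every institution's firewall.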

Phenotype Library

OHDSI maintains an open Phenotype Library of validated ATLAS cohort definitions for hundreds of clinical conditions, enabling researchers to reuse rigorously validated phenotype algorithms for disease cohorts.
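
As a rough illustration of the logic such a cohort definition encodes, the Python sketch below builds a toy cohort with SQLite: a person's first occurrence of any condition in a concept set, kept only if at least 365 days of prior observation precede it. The table and column names follow the OMOP CDM (`condition_occurrence`, `observation_period`) and its cohort table layout; the concept IDs, the data, and the 365-day threshold are hypothetical, and the rule is a simplification of what ATLAS generates.

```python
import sqlite3

CONCEPT_SET = (1001, 1002)      # hypothetical concept IDs, not real vocabulary codes
PRIOR_OBSERVATION_DAYS = 365    # assumed entry requirement for this example

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE observation_period (
    person_id INTEGER,
    observation_period_start_date TEXT,
    observation_period_end_date TEXT);
CREATE TABLE condition_occurrence (
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_start_date TEXT);
INSERT INTO observation_period VALUES
    (1, '2015-01-01', '2022-12-31'),
    (2, '2019-06-01', '2022-12-31'),
    (3, '2010-01-01', '2022-12-31');
INSERT INTO condition_occurrence VALUES
    (1, 1001, '2018-03-10'),
    (1, 1002, '2019-05-01'),
    (2, 1001, '2019-09-15'),
    (3, 9999, '2016-02-02');
""")

query = """
WITH first_event AS (
    -- Earliest qualifying condition per person (ISO dates sort as text).
    SELECT person_id, MIN(condition_start_date) AS index_date
    FROM condition_occurrence
    WHERE condition_concept_id IN (?, ?)
    GROUP BY person_id
)
SELECT f.person_id AS subject_id,
       f.index_date AS cohort_start_date,
       op.observation_period_end_date AS cohort_end_date
FROM first_event f
JOIN observation_period op
  ON op.person_id = f.person_id
 AND f.index_date BETWEEN op.observation_period_start_date
                      AND op.observation_period_end_date
-- Require enough observed history before the index date.
WHERE julianday(f.index_date) - julianday(op.observation_period_start_date) >= ?
ORDER BY f.person_id
"""

cohort = conn.execute(query, (*CONCEPT_SET, PRIOR_OBSERVATION_DAYS)).fetchall()
for subject_id, start, end in cohort:
    print(subject_id, start, end)
```

In this toy data only person 1 qualifies: person 2 has too little prior observation and person 3 has no condition in the concept set. Because the logic is expressed against standard CDM tables, the same definition can be run unchanged at any site.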

Connections

Resources