Health
Health data carries regulatory constraints that do not apply to most research data, because it consists of personal medical information whose misuse could harm individuals. Understanding those constraints is a prerequisite for understanding how health data can be shared, accessed, and standardised for research.
Scope
Health data for research comes from two broad contexts, each with its own consent framework, regulatory requirements, and applicable standards.
Electronic health records (EHRs), prescriptions, laboratory results, medical imaging, and administrative claims are generated during patient care primarily to support care delivery. Using this data for research requires a separate legal basis and specific access mechanisms, distinct from those applying to data collected explicitly for research.
Data collected under a formal research protocol, as in clinical trials and prospective cohort studies, is gathered with explicit research consent from participants and falls under different regulatory frameworks. That context is covered by the Clinical Trials perspective and is not addressed further here.
This perspective focuses on health data generated in care settings and how it can be accessed and standardised for research, a practice often described as secondary use of health data. The European Health Data Space (EHDS) specifically targets this use case at the EU level.
Data access
Health data generated in care settings cannot generally be published as open data because re-identification risk persists even after standard anonymisation. Three access models are in common use.
Fully anonymised data, such as brain atlas outputs and aggregate cohort statistics, can be deposited in open repositories. The Open Brain Consent framework provides pre-approved consent language enabling open neuroimaging data deposit within an established ethical framework.
Data that cannot be fully anonymised is deposited in controlled-access repositories, where access requires ethics review and a data use agreement. The European Genome-phenome Archive (EGA) serves this role in Europe for genomic and clinical data. dbGaP and the LONI Image and Data Archive (IDA) serve equivalent functions in the United States. In France, access to data held by the Health Data Hub and the SNDS (Système National des Données de Santé) is granted after CESREES ethical review and CNIL authorisation.
For sensitive datasets that cannot leave source institutions, federated secure processing environments allow analysis to run on data that stays at source. The Health Data Hub Datalab and EBRAINS secure access environments follow this model.
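The federated model described above can be sketched in a few lines. This is a minimal illustration, not how the Health Data Hub Datalab or EBRAINS actually implement it: the function names, the `SiteSummary` type, and the minimum cell count of 5 are all assumptions made for the example. Each site reduces its patient-level values to a count and a sum, suppresses the result if too few patients contribute, and only those aggregates are pooled centrally.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed disclosure-control threshold; real environments set their own rules.
MIN_CELL_COUNT = 5


@dataclass
class SiteSummary:
    """Aggregate released by one site; it contains no patient-level rows."""
    n: int
    total: float


def summarise_locally(values: List[float]) -> Optional[SiteSummary]:
    """Runs inside the source institution: reduce raw values to a count and a sum."""
    if len(values) < MIN_CELL_COUNT:
        return None  # too few patients: suppress rather than release
    return SiteSummary(n=len(values), total=sum(values))


def pooled_mean(summaries: List[Optional[SiteSummary]]) -> Optional[float]:
    """Runs at the coordinating centre: combine only the released aggregates."""
    released = [s for s in summaries if s is not None]
    n = sum(s.n for s in released)
    if n == 0:
        return None
    return sum(s.total for s in released) / n


# Hypothetical measurements held at three institutions.
site_a = summarise_locally([5.0, 6.0, 7.0, 6.0, 6.0])
site_b = summarise_locally([8.0, 9.0, 10.0, 9.0, 9.0, 9.0])
site_c = summarise_locally([4.0, 5.0])  # below threshold, so nothing is released

result = pooled_mean([site_a, site_b, site_c])  # (30 + 54) / (5 + 6) = 84 / 11
```

The pooled mean equals the mean that would be obtained from the combined rows of the contributing sites, yet no individual record crosses an institutional boundary. Real deployments add authentication, audit logging, and output checking on top of this basic pattern.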
Standards
Interoperability standards enable heterogeneous routine clinical data from different institutions to be integrated for observational research.
The OMOP Common Data Model (OMOP CDM) is designed specifically for secondary use of observational clinical data. It maps source coding systems (such as ICD-10, LOINC, RxNorm, and the French CCAM procedure classification) into a shared standardised vocabulary. It is not used for research-collected clinical trial data, which uses CDISC standards instead. OHDSI maintains OMOP CDM and its tooling ecosystem, and BBMRI-ERIC promotes OMOP CDM for cross-biobank data standardisation.
HL7 FHIR is the API standard for EHR data exchange and the designated interface for primary use of health data under EHDS. SNOMED CT provides the clinical concept terminology underpinning both OMOP and FHIR implementations.
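The vocabulary mapping that OMOP CDM relies on can be illustrated with a small sketch. The lookup table and the function name are hypothetical, and the concept_id values 900001 and 900002 are placeholders rather than real OMOP identifiers; in practice the mappings come from the OHDSI standardised vocabularies. The field names follow the OMOP CDM condition_occurrence table, and a concept_id of 0 is the OMOP convention for a source code with no standard match.

```python
from typing import Dict, Tuple

# Hypothetical lookup from (source vocabulary, source code) to a standard concept_id.
# The concept_id values are placeholders for illustration, not real OMOP identifiers.
SOURCE_TO_STANDARD: Dict[Tuple[str, str], int] = {
    ("ICD10", "E11"): 900001,  # type 2 diabetes mellitus
    ("ICD10", "I10"): 900002,  # essential hypertension
}

UNMAPPED_CONCEPT_ID = 0  # OMOP convention: no matching standard concept


def to_condition_occurrence(person_id: int, vocabulary: str, code: str,
                            start_date: str) -> dict:
    """Build one condition_occurrence-style row from a locally coded diagnosis."""
    concept_id = SOURCE_TO_STANDARD.get((vocabulary, code), UNMAPPED_CONCEPT_ID)
    return {
        "person_id": person_id,
        "condition_concept_id": concept_id,  # shared, analysis-ready concept
        "condition_start_date": start_date,
        "condition_source_value": code,      # original code kept for traceability
    }


mapped = to_condition_occurrence(1, "ICD10", "E11", "2023-04-12")
unmapped = to_condition_occurrence(2, "LOCAL", "XYZ-42", "2023-05-02")
```

Keeping the source value alongside the standard concept means a mapping can be audited or redone later, while cross-institution queries only need to filter on the shared concept_id.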
For the European regulatory framework governing health data access and secondary use, see the Europe perspective. For French health data governance and CNIL oversight, see the France perspective. For research-collected clinical trial data standards, see Clinical Trials. For rare disease phenotyping standards, see Rare Disease and Phenotyping.

