Ontologies

An ontology is a controlled vocabulary of defined terms, organised in a hierarchy so that data annotated by different groups can be compared and queried together. Ontologies describe what a dataset records about the world: its anatomy, cell types, molecular functions, phenotypes, behaviours, and diseases. This is distinct from the metadata that describes the dataset as an object (covered in Data Discoverability). Most of the ontologies relevant to neuroscience are coordinated by the OBO Foundry, which sets shared design principles so that the separate ontologies interoperate rather than overlap: each covers a distinct slice of the description problem and connects to its neighbours. Several of the disease and phenotype ontologies are produced by the Monarch Initiative, a consortium that integrates gene, disease, and phenotype data across species and develops the ontologies that make that integration possible. The result is a division of labour across these six domains, which the sections below take in turn.

Anatomy, cell type, and function

For anatomy, UBERON provides cross-species brain region terms, so a structure can be named once and matched across human, mouse, and other model species. It is used across EBRAINS, NWB, and the Allen Institute for Brain Science to enable search by anatomical location independent of species-specific naming. For cell type, the Cell Ontology (CL) covers neuronal and glial classes down to specific subtypes and is required for single-cell data annotation in CELLxGENE and BICAN. For molecular function, GO (Gene Ontology) describes the biological processes, molecular functions, and cellular components of gene products, and draws on ChEBI for the chemical entities, drugs, and neurotransmitters those functions involve.
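As an illustration of how these three ontologies divide the annotation of a single record, the following is a minimal sketch that checks each field carries an identifier from the expected ontology. The field names (brain_region, cell_type, process) and the function check_annotation are hypothetical and do not come from any of the schemas named above. The check is purely syntactic: it looks at the identifier prefix and does not confirm that the term exists in the ontology.

```python
import re

# Expected ontology prefix per field. The field names are hypothetical and
# chosen only for this sketch; real schemas define their own.
EXPECTED_PREFIX = {
    "brain_region": "UBERON",
    "cell_type": "CL",
    "process": "GO",
}

# A compact identifier of the form PREFIX:digits, e.g. UBERON:0000955.
CURIE_PATTERN = re.compile(r"^([A-Za-z]+):(\d+)$")


def check_annotation(record):
    """Return a list of problems found in a dict of field -> identifier."""
    problems = []
    for field, expected in EXPECTED_PREFIX.items():
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing")
            continue
        match = CURIE_PATTERN.match(value)
        if match is None:
            problems.append(f"{field}: '{value}' is not a PREFIX:number identifier")
        elif match.group(1) != expected:
            problems.append(f"{field}: expected {expected} term, got {match.group(1)}")
    return problems


record = {
    "brain_region": "UBERON:0000955",  # brain
    "cell_type": "CL:0000540",         # neuron
    "process": "GO:0007268",           # chemical synaptic transmission
}

print(check_annotation(record))
```

A production pipeline would go further and resolve each identifier against the current ontology release, since terms can be obsoleted or merged over time.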

Phenotype and behaviour

Phenotype is described by a pair of ontologies split by species. HPO (Human Phenotype Ontology) is the standard for human clinical phenotypes and is widely used in rare disease genomics, while MP (Mammalian Phenotype Ontology) is its model-organism counterpart for mouse, rat, and other mammalian phenotypes. A published cross-species mapping is maintained between the two, so a phenotype observed in a mouse model can be matched to the corresponding human phenotype, which supports model discovery and disease-gene prediction. uPheno (Unified Phenotype Ontology) builds on these mappings to provide a single species-neutral layer under which the human, mouse, and other species-specific phenotype terms are grouped, so cross-species queries run against one vocabulary rather than a set of pairwise mappings. NBO (Neuro Behavior Ontology) complements both, covering behavioural and neurological phenotypes (motor, cognitive, sensory, affective, and circadian) in humans and model organisms alike.
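The advantage of a species-neutral layer over pairwise mappings can be sketched in a few lines. In the example below, each species-specific term points to one neutral grouping class, and a cross-species match is any neutral class that has terms on both sides. The table NEUTRAL_PARENT, the function names, and every identifier are made-up placeholders for illustration; they are not real HPO, MP, or uPheno terms, and this is not how uPheno itself is implemented.

```python
from collections import defaultdict

# Hypothetical grouping table: species-specific term -> species-neutral class.
# All identifiers are placeholders, not real ontology terms.
NEUTRAL_PARENT = {
    "HP:EXAMPLE1": "UPHENO:EXAMPLE_A",
    "MP:EXAMPLE1": "UPHENO:EXAMPLE_A",
    "HP:EXAMPLE2": "UPHENO:EXAMPLE_B",
    "MP:EXAMPLE2": "UPHENO:EXAMPLE_C",
}


def group_by_neutral_class(terms):
    """Group species-specific terms under their species-neutral class."""
    groups = defaultdict(set)
    for term in terms:
        parent = NEUTRAL_PARENT.get(term)
        if parent is not None:  # terms without a grouping class are skipped
            groups[parent].add(term)
    return dict(groups)


def cross_species_matches(human_terms, model_terms):
    """Return neutral classes that have terms in both inputs."""
    human = group_by_neutral_class(human_terms)
    model = group_by_neutral_class(model_terms)
    shared = human.keys() & model.keys()
    return {cls: (sorted(human[cls]), sorted(model[cls])) for cls in sorted(shared)}


matches = cross_species_matches(
    ["HP:EXAMPLE1", "HP:EXAMPLE2"],
    ["MP:EXAMPLE1", "MP:EXAMPLE2"],
)
print(matches)
```

Adding a third species to this scheme means adding rows to one table, whereas a pairwise design would need a new mapping for every existing species.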

Disease

Disease is described separately from phenotype. MONDO (Mondo Disease Ontology, originally the Monarch Disease Ontology) harmonises the major disease classifications, integrating ICD-10, ICD-11, OMIM, and ORDO into a single hierarchy. Each disease receives one unified identifier that maps to the corresponding code in each source system that covers it, which makes a disease queryable across databases that each use a different classification. ORDO (Orphanet Rare Disease Ontology) provides the rare-disease backbone that MONDO builds on, and OMIM supplies the curated gene-disease relationships. The clinical terminologies that code diagnoses, procedures, and observations in health and trial data (SNOMED CT, ICD-10, LOINC, MeSH, and the drug vocabularies) are covered in the Health and Clinical Trials perspectives, since their primary role is operational coding rather than research annotation.
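To make the idea of a unified disease identifier concrete, the sketch below translates a code from one source system into another by passing through the unified identifier. The table XREFS, the function names, and every identifier are made-up placeholders; they are not real MONDO, OMIM, Orphanet, or ICD codes, and the structure is an assumption for illustration rather than MONDO's actual data model.

```python
# Hypothetical cross-reference table: unified disease identifier -> codes in
# each source system that covers the disease. All values are placeholders.
XREFS = {
    "MONDO:EXAMPLE1": {
        "OMIM": "OMIM:EXAMPLE1",
        "ORDO": "Orphanet:EXAMPLE1",
        "ICD-10": "ICD10:EXAMPLE1",
    },
    "MONDO:EXAMPLE2": {
        "OMIM": "OMIM:EXAMPLE2",  # no ICD-10 or ORDO entry for this disease
    },
}


def build_reverse_index(xrefs):
    """Map every source code back to its unified identifier."""
    index = {}
    for unified_id, sources in xrefs.items():
        for code in sources.values():
            index[code] = unified_id
    return index


def translate(code, target_system, xrefs=XREFS):
    """Translate a source code into target_system, or None if unavailable."""
    unified_id = build_reverse_index(xrefs).get(code)
    if unified_id is None:
        return None  # the code is not known to the table
    return xrefs[unified_id].get(target_system)


print(translate("OMIM:EXAMPLE1", "ICD-10"))
print(translate("OMIM:EXAMPLE2", "ICD-10"))
```

The second call returns None because not every source system has a code for every disease, so callers need to handle a missing translation.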

For the phenotyping standards and variant curation infrastructure that build on these phenotype and disease ontologies, see Rare Disease and Phenotyping. For the metadata, identifiers, and registries that describe and locate the dataset itself, see Data Discoverability.