Neurobagel

Overview

Neurobagel is an open-source ecosystem of tools for federated neuroscience cohort discovery across decentralised BIDS datasets. Developed at McGill University in collaboration with CONP and ReproNim, it enables participant-level cohort search across multiple institutions simultaneously, without centralising data or requiring data transfer. Each participating institution deploys a local Neurobagel node. The federation API aggregates query results across all nodes while data files remain under local custody and governance.

Architecture

Neurobagel uses a hub-and-spoke federated model. Each institution deploys a containerised, self-hosted Neurobagel node that holds a knowledge graph of harmonised participant-level metadata. The federation API (federate.neurobagel.org) forwards each query to all public nodes and aggregates their results. The query tool (query.neurobagel.org) provides a web interface for defining cohort criteria and retrieving lists of matching participants.
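As a rough illustration of how a client might address the federation API, the sketch below builds a cohort query URL in Python. The host comes from the paragraph above; the "/query" path and the parameter names (min_age, max_age, sex) are assumptions made for illustration and should be checked against the API's own documentation. The helper build_cohort_query is hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Host taken from the text above; the "/query" path is an assumption.
FEDERATION_API = "https://federate.neurobagel.org"


def build_cohort_query(base_url, **criteria):
    """Return a GET URL encoding the cohort criteria, skipping unset ones."""
    # Drop criteria the caller left as None so they do not appear in the URL.
    params = {key: value for key, value in criteria.items() if value is not None}
    # Sort the pairs so the same criteria always produce the same URL.
    query = urlencode(sorted(params.items()))
    url = f"{base_url.rstrip('/')}/query"
    return f"{url}?{query}" if query else url


# Hypothetical parameter names, used only to show the shape of a query.
url = build_cohort_query(FEDERATION_API, min_age=50, max_age=75, sex=None)
print(url)
```

Sorting the parameters keeps the URL deterministic, which makes queries easier to log, cache, and compare across runs.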

Data Model and Standards

  • Phenotypic data is annotated using BIDS conventions and NIDM terms.
  • Controlled vocabularies include SNOMED CT for diagnosis; age, sex, and imaging modality are standardised following BIDS conventions.
  • Linked data representation uses JSON-LD and knowledge graphs.
  • The DataLad backend supports imaging data download from participating nodes.
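To make the linked-data representation more concrete, here is a minimal sketch of what a JSON-LD participant record could look like, written as a Python dictionary. The namespace IRIs, the property names (nb:hasAge, nb:hasDiagnosis, nb:hasImagingModality), and the helper expand_iri are illustrative assumptions, not the authoritative Neurobagel schema. The SNOMED CT code shown is the concept for Parkinson's disease and is used purely as an example.

```python
import json

# Namespace IRIs and property names below are illustrative assumptions.
participant = {
    "@context": {
        "nb": "http://neurobagel.org/vocab/",
        "snomed": "http://purl.bioontology.org/ontology/SNOMEDCT/",
        "nidm": "http://purl.org/nidash/nidm#",
    },
    "@id": "nb:sub-01",
    "@type": "nb:Subject",
    "nb:hasAge": 62.5,
    # Diagnosis expressed as a controlled SNOMED CT term rather than free text.
    "nb:hasDiagnosis": {"@id": "snomed:49049000"},
    # Imaging modality expressed as a NIDM term.
    "nb:hasImagingModality": {"@id": "nidm:T1Weighted"},
}


def expand_iri(compact, context):
    """Expand a 'prefix:local' identifier using the @context prefix map."""
    prefix, sep, local = compact.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    # Unknown prefix (or already a full IRI): return unchanged.
    return compact


diagnosis_iri = expand_iri(
    participant["nb:hasDiagnosis"]["@id"], participant["@context"]
)
print(diagnosis_iri)
print(json.dumps(participant, indent=2))
```

Because every value is an IRI drawn from a shared vocabulary, two nodes that annotate independently can still be queried with the same term.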

Public Nodes

As documented in the public nodes guide, three public nodes are currently available:

  • OpenNeuro provides a growing subset of OpenNeuro datasets annotated by the community.
  • INDI provides public datasets from the International Neuroimaging Data-sharing Initiative.
  • Quebec Parkinson Network provides federated access for discovery only, with no participant-level details or data download.
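Because nodes differ in what they expose, a client has to cope with responses that carry participant lists alongside responses that carry only counts. The sketch below shows one way to summarise such mixed results. The NodeResult structure, its field names, and the numbers in the example are all invented for illustration and do not reflect the actual response format of the federation API.

```python
from dataclasses import dataclass, field


@dataclass
class NodeResult:
    """Hypothetical per-node answer to a cohort query."""
    node: str
    n_matching: int
    # Left empty by discovery-only nodes, which report counts but no IDs.
    participant_ids: list = field(default_factory=list)


def summarise(results):
    """Combine per-node results into a total count and a list of visible IDs."""
    total = sum(r.n_matching for r in results)
    listed = [pid for r in results for pid in r.participant_ids]
    # Nodes that matched participants but did not reveal who they are.
    count_only = [r.node for r in results if r.n_matching and not r.participant_ids]
    return {"total": total, "listed": listed, "count_only_nodes": count_only}


# Made-up numbers, for illustration only.
results = [
    NodeResult("OpenNeuro", 2, ["sub-01", "sub-02"]),
    NodeResult("Quebec Parkinson Network", 5),
]
summary = summarise(results)
print(summary)
```

Keeping the count-only nodes in a separate field lets a user see that more matching participants exist and which node to contact for access.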

Companion Tool: Nipoppy

Nipoppy is a Python package, built on an extension of BIDS, for curating neuroimaging datasets and tracking their processing pipelines. It standardises how processing workflows are managed and monitored, and it produces Neurobagel-ready metadata as output. Together, Neurobagel and Nipoppy cover the path from raw data curation to federated discovery.
