DAC — Data Analysis Core

Overview

The Data Analysis Core (DAC) is the bioinformatics and statistical analysis support platform of the Paris Brain Institute (ICM), and one of ICM’s 10 core technological platforms. It provides computational expertise, analysis pipelines, training, and consultation to ICM research teams and external collaborators, bridging the gap between data production (at platforms such as iGENSEQ, CENIR, and Banque ADN et Cellules) and publication-ready scientific outputs.

The DAC is staffed by bioinformaticians, statisticians, and data scientists who are embedded in ICM research teams or work in close collaboration with them. It implements the FAIR principles (Findable, Accessible, Interoperable, Reusable) in ICM data management workflows and serves as the primary interface between ICM and the national infrastructure providers IFB and OPIDoR.

Divisions and Services

The DAC is organised into four divisions that serve both internal ICM teams and external biomedical research projects on a fee-for-service basis.

Omics Analysis

Processing and visualisation of high-throughput sequencing (NGS) data from iGENSEQ:

  • Genomics covers gene panels, whole-exome sequencing (WES), and whole-genome sequencing (WGS).
  • Transcriptomics covers bulk RNA-seq, non-coding RNA, small RNA, and splice variants.
  • Single-cell RNA-seq and spatial transcriptomics are supported as primary workflows.
  • Epigenomics covers DNA methylation, ATAC-seq, and ChIP-seq.
  • Long-read sequencing using Oxford Nanopore (ONT) is available for selected applications.
  • Pipelines are built with Snakemake and Conda for reproducibility and portability.
  • DEJAVU is an ICM-internal tool aggregating exome and genome short variants from approximately 3,000 ICM project samples.
  • QUBY is a web interface that lets researchers explore for themselves the RNA-seq, scRNA-seq, and WES data analysed by the DAC.
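
How DEJAVU works internally is not described here. As a rough illustration of what aggregating short variants across project samples can involve, the Python sketch below counts how many samples carry each variant and derives a within-cohort frequency. The variant key (chromosome, position, reference allele, alternate allele), the function names, and the example data are all hypothetical and are not taken from DEJAVU itself.

```python
from collections import defaultdict

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = tuple[str, int, str, str]


def aggregate_variants(calls_by_sample: dict[str, set[Variant]]) -> dict[Variant, set[str]]:
    """Map each variant to the set of samples in which it was called."""
    carriers: dict[Variant, set[str]] = defaultdict(set)
    for sample, variants in calls_by_sample.items():
        for variant in variants:
            carriers[variant].add(sample)
    return dict(carriers)


def cohort_frequency(carriers: dict[Variant, set[str]], n_samples: int) -> dict[Variant, float]:
    """Fraction of samples carrying each variant (not an allele frequency)."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return {variant: len(samples) / n_samples for variant, samples in carriers.items()}


# Made-up example data: three samples, two distinct variants.
example_calls: dict[str, set[Variant]] = {
    "sample_A": {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")},
    "sample_B": {("chr1", 100, "A", "G")},
    "sample_C": set(),
}

example_carriers = aggregate_variants(example_calls)
example_freq = cohort_frequency(example_carriers, len(example_calls))
```

A within-cohort carrier fraction of this kind is commonly used to flag variants that recur across unrelated projects, which often points to a technical artefact or a common polymorphism and not a disease-relevant finding.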

Biostatistics

The Biostatistics division provides expert statistical support from study design through to data interpretation, including a responsive helpdesk for researchers at any stage of their project. It covers advanced statistical methods for high-dimensional and multimodal data across omics, electrophysiology, cell imaging, neuroimaging, and clinical domains.

Data Management

The Data Management division handles FAIR data management through databases and secure web interfaces aligned with shared data models. It develops and implements ICM’s data management policy and produces Data Management Plans (DMPs) in collaboration with the CDO, the legal, data protection, and integrity officers, and the innovation office.

Training and Outreach

The Training division runs a regular programme spanning bioinformatics, biostatistics, and research data management (RDM). Tools covered include Git, OMERO+, REDCap, Tumorotek, and OPIDoR. Data Management Plan (DMP) writing sessions are mandatory for ANR and Horizon Europe projects. A statistics helpdesk runs on the 1st and 3rd Friday of each month, and custom sessions for research teams are available on request. The recent catalogue (Jan–Mar 2026) included survival analysis, MRI segmentation, single-cell workshops, REDCap training, and spatial transcriptomics focus groups.

Computing Infrastructure

DAC activities run on high-performance computing equipment managed by ICM IT, including an Illumina DRAGEN Bio-IT Platform, an 800-core compute cluster, high-performance storage and database servers, and application servers.

Open Science Role

The DAC is ICM’s primary interface with the French and European open science infrastructure. It is a member of IFB and participates in the IFB-managed MUDIS4LS initiative, as well as in the GT-GeDeM, INCF, and GFRN working groups. DMP training and support are provided using OPIDoR tools. The Data Management division implements FAIR data management across ICM through databases, data models, and policy development. Omics pipelines use Snakemake, Conda, and open-source tools throughout.

Connections

Resources