FASTQ — Raw Sequencing Read Format

Overview

FASTQ is the standard format for storing raw sequencing reads as delivered by next-generation sequencing instruments and their base-calling software. Described by Cock et al. (2010, Nucleic Acids Research), it stores base calls and their associated Phred-scaled quality scores as plain text, four lines per read: a header line beginning with "@", the sequence, a separator line beginning with "+", and a quality string with one character per base. FASTQ has no formal governance body, yet it is the de facto starting point of most genomics pipelines. In practice, files are almost always gzip-compressed.
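
The four-line layout and the Phred scale described above can be illustrated with a short Python sketch. It is not a reference parser: it assumes strictly four lines per record (no wrapped sequences) and the Phred+33 ASCII offset used by Sanger-style and recent Illumina files. The names read_fastq, open_fastq, FastqRecord, and error_probability are hypothetical helpers introduced only for this example.

```python
import gzip
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str             # header text without the leading "@"
    sequence: str         # base calls
    qualities: List[int]  # one Phred score per base


def read_fastq(handle: TextIO, offset: int = 33) -> Iterator[FastqRecord]:
    """Yield records from a text handle, assuming four lines per read.

    The offset of 33 (Phred+33) is an assumption; older Illumina files
    used 64.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline()
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header.strip())
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        scores = [ord(ch) - offset for ch in quality]
        yield FastqRecord(header[1:].strip(), sequence, scores)


def error_probability(q: int) -> float:
    """Convert a Phred score Q to an error probability, p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def open_fastq(path: str) -> TextIO:
    """Open a FASTQ file as text, transparently handling .gz files."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


if __name__ == "__main__":
    # A single made-up read, used only to exercise the parser.
    demo = io.StringIO("@read1\nACGT\n+\nII5!\n")
    for record in read_fastq(demo):
        print(record.name, record.sequence, record.qualities)
```

For real data, a maintained parser such as the one in Biopython is usually a safer choice than a hand-written reader; the sketch is only meant to make the record structure concrete.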

Position in the Genomics Pipeline

FASTQ is the upstream input to alignment, which produces SAM-BAM-CRAM files. The aligned reads are then processed for variant calling (producing VCF) or for expression quantification (producing count matrices, stored in AnnData for single-cell data).

Connections

relatedTo: SAM-BAM-CRAM (after alignment)

Resources

https://doi.org/10.1093/nar/gkp1137 (Cock et al. 2010, Nucleic Acids Research)
https://www.ebi.ac.uk/ena (ENA)
https://www.ncbi.nlm.nih.gov/sra (NCBI SRA)
