This page describes the data levels, metadata attributes, and file structure for bulk RNA sequencing.
Bulk RNA sequencing measures the average gene expression profile across all cells in a biological sample.
The metadata attributes reuse existing common data elements from the Genomic Data Commons (GDC). The HTAN data model currently supports Level 1, 2, and 3 RNA sequencing data:
Level Number | Definition | Example Data |
---|---|---|
1 | Unaligned reads | FASTQ |
2 | Aligned reads | BAM |
3 | Gene-level expression, unnormalized | Gene & isoform expression-level data (.csv) |
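
As an illustration of how the levels in the table relate to files, the following minimal sketch guesses a data level from a file name. It is not part of the HTAN data model or any HTAN tool: the names `LEVEL_BY_SUFFIX` and `infer_level` are hypothetical, and the handling of `.fq` and gzip-compressed (`.gz`) files is an assumption that goes beyond the example formats listed above.

```python
from pathlib import Path
from typing import Optional

# Hypothetical lookup derived from the table above (FASTQ, BAM, .csv).
# The ".fq" alias is an assumption, not something the data model states.
LEVEL_BY_SUFFIX = {
    ".fastq": 1,  # Level 1: unaligned reads
    ".fq": 1,     # assumed alias for FASTQ
    ".bam": 2,    # Level 2: aligned reads
    ".csv": 3,    # Level 3: gene-level expression, unnormalized
}


def infer_level(filename: str) -> Optional[int]:
    """Return the likely data level for a file name, or None if unrecognized."""
    suffixes = [s.lower() for s in Path(filename).suffixes]
    # Assumption: a trailing ".gz" only indicates compression, so skip it.
    if suffixes and suffixes[-1] == ".gz":
        suffixes = suffixes[:-1]
    if not suffixes:
        return None
    return LEVEL_BY_SUFFIX.get(suffixes[-1])


if __name__ == "__main__":
    for name in ["sample_R1.fastq.gz", "sample.bam", "expression.csv", "notes.txt"]:
        print(name, "->", infer_level(name))
```

A file extension is only a hint. The level recorded in a file's metadata remains the authoritative value, so a check like this is best suited to catching obvious mismatches before submission.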