HTAN Biospecimen Data

The HTAN biospecimen data model is designed to capture essential biospecimen data elements, including:

Acquisition method, e.g. autopsy, biopsy, fine needle aspirate, etc.
Topography Code, indicating site within the body, e.g. based on ICD-O-3.
Collection information, e.g. time, duration of ischemia, temperature, etc.
Processing information for the parent biospecimen, e.g. fresh, frozen, etc.
Biospecimen and derivative clinical metadata, i.e. Histologic Morphology Code, e.g. based on ICD-O-3.
Coordinates of derivative biospecimens relative to their parent biospecimen.
Processing of derivative biospecimens for downstream analysis, e.g. dissociation, sectioning, analyte isolation, etc.

HTAN biospecimen metadata leverages existing common data elements from four sources:

Attributes

WARNING: Manifests provided on this page are for reference only. DO NOT USE THESE MANIFESTS FOR DATA SUBMISSION.

Directions

The interactive tables below are provided to help users understand the HTAN Data Model. The tables allow a user to view, search or download attributes either:

in a specific manifest; or
in all manifests represented on this page.

To view a specific manifest, click on the link in the Manifests tab. The manifest will appear in a new tab on the page. Navigate to the new tab to search for attributes or download the manifest.
To search for attributes among all manifests, navigate to the All Attributes tab and use the search box provided at the top of the tab. All attributes can also be downloaded as a csv file.
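
As a rough sketch of the same keyword search done offline, the snippet below filters a downloaded attributes file. The column names mirror the table headers on this page, but the actual header row of the downloaded CSV is an assumption, as are the function name `search_attributes` and the sample rows.

```python
import csv
import io


def search_attributes(csv_text, keyword):
    """Return rows whose Attribute or Description contains keyword (case-insensitive)."""
    needle = keyword.lower()
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if needle in (row.get("Attribute") or "").lower()
        or needle in (row.get("Description") or "").lower()
    ]


# Hypothetical excerpt standing in for the downloaded file.
SAMPLE = (
    "Attribute,Manifest Name,Description,Required\n"
    "Fixative Type,Biospecimen,Type of fixative used to preserve a tissue specimen,True\n"
    "Storage Method,Biospecimen,How a biomaterial was stored,True\n"
)

matches = search_attributes(SAMPLE, "fixative")
print([row["Attribute"] for row in matches])
```

For a real file, the text could be read with `open(path, newline="").read()` and passed in unchanged.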

Manifests
All Attributes

Manifest

Description

Biospecimen

HTAN biological entity; this can be tissue, blood, analyte, or subsamples of those.

Attribute

Manifest Name

Description

Required

Conditional If

Data Type

Valid Values

HTAN Biospecimen ID

- Biospecimen

HTAN ID associated with a biosample based on HTAN ID SOP (e.g. HTANx_yyy_zzz)

True

String

Source HTAN Biospecimen ID

- Biospecimen

This is the HTAN ID that may have been assigned to the biospecimen at the site of biospecimen origin (e.g. BU).

False

String

HTAN Parent ID

- Biospecimen

HTAN ID of parent from which the biospecimen was obtained. Parent could be another biospecimen or a research participant.

True

String

Timepoint Label

- Biospecimen

Label to identify the time point at which the clinical data or biospecimen was obtained (e.g. Baseline, End of Treatment, Overall survival, Final). NO PHI/PII INFORMATION IS ALLOWED.

True

String

Collection Days from Index

- Biospecimen

Number of days from the research participant's index date that the biospecimen was obtained. If not applicable, enter 'Not Applicable'.

True

String
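
Because this attribute is typed as String but holds either a day count or 'Not Applicable', a consumer has to parse it. The following is a minimal sketch, assuming day counts are whole numbers (possibly negative for collection before the index date) and that the sentinel is matched case-insensitively; the function name `parse_days_from_index` is hypothetical.

```python
def parse_days_from_index(value):
    """Return the day count as an int, or None for 'Not Applicable'.

    Raises ValueError for any other text.
    """
    text = value.strip()
    if text.lower() == "not applicable":
        return None
    return int(text)  # int() raises ValueError on non-integer text


print(parse_days_from_index("120"))
print(parse_days_from_index("Not Applicable"))
```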

Adjacent Biospecimen IDs

- Biospecimen

List of HTAN Identifiers (separated by commas) of adjacent biospecimens cut from the same sample; for example HTA3_3000_3, HTA3_3000_4, ...

False

String
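
A comma-separated identifier list like the one described above can be split as sketched here. The helper name `split_id_list` is hypothetical, and trimming whitespace and dropping empty entries are assumptions about how lenient a reader should be.

```python
def split_id_list(value):
    """Split a comma-separated ID string into a list of trimmed, non-empty IDs."""
    return [part.strip() for part in value.split(",") if part.strip()]


print(split_id_list("HTA3_3000_3, HTA3_3000_4"))
# prints ['HTA3_3000_3', 'HTA3_3000_4']
```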

Biospecimen Type

- Biospecimen

Biospecimen Type

True

String

- tissue biospecimen type

- blood biospecimen type

- analyte biospecimen type

- mouth rinse biospecimen type

- stool biospecimen type

- urine biospecimen type

- ascites biospecimen type

- sputum biospecimen type

- fluids biospecimen type

- bone marrow biospecimen type

- cells biospecimen type
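
Attributes with a Valid Values list can be checked with a simple set lookup. The sketch below copies the Biospecimen Type values listed above; ignoring case and surrounding whitespace is an assumption, and an actual validator may be stricter. The names `BIOSPECIMEN_TYPES` and `is_valid_biospecimen_type` are hypothetical.

```python
# Valid values copied from the Biospecimen Type entry on this page.
BIOSPECIMEN_TYPES = {
    "tissue biospecimen type",
    "blood biospecimen type",
    "analyte biospecimen type",
    "mouth rinse biospecimen type",
    "stool biospecimen type",
    "urine biospecimen type",
    "ascites biospecimen type",
    "sputum biospecimen type",
    "fluids biospecimen type",
    "bone marrow biospecimen type",
    "cells biospecimen type",
}


def is_valid_biospecimen_type(value):
    """Check a value against the listed valid values, ignoring case and outer spaces."""
    return value.strip().lower() in BIOSPECIMEN_TYPES


print(is_valid_biospecimen_type("Tissue Biospecimen Type"))
print(is_valid_biospecimen_type("saliva biospecimen type"))
```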

Acquisition Method Type

- Biospecimen

Records the method of acquisition or source for the specimen under consideration.

True

String

- autopsy

- biopsy

- fine needle aspirate

- surgical resection

- punch biopsy

- shave biopsy

- excision

- re-excision

- sentinel node biopsy

- lymphadenectomy - regional nodes

- other acquisition method

- non induced sputum

- induced sputum

- bal (bronchial alveolar lavage)

- cytobrush

- blood draw

- fluid collection

- forceps biopsy

- core needle biopsy

- endoscopic biopsy

- not specified

Fixative Type

- Biospecimen

Text term to identify the type of fixative used to preserve a tissue specimen

True

String

- acetone

- alcohol

- formalin

- glutaraldehyde

- oct media

- rnalater

- saline

- 95% ethanol

- dimidoester

- carbodiimide

- dimethylacetamide

- para-benzoquinone

- paxgene tissue

- tcl lysis buffer

- np40 lysis buffer

- methacarn

- cryo-store

- carnoy's fixative

- polaxamer

- other

- none

- unknown

- unfixed

Storage Method

- Biospecimen

The method by which a biomaterial was stored after preservation or before another protocol was used.

True

String

- ambient temperature

- cut slide

- fresh

- frozen at -20c

- frozen at -70c

- frozen at -80c

- frozen at -150c

- frozen in liquid nitrogen

- frozen in vapor phase

- paraffin block

- rnalater at 4c

- rnalater at 25c

- rnalater at -20c

- refrigerated at 4 degrees

- refrigerated vacuum chamber

- 4c in vacuum chamber

- desiccant at 4c

- not applicable

- unknown

Processing Days from Index

- Biospecimen

Number of days from the research participant's index date that the biospecimen was processed. If not applicable, enter 'Not Applicable'.

True

String

Protocol Link

- Biospecimen

Protocols.io ID or DOI link to a free/open protocol resource describing in detail the assay protocol (e.g. surface markers used in Smart-seq, dissociation duration, lot/batch numbers for key reagents such as primers, sequencing reagent kits, etc.) or the protocol by which the sample was obtained or generated.

True

String

Site Data Source

- Biospecimen

Text to identify the data source for the specimen/sample from within the HTAN center, if applicable. Any identifier used within the center to identify data sources. No PHI/PII is allowed.

False

String

Collection Media

- Biospecimen

Material into which the specimen is collected after the procedure

False

String

- dmem

- dmem+serum

- rpmi

- rpmi+serum

- pbs

- pbs+serum

- none

Mounting Medium

- Biospecimen

The solution in which the specimen is embedded, generally under a cover glass. It may be liquid, gum or resinous, soluble in water, alcohol or other solvents and be sealed from the external atmosphere by non-soluble ringing media

False

String

- aqueous water based

- non-aqueous solvent based

- xylene

- toluene

- antifade with dapi

- antifade without dapi

- pbs

- unknown

- not reported

Processing Location

- Biospecimen

Site within an HTAN center where specimen processing occurs, if applicable. Any identifier used within the center to identify processing location. No PHI/PII is allowed.

False

String

Histology Assessment By

- Biospecimen

Text term describing who (in what role) made the histological assessments of the sample

False

String

- pathologist

- research scientist

- other

- unknown

Histology Assessment Medium

- Biospecimen

The method of assessment used to characterize histology

False

String

- digital

- microscopy

- other

- unknown

Preinvasive Morphology

- Biospecimen

Histologic Morphology not included in ICD-O-3 morphology codes, for preinvasive lesions included in the HTAN

False

String

- melanocytic hyperplasia

- atypical melanocytic proliferation

- melanoma in situ - superficial spreading

- melanoma in situ - lentigo maligna type

- melanoma in situ - acral-lentiginous

- melanoma in situ - arising in a giant congenital nevus

- persistent melanoma in situ

- melanoma in situ - not otherwise classified

- scar - no residual melanoma

- invasive melanoma - superficial spreading

- invasive melanoma - nodular type

- invasive melanoma - lentigo maligna

- invasive melanoma - acral lentiginous

- invasive melanoma - desmoplastic

- invasive melanoma - nevoid

- invasive melanoma - other

- normal wda

- reserve cell hyperplasia

- squamous metaplasia - mature

- squamous metaplasia - immature

- mild dysplasia

- moderate dysplasia

- severe dysplasia

- squamous carcinoma in situ

- atypical adenomatous hyperplasia

- adenocarcinoma in situ - non mucinous

- adenocarcinoma in situ - mucinous

- benign tumor nos

- hamartoma

- carcinoma nos

- no diagnosis possible

Tumor Infiltrating Lymphocytes

- Biospecimen

Measure of Tumor-Infiltrating Lymphocytes [Number]

False

String

Degree of Dysplasia

- Biospecimen

Information related to the presence of cells that look abnormal under a microscope but are not cancer. Records the degree of dysplasia for the cyst or lesion under consideration.

False

String

- normal or basal cell hyperplasia or metaplasia

- mild dysplasia

- moderate dysplasia

- severe dysplasia

- carcinoma in situ

- unknown

Dysplasia Fraction

- Biospecimen

Resulting value to represent the number of pieces of dysplasia divided by the total number of pieces. [Text: max length 5]

False

String

Number Proliferating Cells

- Biospecimen

Numeric value that represents the count of proliferating cells determined during pathologic review of the sample slide(s).

False

String

Percent Eosinophil Infiltration

- Biospecimen

Numeric value to represent the percentage of infiltration by eosinophils in a tumor sample or specimen.

False

String

Percent Granulocyte Infiltration

- Biospecimen

Numeric value to represent the percentage of infiltration by granulocytes in a tumor sample or specimen.

False

String

Percent Inflam Infiltration

- Biospecimen

Numeric value to represent local response to cellular injury, marked by capillary dilatation, edema and leukocyte infiltration; clinically, inflammation is manifest by redness, heat, pain, swelling and loss of function, with the need to heal damaged tissue.

False

String

Percent Lymphocyte Infiltration

- Biospecimen

Numeric value to represent the percentage of infiltration by lymphocytes in a solid tissue normal sample or specimen.

False

String

Percent Monocyte Infiltration

- Biospecimen

Numeric value to represent the percentage of monocyte infiltration in a sample or specimen.

False

String

Percent Necrosis

- Biospecimen

Numeric value to represent the percentage of cell death in a malignant tumor sample or specimen.

False

String

Percent Neutrophil Infiltration

- Biospecimen

Numeric value to represent the percentage of infiltration by neutrophils in a tumor sample or specimen.

False

String

Percent Normal Cells

- Biospecimen

Numeric value to represent the percentage of normal cell content in a malignant tumor sample or specimen.

False

String

Percent Stromal Cells

- Biospecimen

Numeric value to represent the percentage of reactive cells that are present in a malignant tumor sample or specimen but are not malignant such as fibroblasts, vascular structures, etc.

False

String

Percent Tumor Cells

- Biospecimen

Numeric value that represents the percentage of infiltration by tumor cells in a sample.

False

String

Percent Tumor Nuclei

- Biospecimen

Numeric value to represent the percentage of tumor nuclei in a malignant neoplasm sample or specimen.

False

String

Fiducial Marker

- Biospecimen

Imaging specific: fiducial markers for the alignment of images taken across multiple rounds of imaging.

False

String

- nuclear stain - dapi

- fluorescent beads

- grid slides - hemocytometer

- adhesive markers

- other

- unknown

- not reported

Slicing Method

- Biospecimen

Imaging specific: the method by which the tissue was sliced.

False

String

- vibratome

- cryosectioning

- tissue molds

- sliding microtome

- sectioning

- other

- unknown

- not reported

Lysis Buffer

- Biospecimen

scRNA-seq specific: Type of lysis buffer used

False

String

Method of Nucleic Acid Isolation

- Biospecimen

Bulk RNA & DNA-seq specific: method used for nucleic acid isolation. E.g. Qiagen Allprep, Qiagen miRNAeasy. [Text - max length 100]

False

String

HTAN Parent Biospecimen ID

- Biospecimen

HTAN Biospecimen Identifier (e.g. HTANx_yyy_zzz) indicating the biospecimen(s) from which these files were derived; multiple parent biospecimens should be comma-separated

True

- Is lowest level is "Yes - Is lowest level"

String

Ischemic Time

- Biospecimen

Duration of time, in seconds, between when the specimen stopped receiving oxygen and when it was preserved or processed. Integer value.

False

- Biospecimen is "Bone"

- Biospecimen is "Analyte"

- Biospecimen is "Tissue"

- Biospecimen is "Urine"

String

Ischemic Temperature

- Biospecimen

Specify whether specimen experienced warm or cold ischemia.

False

- Biospecimen is "Bone"

- Biospecimen is "Analyte"

- Biospecimen is "Tissue"

- Biospecimen is "Urine"

String

- warm ischemia

- cold ischemia

- ambient air

- 4c wet ice

- negative -20c

- dry ice

- liquid nitrogen

- unknown

Histologic Morphology Code

- Biospecimen

The microscopic anatomy of normal and abnormal cells and tissues of the specimen as captured in the morphology codes of the International Classification of Diseases for Oncology, 3rd Edition (ICD-O-3). Example - 8010/0

True

- Biospecimen is "Bone"

- Biospecimen is "Tissue"

- Biospecimen is "Urine"

String
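
The example value 8010/0 suggests a simple shape check for morphology codes: four histology digits, a slash, and one behavior digit. The sketch below tests only that shape and does not confirm that a code exists in ICD-O-3; the names `MORPHOLOGY_PATTERN` and `looks_like_morphology_code` are hypothetical.

```python
import re

# Four-digit histology, a slash, one behavior digit, as in the example 8010/0.
MORPHOLOGY_PATTERN = re.compile(r"\d{4}/\d")


def looks_like_morphology_code(value):
    """Return True if value has the NNNN/N shape of a morphology code."""
    return MORPHOLOGY_PATTERN.fullmatch(value.strip()) is not None


print(looks_like_morphology_code("8010/0"))
print(looks_like_morphology_code("8010"))
```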

Preservation Method

- Biospecimen

Text term that represents the method used to preserve the sample.

True

- Biospecimen is "Bone"

- Biospecimen is "Tissue"

- Biospecimen is "Urine"

String

- cryopreserved

- cryopreservation in liquid nitrogen - dead tissue

- cryopreservation in dry ice - dead tissue

- cryopreservation in liquid nitrogen - live cells

- formalin fixed paraffin embedded - ffpe

- formalin fixed-unbuffered

- formalin fixed-buffered

- fresh

- fresh dissociated and single cell sorted into plates in np40 buffer

- oct

- snap frozen

- frozen

- negative 80 deg c

- liquid nitrogen

- fresh dissociated

- fresh dissociated and single cell sorted

- fresh dissociated and single cell sorted into plates

- methacarn fixed paraffin embedded - mfpe

- unknown

- not reported

Analyte Type

- Biospecimen

The kind of molecular specimen analyte: a molecular derivative (i.e. RNA / DNA / Protein Lysate) obtained from a specimen

True

- Biospecimen is "Analyte"

String

- cfdna analyte

- dna analyte

- rna analyte

- total rna analyte

- tissue block analyte

- tissue section analyte

- pbmcs or plasma or serum analyte

- cdna libraries analyte

- pbmcs

- plasma

- serum analyte

- lipid

- protein

- metabolite
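
The Conditional If column can be read as "this attribute must be filled in when the biospecimen is of the named kind". The following is a minimal sketch of that reading, not the validator HTAN uses: the rule table copies three entries from this page, while the way a row exposes its biospecimen kind (the `biospecimen_kind` argument) and the function name are assumptions.

```python
# Trigger values copied from the "Conditional If" column on this page.
CONDITIONAL_RULES = {
    "Analyte Type": {"Analyte"},
    "Preservation Method": {"Bone", "Tissue", "Urine"},
    "Histologic Morphology Code": {"Bone", "Tissue", "Urine"},
}


def missing_conditional_fields(biospecimen_kind, row):
    """List attributes whose condition is triggered but whose value is empty."""
    return [
        attribute
        for attribute, triggers in CONDITIONAL_RULES.items()
        if biospecimen_kind in triggers and not (row.get(attribute) or "").strip()
    ]


# Hypothetical row: a tissue specimen with no morphology code filled in.
row = {"Preservation Method": "fresh", "Histologic Morphology Code": ""}
print(missing_conditional_fields("Tissue", row))
```

Keeping the rules in a plain dictionary makes it straightforward to add the remaining conditional attributes without touching the check itself.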

Tissue Biospecimen Type

- Biospecimen

Tissue biospecimen

False

String

Blood Biospecimen Type

- Biospecimen

Blood biospecimen

False

String

Analyte Biospecimen Type

- Biospecimen

A molecular derivative (i.e. RNA / DNA / Protein Lysate) obtained from a specimen

False

String

Urine Biospecimen Type

- Biospecimen

Urine biospecimen

False

String

Bone Marrow Biospecimen Type

- Biospecimen

Bone Marrow biospecimen

False

String

Fixation Duration

- Biospecimen

The length of time, from beginning to end, required to process or preserve biospecimens in fixative (measured in minutes)

True

- Biospecimen is "Analyte"

String