HTAN Data Coordinating Center


Data Coordinating Center

The HTAN Data Coordinating Center (DCC) proudly supports each of the HTAN atlas teams and pilot programs as well as the broader Network by coordinating Network activities; providing centralized resources for data and resource storage and access within HTAN as well as dissemination to the wider scientific community; developing powerful data analysis and visualization tools to enable researchers to make novel discoveries using HTAN data; and conducting outreach to the community. Within HTAN, the DCC also leads various efforts to develop an extensible data model as well as clinical data and metadata standards that will ensure the accessibility and interoperability of HTAN data with the wider cancer research data ecosystem.

Composed of individuals from four institutions, the DCC team brings to HTAN extensive experience participating in major research consortia, including TCGA, AACR Project GENIE, and the Cancer Systems Biology Consortium, as well as leading open-source software development projects such as cBioPortal for Cancer Genomics, Synapse, and the ISB Cancer Genomics Cloud (ISB-CGC), and community engagement projects such as DREAM Challenges. The DCC team draws upon this collective experience and deep technical expertise in biomedical science and data science to support the goals of HTAN and to promote a collaborative research and discovery environment within HTAN and the wider cancer research community.

Principal Investigators


Dr. Ethan Cerami is Director of the Knowledge Systems Group and Principal Scientist in the Department of Data Sciences at Dana-Farber Cancer Institute (DFCI). He has an M.S. in Computer Science from New York University and a Ph.D. in Computational Biology from Cornell University. Prior to joining Dana-Farber, Dr. Cerami was Director of Computational Biology at Blueprint Medicines and Director of Cancer Informatics Development at Memorial Sloan Kettering Cancer Center (MSKCC). While at MSKCC, Dr. Cerami co-founded the cBioPortal for Cancer Genomics. His current group at DFCI remains active in its continued development and is also a central contributor to other major consortium efforts such as the NCI-funded Cancer Immunologic Data Commons and AACR Project GENIE.


Dr. Eddy is the Director of Architecture & Operations within the Data & Tooling group at Sage Bionetworks. He is a computational biologist and cancer researcher with experience in developing and managing high-throughput molecular databases, bioinformatics pipelines, and analytical workflows. He and his team serve as key technical contributors on the data coordinating center for several large consortia, including the NCI Cancer Systems Biology Consortium (CSBC) and the INvestigation of Co-occurring conditions across the Lifespan to Understand Down syndromE (INCLUDE) program. He is also PI for multiple open science tool/infrastructure development projects, including CRI iAtlas and an NCI-funded Informatics Technology for Cancer Research (ITCR) grant for advancing method benchmarking and data sharing through crowd-sourced competitions. Dr. Eddy is well versed in best practices for reproducible data science in biomedical research. He has worked closely with the Global Alliance for Genomics & Health (GA4GH) to develop standards for data sharing and analysis.


Dr. Niki Schultz is Associate Attending in the Computational Oncology Service of the Department of Epidemiology and Biostatistics and Affiliate Member of the Human Oncology and Pathogenesis Program at Memorial Sloan Kettering Cancer Center (MSKCC). As head of the Knowledge Systems Group in the Marie-Josée and Henry R. Kravis Center for Molecular Oncology, he leads development of the cBioPortal for Cancer Genomics, a web-based resource for analysis of complex cancer genomics data, and of OncoKB, a precision oncology knowledge base. His research focuses on identifying the genomic alterations that underlie cancer, their mechanisms of action, and novel therapeutic approaches. Dr. Schultz has made significant contributions to several projects of TCGA and AACR Project GENIE, and he is an investigator in the Stand Up to Cancer Prostate Cancer Dream Team. He has a particular interest in enabling discoveries by developing novel computational methods and databases that help bridge the divide between computer scientists on one side and clinicians and researchers on the other.


Dr. Vésteinn Thorsson is a Senior Research Scientist at the Institute for Systems Biology. His research encompasses cancer genomics and immuno-oncology, and he has extensive experience in data analysis and data coordination in collaborative cancer genomics projects. As part of The Cancer Genome Atlas (TCGA) Research Network, Dr. Thorsson contributed substantially to published studies on gastrointestinal tumors, including serving as both Data and Analysis Coordinator and playing a key role in determining gastric molecular subtypes. Dr. Thorsson also served as Co-Chair of a working group that recently completed a comprehensive analysis of all TCGA gastrointestinal tumor samples and of a working group dedicated to characterizing immune response in the more than 10,000 TCGA tumor samples. In addition, he serves as a project lead for the CRI iAtlas project, an interactive web resource for immuno-oncology research.