centerforcancergenomics genomic data commons...txt (cnv, methylation), maf & vcf (mutations)...
TRANSCRIPT
Center for Cancer Genomics |Genomic Data Commons portal.gdc.cancer.gov | @NCIGDC_Updates | [email protected]
The Genomic Data Commons (GDC) provides the cancer research community with a unified data repository that enables data sharing across cancer genomic studies in support of precision medicine. Access Data & Tools • Harmonized genomic data uniformly processedand aligned to latest reference genome.• Mutation calls, expression levels, and otherhigh-level data generated via best-in-class pipelines. • Web-based tools to search, download and analyze data from over 33,000 cancer cases. • Visualize biological & clinical relationships inreal-time, then download publication-ready figures.• Consume data efficiently with GDC’s API or DataTransfer Tool
Projects Available Clinical Proteomic Tumor Analysis Consortium (CPTAC), Foundation Medicine (FMI), The Cancer Genome Atlas (TCGA), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and other cancer studies
Examples of Supported Data Types
Data type File Format
Clinical & Biospecimen TSV, XML, JSON
Sequencing (e.g., WGS, WXS, RNA)
BAM, FASTQ
Array (e.g., SNP, Methylation)
TXT, IDAT
High-level Data TSV (expression, splice junctions), TXT (CNV, methylation), MAF & VCF (mutations)
Analyze with DAVE Data Analysis, Visualization & Exploration (DAVE) tools are web-based and open-access, helping make genomic data accessible for anyone.
Mut CNV
OncoGrid Visualize
combinations of gene mutations &
copy number variants for a project or custom cohort
Gen
esCl
inic
al
Alteration Frequencies
mutation copy number
Cases
Survival Analysis
Compare overall survival of any two
cohorts, such as patients with &
without a mutated gene of interest
U.S. Department of Health & Human Services | National Institutes of Health
Analyze with DAVE
Protein Viewer
Visualize gene mutations mapped
to their protein functional domains
Visualize Frequent
Alterations View the most
frequently mutated genes
for any cohort
30
% o
f Cas
es A
ffect
ed
Distribution of Most Frequently Mutated Genes
0
10
20
TP53
PIK3CA
CDH1
GATA3
KMT2C
MAP3K1
PTEN
NCOR1NF1
RUNX1
ARID1A
PIK3R1
MAP2K4
RNF213
SPEN
KMT2D
AKAP9FA
T1MYH9
TBX3
Plot frequencies of cases with
mutations and copy number variants for a selected gene
80
% o
f Cas
es A
ffect
ed
TP53 Cancer Distribution
0
20
40
60
TCGA-UCS
TCGA-OV
CGA-LUSC
GA-ESCA
GA-READ
GA-HNSC
GA-PAAD
GA-COAD
GA-LUAD
GA-BLCA
CGA-STAD
CGA-LGG
GA-UCEC
GA-SARC
GA-BRCA
CGA-KICH
CGA-LIHC
CGA-GBM
CGA-ACC
GA-MESO
T TC TC TC TC TC TC TC T T TC TC TC T T T T TC
ESC
GA-C
TC
% o
f Cas
es A
ffect
ed
TP53 CNV Distribution
TCGA-SARC
TCGA-PRAD
TCGA-UCS
TCGA-OV
TCGA-LIHC
TCGA-DLBC
TCGA-CHOL
TCGA-UCEC
TCGA-ACC
TCGA-BRCA
TCGA-BLCA
TCGA-LUAD
TCGA-ESCA
TCGA-GBM
TCGA-STAD
TCGA-HNSC
TCGA-LUSC
TCGA-MESO
TCGA-SKCM
0
10
20
30
AnalyzeCustom Sets
Create custom sets of cases, genes,
or mutations Compare survival and
clinical features of case sets
portal.gdc.cancer.gov | @NCIGDC_Updates | [email protected]