Session 3: Genetics and Genomics Magnuson (not available), Mieczkowski, Jones, Berg, Jeck, Rathmell May 25, 2011


Page 1: Session 3: Genetics and Genomics - UNC Linebergercancer.unc.edu/lcccnewsletter/genetics-and-genomics.pdf · Chemistry v3 enabled capability (current): up to 500-600Gb per run Capability



High Throughput Sequencing Facility at UNC

Piotr Mieczkowski Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill May 25th 2011


Architecture 2011 (Carolina Crossing)

[Diagram: facility layout with servers, a nitrogen gas supply, a UPS, and the PacBio RS]


HiSeq 2000

SHORT READS PLATFORM at UNC

Initial capability: up to 200 Gb per run (8 days).
Chemistry v3 enabled capability (current): up to 500-600 Gb per run.
Capability after the system upgrade scheduled for the end of 2011: 1 Tb per run.

Cost of resequencing one human genome (30x coverage):
• Current: about $6,000
• End of June: $4,000

One run (11 days):
• 5 human genomes
• 48 DNA captures (all exomes)
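The throughput and coverage figures above can be related with a quick calculation. The sketch below assumes a haploid human genome of roughly 3.1 Gb (a value not stated on the slide) and ignores duplicates, unmapped reads, and any other overhead; the constant and function names are illustrative only.

```python
# Rough estimate of how many 30x human genomes fit in one sequencing run.
# HUMAN_GENOME_GB is an assumption (about 3.1 Gb haploid), not a figure from the slide.
HUMAN_GENOME_GB = 3.1


def gb_needed(coverage: float, genome_gb: float = HUMAN_GENOME_GB) -> float:
    """Raw sequence (in Gb) needed to reach the requested mean coverage."""
    return coverage * genome_gb


def genomes_per_run(run_yield_gb: float, coverage: float = 30.0) -> int:
    """Whole genomes that fit in one run at the given coverage, rounded down."""
    return int(run_yield_gb // gb_needed(coverage))


if __name__ == "__main__":
    # Run yields taken from the slide: initial, current (v3), and planned upgrade.
    for run_yield in (200, 500, 600, 1000):
        print(run_yield, "Gb per run:", genomes_per_run(run_yield), "genomes at 30x")
```

Under these assumptions a 500 Gb run holds 5 genomes at 30x, which is consistent with the per-run figure quoted on the slide.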


Adapter compatibility for HiSeq sequencing system.

“Old” adapters vs “New” adapters:

Adapters for SE (Single End) applications:
• “Old” PE (Paired End)
• “Old” SE (Single End)
• “New” PE TruSeq

Adapters for PE (Paired End) applications:
• Only “New” TruSeq (note: “old” PE adapter chemistry is NOT compatible with HiSeq chemistry)

Adapters for Multiplex applications:
• “Old” PE
• “New” PE (TruSeq)
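The compatibility rules above amount to a small lookup table. The sketch below simply transcribes the slide into a dictionary; the key strings, the dictionary layout, and the function name are illustrative and are not part of any Illumina software.

```python
# Adapter compatibility on the HiSeq, transcribed from the slide.
# Keys, labels, and the helper name are hypothetical choices for this sketch.
COMPATIBLE_ADAPTERS = {
    "SE": {"old PE", "old SE", "new PE TruSeq"},
    "PE": {"new PE TruSeq"},
    "multiplex": {"old PE", "new PE TruSeq"},
}


def is_compatible(application: str, adapter: str) -> bool:
    """Return True if the adapter type is listed for the application."""
    return adapter in COMPATIBLE_ADAPTERS.get(application, set())


if __name__ == "__main__":
    print(is_compatible("SE", "old PE"))   # listed for single-end runs
    print(is_compatible("PE", "old PE"))   # not listed for paired-end runs
```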


What did 3rd Generation promise to deliver?

Single molecule resolution in real time

• Short time to result and simple workflow
– Base-call generation in <1 day
– Polymerase speed ≥1 base per second
• No amplification required
– Bias not introduced
– More uniform coverage
• Direct observation
– Distinguish heterogeneous samples
– Simultaneous kinetic measurements
• Long reads
– Identify repeats and structural variants
– Less coverage required
• Information content
– One assay, multiple applications
• Genetic variation (SVs to SNPs)
• Methylation
• Enzymology


PacBio RS 3rd generation DNA sequencing system


NEXT-GENERATION SEQUENCING (DEEP SEQUENCING) PLATFORMS

o Short reads

1. Genome Analyzer IIx (GAIIx), HiSeq2000, MiSeq – Illumina

2. SOLiD 5500xl System – Applied Biosystems

3. HeliScope™ Single Molecule Sequencer - Helicos

o Long reads

1. Genome Sequencer FLX System (454) – Roche

2. PacBio RS – Pacific Biosciences (commercial release 2011)

3. Personal Genome Machine - Ion Torrent

o Mapping sequences to the large DNA fragments

1. NABsys

2. BioNanomatrix


Clinical samples collection

Processing – DNA capture of XXX genes

Sequence data – treatment options


• 1 million reads
• Read length up to 200 bp
• Analysis software – CLC Genomics Workbench


Building a Computational Infrastructure to Support NextGen Sequencing

Corbin Jones

Biology & CCGS

Faculty Director HTSF


The Task

Generate

Process

Analyze

Interpret


HTSF Large & Busy

[Bar chart: y-axis from 0 to 6000; bars for 2008-2009 Total, 2010, 2011 Q1, and 2011 Projected; series: Addiction, TCGA, General]

~1 trillion nt/week
16-80 Tbytes per week
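The two weekly figures above imply a storage footprint per sequenced nucleotide. The sketch below works that out, assuming decimal units (1 Tbyte = 1e12 bytes), which the slide does not specify; the names are illustrative.

```python
# Storage footprint per sequenced nucleotide implied by the weekly figures.
# Assumes decimal units (1 Tbyte = 1e12 bytes); the slide does not say which it uses.
NT_PER_WEEK = 1e12
BYTES_PER_TBYTE = 1e12


def bytes_per_nt(tbytes_per_week: float, nt_per_week: float = NT_PER_WEEK) -> float:
    """Bytes stored per nucleotide sequenced."""
    return tbytes_per_week * BYTES_PER_TBYTE / nt_per_week


if __name__ == "__main__":
    low, high = bytes_per_nt(16), bytes_per_nt(80)
    print(f"{low:.0f} to {high:.0f} bytes per nucleotide")
```

That works out to 16 to 80 bytes per base, far more than a bare base call requires.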


$1,040,999

[Pie chart]
• LCCC Server Upgrades, 8%
• KURE Blades, 30%
• KURE Isilon Storage, 27%
• Data Backup, 35%

[Pie chart]
• RC, 46%
• GGTT, 54%

Sai Balu
Ruth Marinshaw


Results

Resource     New                                Current
Processors   64x8 cores, 2.93 GHz               32x8 cores, 2.6 GHz
Memory       72 GB per 8-core node              26 GB per 8-core node
Disk         195 TB Isilon, Filetek backup      32 TB
             (total of 909 TB available)

[Three bar charts comparing New (N) and Current (C): axes 0 to 80, 0 to 80, and 0 to 200]
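The per-node figures above can be rolled up into cluster totals. The sketch below reads "64x8 cores" as 64 nodes with 8 cores each, which is an assumption about the slide's notation; the class and field names are illustrative.

```python
# Aggregate capacity of the new and current clusters from the per-node figures.
# Reading "64x8 cores" as 64 nodes of 8 cores each is an assumption.
from dataclasses import dataclass


@dataclass
class Cluster:
    nodes: int
    cores_per_node: int
    mem_gb_per_node: int
    disk_tb: int

    @property
    def total_cores(self) -> int:
        return self.nodes * self.cores_per_node

    @property
    def total_mem_gb(self) -> int:
        return self.nodes * self.mem_gb_per_node


new = Cluster(nodes=64, cores_per_node=8, mem_gb_per_node=72, disk_tb=195)
current = Cluster(nodes=32, cores_per_node=8, mem_gb_per_node=26, disk_tb=32)

if __name__ == "__main__":
    print("cores:", new.total_cores, "vs", current.total_cores)
    print("memory (GB):", new.total_mem_gb, "vs", current.total_mem_gb)
    print("disk (TB):", new.disk_tb, "vs", current.disk_tb)
```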


[Workflow diagram with components: PIPE DB, BSP LIMS, Sample, Lib Prep, Flowcell, SeqWare, PIPE, Bioinformatics, Tape Archive, TCGA, and "You."; steps are marked "By hand" or "Auto"]


Analyze

SeqWare EveryWare

Generate

Process

Analyze

Interpret


SeqWare Transition Schedule

Jeff Roach & Co. ITS-RC


PIPE SeqWare Sample Submission


The Future

Generate

Process

Analyze

Interpret


Harnessing the power of genetics in whole genome analysis of hereditary cancer susceptibility

Jonathan S. Berg, MD/PhD Department of Genetics

Department of Medicine/Hematology-Oncology


Clinical Cancer Genetics at UNC

• Assessment and genetic testing for suspected hereditary cancer predisposition

• >5000 families evaluated over 15 years

• Majority of patients with breast/ovarian cancer, many with GI cancer, polyposis

• Comprehensive database of pedigrees, risk calculations, and genetic test results


Hypothesis

• Individuals highly suspicious for hereditary cancer, who test negative for known genes, carry mutations in novel cancer susceptibility genes

Plan

• Identify variants using whole genome sequencing in a subset of study participants/families

• Select candidate mutations

• Test other patients for mutations in the same gene


Analytic approaches

• Family-based
– WGS in paired affected family members
– Identification of rare, likely deleterious variants that are shared
– Segregation analysis in other family members

• Phenotype-based
– WGS in multiple unrelated individuals
– Identification of genes in which affected individuals have rare, likely deleterious variants
– Utilize pedigree information regarding inheritance pattern
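The family-based approach above can be illustrated as a set intersection followed by two filters. This is a minimal sketch, not the study's pipeline: the variant record, the population-frequency lookup, the set of deleterious predictions, and the 1% frequency cutoff are all hypothetical.

```python
# Sketch of a family-based filter: keep variants shared by two affected relatives
# that are rare in a reference population and predicted to be deleterious.
# Record layout, inputs, and the 1% cutoff are hypothetical.
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def shared_rare_deleterious(relative_a, relative_b, pop_freq, deleterious, max_freq=0.01):
    """Return variants present in both relatives, rare, and flagged deleterious."""
    shared = set(relative_a) & set(relative_b)
    return sorted(
        v for v in shared
        # Variants missing from the frequency table are treated as rare.
        if pop_freq.get(v, 0.0) <= max_freq and v in deleterious
    )


if __name__ == "__main__":
    v1 = Variant("chr1", 1000, "A", "T")
    v2 = Variant("chr2", 2000, "G", "GA")
    v3 = Variant("chr3", 3000, "C", "T")
    proband, relative = [v1, v2, v3], [v1, v2]
    freqs = {v1: 0.20, v2: 0.001}   # v1 is common, v2 is rare
    damaging = {v2, v3}             # v3 is not shared with the relative
    print(shared_rare_deleterious(proband, relative, freqs, damaging))
```

Candidates that survive this filter would then go to segregation analysis in other family members, as the slide describes.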


Enrollment

• Study team identified and triaged >100 probands

– pre-test probability of a mutation

– size of pedigree

– informative relatives available for testing

• Current enrollment:

– 77 breast/ovarian probands, 43 informative family members

– 19 polyposis probands, 3 informative family members

– 5 other cancers

• Consent for whole genome analysis and blood samples obtained


Breast/ovarian study participants are similar to known BRCA1/2 families

[Figure, two panels: BRCAPro scores (y-axis: BRCAPro, total) and age at breast cancer diagnosis (y-axis: age). Each panel compares BRCA1+, BRCA2+, and study probands (BRCA1/2 neg); p < 0.01 is annotated in each panel]


Whole genome sequencing

• Breast cancer
– 16 individuals from 8 families
– 6 proband/relative pairs, 1 trio of cousins, 1 unpaired

• Polyposis
– 8 unrelated individuals
– Mixture of simplex (AR or new mutation dominant) and AD pedigrees
– 2 members of a dominant pedigree

• 4 samples sequenced by UNC HTSF

• 22 samples sequenced by Complete Genomics (10 currently in transit)


Analysis (in progress)

• Collaboration with RENCI

– Pipeline for variant calling and annotation (UNC samples) based on Broad Institute’s GATK

– Database for storage of genomes and cross-comparisons

• 1000 genomes variant frequency data

• Protein prediction tools

• Human Gene Mutation Database


Analysis (in progress)

• Breast cancer:

– Strong candidate gene with frameshift mutation segregating in one family and possible splice site mutation segregating in another family

– Putative function in RAD50 pathway, mice heterozygous for deletion develop cancer

– Collaborating with Chuck Perou’s lab


Analysis (in progress)

• Polyposis patients:

– Strong candidate gene with heterozygous rare missense mutations in two different individuals

– Related to APC (the known cause of FAP)

– Further studies in progress


Next steps

• Analysis of incoming 10 genomes

• Follow-up candidate mutations in families

• Sequence candidate genes in unrelated individuals

• Functional studies

• Next batch of genomes to be sequenced (price coming down rapidly)


Questions?

UNC Cancer Genetics Clinic: Jim Evans, Kristy Lee, Cecile Skrzynia, Catherine Fine, Ofri Leitner, Kate Major

RENCI: Kirk Wilhelmsen, Charles Schmitt, Chris Bizon, Nassib Nassar

Students: Jonathan Mathew, Michael Adams


Targeted Somatic Mutation Discovery for Clinical Care in Cancer

William Jeck

Genetics Curriculum

MD/PhD Program


Cancer is a Genetic Disease:

• “Somatic” mutations occur in cancers and determine the biology of the disease.

• Identifying the pathogenic “driver” mutations that cause the cancer will predict prognosis and response to therapy.

• Identification of all driver mutations is needed, but is not currently done.


Sequence Capture: UNCeq 3.0

[Workflow diagram with components: tumor or normal DNA, Illumina libraries, sequence capture, somatic calls]


Why Use Sequence Capture:

• Separates diagnostic approach from discovery

• Estimated that > 90% of driver mutations in common solid cancers occur in < 50 genes

• Capture target is flexible

• Capture can save time, effort and money


UNCeq™ Gene List 3.0

Capturing Exons of:

AKT1 ALK APC AR ATM BRAF BRCA1 BRCA2 CCND1 CDH1 CDKN2A CDKN2B CTNNB1 EGFR EPHA10 EPHA6 ERBB2 FAM123B FBXW7 FGFR2 FGFR3 FLT1 FRAP1 HECW1 HER4 HRAS IDH1 IDH2 KIT KRAS MET MSH6 MYC NF2 NRAS PAK7 PDGFRA PHF6 PIK3CA PTCH1 PTEN PTK2 RAF1 RB1 SMAD4 STAT3 STK11 TET1 TET2 TP53 UTX UTY

Additional Regions:

• All introns of BRCA1/2
• Intron 19 of ALK
• HPV genes E6 and E7


Result of Sequence Capture


Detecting Point Mutations (B-RAF)

[Figure: BRAF V600E in WM2664, Sk-Mel 24, Sk-Mel 28, and Control samples]


Detecting Deletions (PTEN)

[Figure: log(coverage) across PTEN for WM2664, Sk-Mel 24, Sk-Mel 28, and Control, with Exon 2 and Exon 6 marked]
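Detecting a deletion from capture data comes down to comparing per-exon coverage in a sample against a control. The sketch below is one simple way to do that and is not the presenters' method: the per-exon depths, the pseudocount, and the log2 threshold of -1.0 are hypothetical, and it normalizes by the total depth of the supplied exons only.

```python
# Sketch of exon-level deletion detection from capture coverage.
# Depth values, the pseudocount, and the -1.0 log2 threshold are hypothetical.
import math


def log2_ratios(sample_depth, control_depth, pseudocount=1.0):
    """Per-exon log2(sample/control) after scaling both to their total depth."""
    s_total = sum(sample_depth.values())
    c_total = sum(control_depth.values())
    ratios = {}
    for exon, c_depth in control_depth.items():
        s_norm = (sample_depth.get(exon, 0.0) + pseudocount) / s_total
        c_norm = (c_depth + pseudocount) / c_total
        ratios[exon] = math.log2(s_norm / c_norm)
    return ratios


def deleted_exons(sample_depth, control_depth, threshold=-1.0):
    """Exons whose normalized coverage falls below the threshold."""
    ratios = log2_ratios(sample_depth, control_depth)
    return sorted(exon for exon, ratio in ratios.items() if ratio < threshold)


if __name__ == "__main__":
    control = {"PTEN_ex1": 100, "PTEN_ex2": 100, "PTEN_ex3": 100, "TP53_ex1": 100}
    tumor = {"PTEN_ex1": 100, "PTEN_ex2": 0, "PTEN_ex3": 100, "TP53_ex1": 100}
    print(deleted_exons(tumor, control))
```

In practice the normalization would use many more targets than the few exons shown here, so that a large deletion does not distort the scaling.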


Detecting Haploinsufficiency

[Figure, two panels plotted by exon: UTX (chromosome X) and BRCA1 (chromosome 17), each showing Male 1, Male 2, Female 1, and Female 2]
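The comparison above rests on a known difference: males carry one copy of UTX on chromosome X and females two, while both carry two copies of autosomal BRCA1. A copy-number estimate from coverage can be sketched as below, assuming the depths have already been normalized for library size; the depth values and function name are hypothetical.

```python
# Sketch: estimate a gene's copy number from its mean exon coverage relative to
# a reference sample assumed to carry two copies.
# Inputs are assumed to be library-size normalized; all values are hypothetical.
def estimated_copies(sample_depths, reference_depths, reference_copies=2):
    """Scale the sample/reference mean-coverage ratio by the reference copy number."""
    sample_mean = sum(sample_depths) / len(sample_depths)
    reference_mean = sum(reference_depths) / len(reference_depths)
    return reference_copies * sample_mean / reference_mean


if __name__ == "__main__":
    female_utx = [100, 120, 80]   # two copies of chromosome X
    male_utx = [50, 60, 40]       # one copy of chromosome X
    print(estimated_copies(male_utx, female_utx))
```

A gene that returns about one copy in a sample expected to have two would be a candidate single-copy loss.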


Detecting Translocations & CNV


Using NextGen for Better Cancer Care:

• We can sequence 50-150 genes in patients’ tumor / germline in <4 weeks for < $1,500 per patient

• We see the vast majority of genetic events (point mutations, deletions, amplifications)

• Expect IRB approval for use in any patient in summer 2011

• Plan to sequence ~10K patient tumors from the UCRF-funded cancer survivorship cohort

• Validated discoveries will be disclosed to patients and physicians when consistent with treatment guidelines, or if standard of care has been exhausted


Acknowledgements

• Ned Sharpless Lab

– Patrick Dillon

– Christin Burd

– Alex Siebold

– Soren Johnson

– Jessica Sorrentino

– Chad Torrice

• Derek Chiang Lab

– Gleb Savych

• Chuck Perou Lab

– George Chao

• Neil Hayes Lab

– Xiaoying Yin

• Billy Kim Lab

– Jeff Damrauer

• Jonathan Berg

• Nancy Thomas

• Janiel Shields

• Juneko Grilley-Olson

• Jeanne Noe

• Corbin Jones

• Piotr Mieczkowski

Funding by the UCRF


Chromatin Remodeling of RCC

Ian Davis, W. Kimryn Rathmell, William Kim, Terry Furey, Jason Lieb


Nucleosome loss indicates regulatory activity

[Diagram: legend shows DNA binding motif, DNA binding protein, and nucleosome; three chromatin states are illustrated: Repressed (OFF), Poised (POISED), and Active (ON)]


Influences on nucleosomal position

Doerr, Nat. Methods, 2007

Histone modification
• Methylation
• Acetylation
• Phosphorylation

Nucleosome repositioning
• Active – SWI/SNF
• Passive – nucleotide composition

MLL, SETD2, UTX, JARID1C

PBRM1, ARID1A

RCC: HIF


Preliminary analysis

• 4 tumors selected for initial analysis, 2 paired normals for comparison.

• ChIP-seq

– H3K4me1 (poised)

– H3K4me3 (active)

– H3K27me1 (repressed)

• FAIRE

– Global regions of open chromatin


H3K4me1 analysis, one gene, differential states


Predicted cis-regulatory regions (Genomic Regions Enrichment of Annotations Tool, Bejerano)

Does clustering identify anything meaningful?

Gene list: 836 regions associated with 1147 genes

Gene ontology:
* HIF network
** Hypoxia in cancer cells
** DMOG in cancer

Rose Brannon
Jeremy Simon


UCRF Proposal

• Perform FAIRE and histone methylation-specific ChIP-seq on a larger number of selected tumors.

• Selection criteria:

– ccA or ccB subtype

– IHC detection of H3 methylation marks

– PBRM1 mutated vs wild type


Conclusions:

• This analysis will enable us to:

– Determine global chromatin remodeling effect of PBRM1 mutations on RCC.

– Link transcriptional readouts to chromatin patterns

– Identify common elements that underlie the transforming events in RCC

• THANK YOU