TRANSCRIPT
Introduction to ChIP-Seq Analysis using Avadis NGS
January 2010
Agilent Confidential
Jean Jasinski, Ph.D., Senior Application Scientist
Avadis NGS v1.2
• Designed for Biologists
• GeneSpring & Mass Profiler Pro on Avadis platform (Integrated Biology)
• Windows, Mac, Linux (min 2GB RAM, 100GB HD, 1 CPU)
• Accepts SAM/BAM/ELAND files (from any sequencing vendor)
Supports NGS applications
• ChIP-Seq
• RNA-SEQ
• DNA-Seq
Provides visualization, analysis, and biological contextualization tools
Avadis NGS: Downstream Analysis of NGS data
[Figure: data flow into Avadis NGS. Data File (Reads + Quality; FASTQ) → Software (ELAND, BWA, TopHat) → Reads aligned to genome (BAM, SAM, BED, ELAND) → Avadis NGS. The diagram also carries the label "Software Control".]
Experiment Workflows in Avadis NGS
ChIP-SEQ - Transcription Regulation
• Find peaks in coverage (regions of the genome with an overabundance of reads)
• Detect Transcription Factor binding sites and common motifs
• Find genes up/downstream of motifs for biological contextualization
RNA-SEQ - Transcription
• Measure expression levels of known transcripts
• Find novel genes, exons, alternative splicing events
DNA-SEQ - Variant Analysis
• Discover SNPs, MNPs, and InDels; annotate with dbSNP (previously published variants)
• Analyze effect of variations on coding regions (frame shifts, InDels, etc.)
• Detect large structural variations (re-arrangements, translocations, etc.)
• Identify Gene Fusions
June 11
Data Interpretation Tools: Data Visualization and Biological Contextualization
Annotations available directly within the tool: Human, Mouse, Rat, Zebrafish, Drosophila, Yeast, C. elegans, Arabidopsis (many others plus custom builds possible)
Create custom genome
Updates and new annotations available
Layout of the Tools
Project and Experiment Navigator
Workflow Browser
Region View
ChIP-Seq Questions and Avadis NGS answers
1. Where are binding sites? → Peak Detection (PICS, MACS, Enriched Region)
2. Why does protein bind there? → Motif Detection (GADEM)
3. Which genes are affected? → Translate Regions to Genes
4. Which processes are affected? → GO Analysis, Find Significant Pathways, Pathway Analysis
5. What other genes might also bind this factor? → Scan Motif
Peak Detection Algorithms: Where are binding sites?
• Unenriched (control) sample is optional. PICS and MACS use it to estimate the FDR; Enriched Region Detection uses it to “normalize” the enriched sample.
Region Detection
PICS Algorithm: Where are binding sites?
• Probabilistic Inference for ChIP-Seq (Zhang et al., 2010. Biometrics)
• Most stringent of peak detection algorithms (narrowest peaks)
• Uses direction (forward/reverse) of reads
• Optional Mappability Profile
• Parallelized (run on >1 CPU)
MACS Algorithm: Where are binding sites?
• Model-based Analysis of ChIP-Seq (Zhang et al., 2008. Genome Biology)
• Implementation based on version 1.3.7.1
• Identifies broader peaks than PICS
• Uses strand (direction) of reads
Enriched Region Detection Algorithm: Where are binding sites?
• Written for Avadis NGS
• Uses sliding window
• Does not use orientation of reads or shape of peaks
• May be used for detection of methylation sites or other large pile-ups of reads
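The sliding-window idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the Avadis NGS implementation: the function names, the window and step sizes, the read-count threshold, and the fold-over-control rule with a pseudocount of 1 are all hypothetical choices. Read orientation and peak shape are ignored, as the slide describes.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple


def count_in(sorted_pos: List[int], start: int, end: int) -> int:
    """Number of positions p with start <= p < end (input must be sorted)."""
    return bisect_left(sorted_pos, end) - bisect_left(sorted_pos, start)


def enriched_regions(
    treat: List[int],
    chrom_len: int,
    window: int = 200,      # hypothetical window size
    step: int = 50,         # hypothetical step size
    min_reads: int = 5,     # hypothetical read-count threshold
    control: Optional[List[int]] = None,
    min_fold: float = 2.0,  # hypothetical enrichment over control
) -> List[Tuple[int, int]]:
    """Return merged half-open regions whose windows pass the thresholds."""
    treat = sorted(treat)
    ctrl = sorted(control) if control is not None else None
    regions: List[Tuple[int, int]] = []
    for start in range(0, max(chrom_len - window, 0) + 1, step):
        end = start + window
        n = count_in(treat, start, end)
        if n < min_reads:
            continue
        # With a control sample, require a minimum fold enrichment;
        # the pseudocount of 1 avoids division by zero.
        if ctrl is not None and n / (count_in(ctrl, start, end) + 1) < min_fold:
            continue
        if regions and start <= regions[-1][1]:
            # Merge with the previous region when windows overlap or touch.
            regions[-1] = (regions[-1][0], end)
        else:
            regions.append((start, end))
    return regions
```

Because only read start positions on one chromosome are counted, the same sketch applies to any large pile-up of reads, which is the use case the slide mentions.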
Motif Detection (GADEM): Why does protein bind there?
• Genetic Algorithm guided formation of spaced Dyads coupled with EM algorithm (Li 2009. Journal of Computational Biology)
• User-specified padding (up- and down-stream of peak regions).
• Results viewed in WebLogo Format; exported as JASPAR.
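To make the logo view concrete, the sketch below builds a position frequency matrix from aligned binding sites and computes the per-position information content in bits, the quantity a sequence logo conventionally plots as stack height. It assumes equal-length DNA sites and omits the small-sample correction that logo tools may apply. The function names are hypothetical, and this is not the GADEM algorithm itself.

```python
import math
from typing import Dict, List

BASES = "ACGT"


def position_frequency_matrix(sites: List[str]) -> List[Dict[str, int]]:
    """Count each base at each position of equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    return [{b: sum(s[i] == b for s in sites) for b in BASES} for i in range(length)]


def information_content(column: Dict[str, int]) -> float:
    """Bits of information at one position: 2 minus the Shannon entropy."""
    total = sum(column.values())
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in column.values() if c > 0
    )
    return 2.0 - entropy


# Hypothetical aligned sites, for illustration only.
example_sites = ["ACGT", "ACGA", "ACGT", "ACGC"]
example_pfm = position_frequency_matrix(example_sites)
example_bits = [information_content(col) for col in example_pfm]
```

A fully conserved position carries 2 bits and a position with all four bases equally frequent carries 0 bits, so tall stacks in the logo mark the positions that matter most for binding.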
Translate Regions to Genes: Which genes are affected?
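One plausible way to translate a region into genes is to report every gene whose promoter window overlaps the region. The sketch below assumes half-open coordinates and a window defined relative to the transcription start site on the gene's strand. The `Gene` record, the gene names, and the 5000/1000 bp distances are hypothetical, and the rule Avadis NGS actually applies may differ.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def promoter_window(gene: Gene, upstream: int, downstream: int) -> Tuple[int, int]:
    """Window around the TSS, oriented by strand."""
    if gene.strand == "+":
        tss = gene.start
        return tss - upstream, tss + downstream
    tss = gene.end
    return tss - downstream, tss + upstream


def genes_for_region(
    region: Tuple[str, int, int],
    genes: List[Gene],
    upstream: int = 5000,    # hypothetical distance
    downstream: int = 1000,  # hypothetical distance
) -> List[str]:
    """Names of genes whose promoter window overlaps the region."""
    chrom, r_start, r_end = region
    hits = []
    for gene in genes:
        if gene.chrom != chrom:
            continue
        w_start, w_end = promoter_window(gene, upstream, downstream)
        if r_start < w_end and w_start < r_end:
            hits.append(gene.name)
    return hits


# Hypothetical annotation, for illustration only.
example_genes = [
    Gene("geneA", "chr1", 10000, 12000, "+"),
    Gene("geneB", "chr1", 20000, 22000, "-"),
    Gene("geneC", "chr2", 10000, 12000, "+"),
]
```

Handling the strand explicitly matters: for a minus-strand gene, "upstream" lies at higher coordinates than the gene end.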
GO Analysis, Find Significant Pathways, Pathway Analysis: Which processes are affected?
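The slides do not say which statistic the GO analysis uses. A common choice for this kind of enrichment question is the hypergeometric test, sketched here as an assumption: given a background of N genes, K of which carry a GO term, it gives the probability of seeing k or more annotated genes in a list of n. The function name is hypothetical and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_enrichment_p(N: int, K: int, n: int, k: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(population N, annotated K, drawn n)."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)
```

For example, `hypergeom_enrichment_p(20000, 150, 300, 12)` would ask how surprising it is that 12 of 300 peak-associated genes carry a term annotated to 150 of 20000 background genes. These numbers are made up for illustration.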
Scan Motif: What other genes might also bind this factor?
Translate Regions to Genes
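Motif scanning can be illustrated with a log-odds position weight matrix slid along a sequence. The sketch below is a minimal version under assumptions: a uniform background of 0.25 per base, a pseudocount of 0.5, forward strand only, and a hypothetical score threshold. It is not the Scan Motif implementation in Avadis NGS.

```python
import math
from typing import Dict, List, Tuple

BASES = "ACGT"


def log_odds_matrix(
    pfm: List[Dict[str, int]], pseudocount: float = 0.5, background: float = 0.25
) -> List[Dict[str, float]]:
    """Convert base counts per position into log2 odds against the background."""
    matrix = []
    for column in pfm:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append(
            {b: math.log2(((column[b] + pseudocount) / total) / background) for b in BASES}
        )
    return matrix


def scan_motif(
    sequence: str, matrix: List[Dict[str, float]], threshold: float
) -> List[Tuple[int, float]]:
    """Return (0-based start, score) for every window scoring >= threshold."""
    width = len(matrix)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i : i + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other symbols
        score = sum(matrix[j][base] for j, base in enumerate(window))
        if score >= threshold:
            hits.append((i, score))
    return hits


# Hypothetical motif counts (four sites, all "ACGT"), for illustration only.
example_pfm = [
    {"A": 4, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 4, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 4, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 4},
]
example_hits = scan_motif("TTACGTTT", log_odds_matrix(example_pfm), threshold=5.0)
```

The hit positions can then be treated as a new region list and translated to genes in the same way as detected peaks.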
Hierarchical Data Organization
• Region Lists: contiguous chromosomal regions identified by peak detection (PICS/MACS)
• Aligned Reads: aligned data, reads and coverage
• Motifs: frequently occurring sequence motifs identified by GADEM
• Entity Lists: collections of genes or transcripts; results from GO analysis and gene detection algorithms
• Pathways and Networks: network diagrams and significant pathways
Features Common to All Experiment Types
• Workflow changes according to experiment type
• Genome Browser, drag-and-drop, variable views
• Filters (Read Quality, Mapping Quality, Flow Quality, etc.)
• Powerful Utilities
• Biological Contextualization tools
• R and Python script editors
Appropriate tools visible for each Analysis type
ChIP-SEQ
DNA-SEQ
RNA-SEQ
Genome Browser Features
• View reads, regions, genes, annotations
• Zoom level changes the detail shown (peaks → reads → SNPs)
• Search for any gene on any chromosome; browser will show that gene
• Add any annotation track
Filtering Reads
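As an illustration of a mapping-quality filter, the sketch below works on plain-text SAM lines. It relies on two facts from the SAM format: the second column is the FLAG (bit 0x4 marks an unmapped read) and the fifth column is MAPQ. The function name and the cutoff of 20 are hypothetical, and the filters inside Avadis NGS are configured through its interface rather than through code like this.

```python
from typing import Iterable, Iterator


def filter_sam_lines(lines: Iterable[str], min_mapq: int = 20) -> Iterator[str]:
    """Yield header lines plus mapped alignments with MAPQ >= min_mapq."""
    for line in lines:
        if line.startswith("@"):
            yield line  # keep header lines unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & 0x4:
            continue  # read is unmapped
        if mapq >= min_mapq:
            yield line


# Hypothetical records, for illustration only.
example_sam = [
    "@HD\tVN:1.0",
    "r1\t0\tchr1\t100\t37\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
example_kept = list(filter_sam_lines(example_sam))
```

Here `r2` is dropped for low mapping quality and `r3` because it is unmapped.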
Lane QC and Filtering (Illumina only)
Powerful Utilities
• BED files (results from others)
• Import WIG files
• Overlay annotations from 2nd region list: do results agree?
• JASPAR format
• Filter genes by any annotation or value
Powerful Utilities: Region List Operation
• Visualize
• Filter
• Manipulate
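A typical region list operation is checking whether two lists agree, for example peaks from two detection algorithms or an imported BED file. The sketch below assumes regions are half-open `(chrom, start, end)` tuples. The function names and the example coordinates are hypothetical, and the quadratic comparison is chosen for clarity rather than speed.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]


def overlaps(x: Region, y: Region) -> bool:
    """True when two half-open regions on the same chromosome share a base."""
    return x[0] == y[0] and x[1] < y[2] and y[1] < x[2]


def regions_in_both(a: List[Region], b: List[Region]) -> List[Region]:
    """Regions of a that overlap at least one region of b."""
    return [r for r in a if any(overlaps(r, s) for s in b)]


def regions_only_in(a: List[Region], b: List[Region]) -> List[Region]:
    """Regions of a that overlap no region of b."""
    return [r for r in a if not any(overlaps(r, s) for s in b)]


# Hypothetical region lists, for illustration only.
list_one = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
list_two = [("chr1", 150, 400), ("chr2", 300, 400)]
```

For genome-scale lists, sorting both inputs and sweeping through them once would avoid comparing every pair.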
NLP-generated relations
NLP extraction pipeline: the majority of pathway interaction database relations are derived from published PubMed abstracts using text mining.
[Figure: text-mining pipeline. PubMed → Input Sentence → Entity Recognition (dictionary of proteins, enzymes, etc.) → Tagged Sentence → Syntax → Semantics → Inference (apply grammar rules to derive interactions) → Interactions (Molecular and Process/Functions)]
Run tool to add custom content
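A toy version of the first and last pipeline stages, dictionary-based entity recognition followed by one grammar rule, is sketched below. It is far simpler than a real text-mining pipeline and is not the one used to build the interaction database. The dictionary, the verb list, and the single "entity verb entity" rule are hypothetical choices for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical dictionary of known entities and relation verbs.
DICTIONARY = {"TP53", "MDM2", "CDKN1A"}
VERBS = {"activates", "inhibits", "binds"}


def extract_relations(sentence: str) -> List[Tuple[str, str, str]]:
    """Apply one rule: dictionary entity, relation verb, dictionary entity."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    relations = []
    for left, verb, right in zip(tokens, tokens[1:], tokens[2:]):
        if left in DICTIONARY and verb in VERBS and right in DICTIONARY:
            relations.append((left, verb, right))
    return relations


example_relations = extract_relations("MDM2 inhibits TP53, and TP53 activates CDKN1A.")
```

Each extracted triple corresponds to one edge in an interaction network, which is how sentences become the relations shown in the pathway views.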
Interactions Network
Input: transcription factor
Blue halo: genes in list
MeSH Pathways
Created with Medical Subject Heading term; can overlay your gene list
R and Python Editors
• Customize features by calling BioConductor routines
• Depository at www.avadis-ngs.com
Comparing ChIP-Seq Results to …
• RNA-Seq Results: Analyze the RNA-Seq and ChIP-Seq experiments separately in Avadis NGS, then put both experiments in the same project. Compare differentially regulated genes with genes in the region lists of the ChIP-Seq experiment using the Venn Diagram tool, pathway overlays, …
• Gene Expression (microarray): Export gene lists from the Avadis NGS ChIP-Seq experiment and import them into GeneSpring GX. Use the Venn Diagram tool or pathway overlays.
• Binding of another Transcription Factor? Import a second JASPAR motif; run Scan Motif to identify regions where the TF binds; compare or annotate the region lists.
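Outside the tool, the same Venn-style comparison reduces to set operations on gene identifiers. The sketch below assumes both lists use the same identifier scheme. The function name and the gene names are hypothetical.

```python
from typing import Dict, Iterable, List


def venn_partition(chip_genes: Iterable[str], rna_genes: Iterable[str]) -> Dict[str, List[str]]:
    """Split two gene lists into ChIP-only, shared, and RNA-only groups."""
    chip, rna = set(chip_genes), set(rna_genes)
    return {
        "chip_only": sorted(chip - rna),
        "both": sorted(chip & rna),
        "rna_only": sorted(rna - chip),
    }


# Hypothetical gene lists, for illustration only.
example_partition = venn_partition(
    ["geneA", "geneB", "geneC"],  # genes near ChIP-Seq regions
    ["geneB", "geneC", "geneD"],  # differentially regulated genes
)
```

Genes in the shared group are bound by the factor and also change expression, which makes them the natural starting point for follow-up.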
Avadis NGS v1.2
Available from Agilent Technologies
Released Nov. 2010
Supports ChIP-Seq, RNA-Seq, and DNA-Seq
Download trial version, demo datasets, tutorials at: http://www.avadis-ngs.com
GeneSpring 11.9 (summer 2011) will support RNA-Seq and DNA-Seq workflows only
Analysis of ChIP-Seq Data: Avadis NGS
Questions?
Thank you for your time.
www.avadis-ngs.com