![Page 1: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/1.jpg)
www.iita.org
A member of CGIAR consortium
Bioinformatics @ IITA
Andreas Gisel
IITA – Bioscience & Bioinformatics
![Page 2: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/2.jpg)
Bioinformatics @ IITA
Bioinformatics – definition and introduction
Bioinformatics @ IITA
Bioinformatics & IITA
![Page 3: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/3.jpg)
Bioinformatics - definition
Bio – Biology, Life Sciences
Informatics – computational sciences
DATA INTERPRETATIONS
RESULTS
Bioinformatics
![Page 4: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/4.jpg)
Bioinformatics - definition
Bio – Biology, Life Sciences
Informatics – computational sciences
DATA INTERPRETATIONS
RESULTS
Data Repositories
Knowledge
![Page 5: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/5.jpg)
Bioinformatics - definition
Bio – Biology, Life Sciences
Informatics – computational sciences
DATA INTERPRETATIONS
Bioinformatics is an interdisciplinary science that develops and improves methods for analyzing, storing, retrieving, organizing, and visualizing biological data.
Its purpose is to help solve biological problems and to uncover the wealth of information hidden in biological data.
![Page 6: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/6.jpg)
?
Biological Data
Descriptions, Pictures
![Page 8: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/8.jpg)
Biological Data
Descriptions, Pictures
Sequences: Protein, RNA, DNA
First fully sequenced biological sequence: the amino acid sequence of insulin (51 aa), 1955
First fully sequenced nucleic acid: tRNA (75 nt), 1965
First sequenced DNA: bacteriophage (5375 nt), 1977
DNA sequencing: Sanger sequencing technology (1975); pyrosequencing (Next Generation Sequencing, 2004)
![Page 12: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/12.jpg)
Biological Data
Descriptions, Pictures
Sequences: Protein, RNA, DNA
Structures: Protein, RNA
Interactions, Expressions
![Page 13: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/13.jpg)
Data Explosion
Descriptions, Pictures
Sequences: Protein, RNA, DNA
Structures: Protein, RNA
Interactions, Expressions
Microarray: up to 1 million data points per experiment
High-throughput sequencing, NGS (Next Generation Sequencing): up to 600’000’000’000 bases (600 Gbases) per experiment
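To put the sequencing figure in perspective, the sketch below turns a base count into a rough storage estimate. Only the base count comes from the slide. The two bytes-per-base values are illustrative assumptions (2-bit packing without quality values, and roughly 2 bytes per base for uncompressed FASTQ with headers ignored), and `estimated_storage_gb` is a hypothetical helper name.

```python
# Rough storage estimate for one sequencing experiment.
# The base count comes from the slide; the bytes-per-base figures are assumptions.

N_BASES = 600_000_000_000  # "up to 600'000'000'000 bases per experiment"


def estimated_storage_gb(n_bases: int, bytes_per_base: float) -> float:
    """Return the approximate size in gigabytes (1 GB = 10**9 bytes)."""
    return n_bases * bytes_per_base / 1e9


# Assumption: 2 bits per base (A, C, G, T only, no quality values).
packed_gb = estimated_storage_gb(N_BASES, 0.25)

# Assumption: about 2 bytes per base in uncompressed FASTQ
# (one base character plus one quality character, headers ignored).
fastq_gb = estimated_storage_gb(N_BASES, 2.0)

print(f"2-bit packed: ~{packed_gb:.0f} GB")
print(f"Uncompressed FASTQ: ~{fastq_gb:.0f} GB")
```

Real files will differ depending on read headers, compression, and file format, so these numbers are only an order-of-magnitude guide.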
![Page 15: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/15.jpg)
Data Analysis – DNA/RNA sequences
A sequence without knowledge connected to it is meaningless! What to do?
Sequence similarity
Finding genes and regulatory elements
Functional analysis of genes
Homology
Polymorphism
BIOINFORMATICS
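As a minimal illustration of the first item, sequence similarity, the sketch below computes percent identity between two sequences that are assumed to be already aligned and of equal length. The function name and the example sequences are hypothetical. Real similarity searches use alignment tools rather than a position-by-position comparison.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions with identical residues in two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length (pre-aligned)")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare position by position, ignoring letter case.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical example: two 8-nt sequences differing at one position.
identity = percent_identity("ACGTACGT", "ACGTTCGT")
print(f"{identity:.1f}% identical")
```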
![Page 16: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/16.jpg)
Data Analysis
So we need bioinformatics tools and reference data
Hardware – Computing infrastructure (CPU, RAM, Storage)
Tools – Programs that process your data
Reference data – Databases for existing data
INTERNET – connection to external databases
![Page 17: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/17.jpg)
Bioinformatics @ IITA
Personnel
Livia Stavolone – molecular biologist
Deborah Adeyele – student (training in bioinformatics and non-coding RNA)
Toyin Abdulsalam – research fellow (bioinformatics and transcriptome analysis)
Andreas Gisel
Whole Bioscience Team
![Page 18: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/18.jpg)
Bioinformatics @ IITA
Hardware – Computing infrastructure (CPU, RAM, Storage)
HP Blade system with:
3 blades, each with two 16-core processors (AMD Opteron Processor 6272)
384 GB RAM
2 TB attached storage (DAS)
8 TB attached storage (NAS)
The operating system is Ubuntu 14.04.1 LTS, installed via Bio-Linux 8.
![Page 19: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/19.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
Basic bioinformatics services mainly based on sequence analysis
Next Generation Sequencing data analysis pipelines including:
GBS (genotyping by sequencing) data analysis and SNP calling
Transcriptomics (RNA-seq) mapping, assembly and expression profiling
smallRNA data analysis: discovery and expression profiling
DNA methylation (BS-seq) data analysis
DNA (shotgun) assembly and variation calling
Genome annotation using different data pipelines and visualization
Customized approaches using Perl and shell scripting
![Page 20: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/20.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
GBS (genotyping by sequencing) data analysis and SNP calling
Cassava: 1200 GB compressed sequence data (~5500 accessions); SNP matrix 5500 x ~160’000 SNPs
Yam: 200 GB compressed sequence data (~800 accessions); SNP matrix 800 x ~25’000 SNPs
Raw sequencing data to SNP matrix:
Cornell SNP calling (TASSEL)
Broad SNP calling (GATK)
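The matrix dimensions above translate into the following number of genotype calls. This is a back-of-the-envelope sketch. The accession and SNP counts are the approximate figures from the slide, while the one-byte-per-call storage assumption and the helper names are illustrative only.

```python
# Approximate SNP matrix dimensions from the slide: (accessions, SNPs).
DATASETS = {
    "cassava": (5500, 160_000),
    "yam": (800, 25_000),
}


def genotype_calls(n_accessions: int, n_snps: int) -> int:
    """Number of cells in an accessions x SNPs matrix."""
    return n_accessions * n_snps


def matrix_size_gb(n_accessions: int, n_snps: int, bytes_per_call: int = 1) -> float:
    """Approximate in-memory size, assuming a fixed number of bytes per call."""
    return genotype_calls(n_accessions, n_snps) * bytes_per_call / 1e9


for crop, (n_acc, n_snps) in DATASETS.items():
    calls = genotype_calls(n_acc, n_snps)
    size = matrix_size_gb(n_acc, n_snps)
    print(f"{crop}: {calls:,} genotype calls, ~{size:.2f} GB at 1 byte per call")
```

Under these assumptions the SNP matrix is orders of magnitude smaller than the compressed raw sequence data it is derived from.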
![Page 21: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/21.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
GBS (genotyping by sequencing) data analysis and SNP calling
![Page 22: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/22.jpg)
GBS (genotyping by sequencing) data analysis and SNP calling
Ismail Rabbi
Bioinformatics @ IITA
Tools – Programs that process your data
SNP matrix
Cornell
![Page 23: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/23.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
GBS (genotyping by sequencing) data analysis and SNP calling
SNP matrix
In-house
![Page 24: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/24.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
GBS (genotyping by sequencing) data analysis and SNP calling
SNP matrix
![Page 27: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/27.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
GBS (genotyping by sequencing) data analysis and SNP calling
SNP matrix
External data
In-house developed scripts
![Page 28: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/28.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
GBS (genotyping by sequencing) data analysis and SNP calling
[Figure: Cassava Assembly & Annotation Version 6.1, chromosomes Chr1 to Chr18]
![Page 29: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/29.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
GBS (genotyping by sequencing) data analysis and SNP calling
Cassava Assembly & Annotation Version 6.1
Gene Distribution
SNP Distribution
GBS Coverage
Heterozygosity
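The last track, heterozygosity, can be derived from a SNP matrix. The sketch below is one possible calculation, not the in-house method. It assumes genotypes are coded as the number of alternate alleles (0, 1 or 2) with `None` for missing calls. This coding, the function name, and the example data are assumptions.

```python
from typing import Optional, Sequence


def observed_heterozygosity(genotypes: Sequence[Optional[int]]) -> float:
    """Fraction of called genotypes that are heterozygous (coded as 1)."""
    # Drop missing calls before counting.
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    heterozygous = sum(1 for g in called if g == 1)
    return heterozygous / len(called)


# Hypothetical genotypes of one accession at five SNPs (one call missing).
example = [0, 1, 2, 1, None]
het = observed_heterozygosity(example)
print(f"Observed heterozygosity: {het:.2f}")
```

The same function can be applied per accession (along a row) or per SNP (along a column) of the matrix.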
![Page 30: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/30.jpg)
GBS (genotyping by sequencing) data analysis and SNP calling
Bioinformatics @ IITA
Tools – Programs that process your data
![Page 31: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/31.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
Transcriptomics (RNA-seq) mapping, assembly and expression profiling
What is RNA-seq?
![Page 32: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/32.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
smallRNA data analysis: discovery and expression profiling
Automated pipeline for reference-supported and de novo transcriptome assembly and expression profiling
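Expression profiling usually involves normalizing raw read counts so that samples sequenced to different depths can be compared. The slide does not say which normalization the pipeline uses. The sketch below shows counts per million (CPM) purely as an illustration, with hypothetical gene names and counts.

```python
from typing import Dict


def counts_per_million(raw_counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw read counts so that they sum to one million per sample."""
    total = sum(raw_counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: count / total * 1_000_000 for gene, count in raw_counts.items()}


# Hypothetical read counts for two genes in one sample.
example_counts = {"gene_a": 500, "gene_b": 1500}
example_cpm = counts_per_million(example_counts)
for gene, cpm in example_cpm.items():
    print(f"{gene}: {cpm:.0f} CPM")
```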
![Page 33: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/33.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
smallRNA data analysis: discovery and expression profiling
Small RNAs are short (21–200 nt) RNAs that do not code for proteins but have gene-regulatory effects.
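A typical first step in small RNA analysis is to keep only reads within the expected size range and to look at their length distribution. The sketch below uses the 21 to 200 nt range stated above. The function names and example reads are hypothetical, and a real pipeline would also trim adapters first.

```python
from collections import Counter
from typing import Iterable, List

MIN_LEN, MAX_LEN = 21, 200  # size range given on the slide


def filter_by_length(reads: Iterable[str],
                     min_len: int = MIN_LEN,
                     max_len: int = MAX_LEN) -> List[str]:
    """Keep reads whose length lies within [min_len, max_len]."""
    return [read for read in reads if min_len <= len(read) <= max_len]


def length_distribution(reads: Iterable[str]) -> Counter:
    """Count how many reads occur at each length."""
    return Counter(len(read) for read in reads)


# Hypothetical reads of length 18, 21, 24, 24 and 201.
example_reads = ["A" * 18, "C" * 21, "G" * 24, "T" * 24, "A" * 201]
kept = filter_by_length(example_reads)
distribution = length_distribution(kept)
print(sorted(distribution.items()))
```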
![Page 34: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/34.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
smallRNA data analysis: discovery and expression profiling
Automated pipeline for non-coding RNA classification and expression profiling.
![Page 35: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/35.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
DNA methylation (BS-seq) data analysis
What is BS-seq?
DNA methylation is another gene regulation mechanism which can be inherited.
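In bisulfite sequencing, unmethylated cytosines are converted to uracil and read as T, while methylated cytosines are protected and still read as C. The methylation level at a cytosine can therefore be estimated from the reads covering it as C / (C + T). The sketch below illustrates only this ratio. The function name, site labels, and read counts are hypothetical.

```python
def methylation_level(c_reads: int, t_reads: int) -> float:
    """Estimate methylation at one cytosine as C / (C + T) from read counts."""
    if c_reads < 0 or t_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("site has no coverage")
    return c_reads / total


# Hypothetical sites: (reads showing C, reads showing T).
sites = {"site_1": (8, 2), "site_2": (0, 10)}
levels = {name: methylation_level(c, t) for name, (c, t) in sites.items()}
for name, level in levels.items():
    print(f"{name}: {level:.0%} methylated")
```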
![Page 37: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/37.jpg)
Bioinformatics @ IITA
Tools – Programs that process your data
DNA (shotgun) assembly and variation calling
Genome annotation using different data pipelines and visualization
![Page 38: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/38.jpg)
Bioinformatics @ IITA
Reference data – Databases for existing data
Genomic Reference Data
Cassava (sequence, annotation, function)
D. rotundata (sequence, working on annotation and function)
D. alata (waiting for sequence and annotation)
Maize (ready sequence and annotation)
Banana (ready sequence and annotation)
Archive
Cassava (GBS, WGS, RNA-seq)
D. rotundata (GBS, smallRNA)
Maize (GBS)
![Page 41: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/41.jpg)
Bioinformatics @ IITA
INTERNET – connection to external databases
Automated pipelines and strategies for big data downloads
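The slide does not describe the download strategies themselves. One ingredient that is commonly part of large transfers is checking a file against a published checksum after download. The sketch below shows that step only, as an assumption about what such a pipeline might include. The function names are hypothetical and SHA-256 is an arbitrary default.

```python
import hashlib


def file_checksum(path: str, algorithm: str = "sha256",
                  chunk_size: int = 1024 * 1024) -> str:
    """Compute a checksum by reading the file in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: str, expected: str, algorithm: str = "sha256") -> bool:
    """Return True if the file's checksum matches the expected hex digest."""
    return file_checksum(path, algorithm) == expected.strip().lower()
```

A download script could call `is_intact` after each transfer and repeat the download when it returns `False`.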
![Page 42: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/42.jpg)
Bioinformatics & IITA
Development of Bioinformatics Capacity
IITA Projects
Involvement in planning of data production and analysis; financing of data storage and analysis
Bioinformatics
Bioscience
Data analysis, Data repositories, Visualization
![Page 43: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/43.jpg)
Bioinformatics & IITA
Development of Bioinformatics Capacity
In projects with sequencing activities:
We need to identify the bioinformatics part
We need to take over at least part of the bioinformatics activities
We need to have Bioscience involved in planning the data production, to optimize data analysis and knowledge building
Capacity building to strengthen the bioinformatics facility
![Page 44: Bioinformatics at IITA](https://reader036.vdocuments.net/reader036/viewer/2022062904/587f9af21a28ab825e8b4e45/html5/thumbnails/44.jpg)
Thank you!
Data from:
Ranjana Bhattacharjee
Livia Stavolone
Morag Ferguson
Ismail Rabbi