Sequencing, Genome Assembly and the SGN Platform
DESCRIPTION
This talk was presented at IASRI Pusa on June 13th, 2014. Centre for Agricultural Bioinformatics, Indian Agricultural Statistics Research Institute, Library Avenue, Pusa, New Delhi - 110012 (INDIA) http://cabgrid.res.in/cabin/
![Page 1: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/1.jpg)
Surya Saha, Ph.D. Cornell University & Boyce Thompson Institute
@SahaSurya
Centre for Agricultural Bioinformatics Pusa, New Delhi
June 13, 2014 Slides: http://bit.ly/CABin_Pusa_2014
http://www.acgt.me/blog/2014/3/7/next-generation-sequencing-must-die
Genome Assembly
Jason Chin http://www.bit.ly/SZPKIG
![Page 2: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/2.jpg)
6/15/2014 Centre for Agricultural Bioinformatics, Pusa 2
You are free to:
Copy, share, adapt, or re-mix;
Photograph, film, or broadcast;
Blog, live-blog, or post video of;
This presentation. Provided that:
You attribute the work to its author and respect the rights
and licenses associated with its components.
Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only ccZero. Social Media Icons adapted with
permission from originals by Christopher Ross. Original images are available under GPL at
http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites
![Page 3: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/3.jpg)
Sequencing
![Page 4: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/4.jpg)
Sequencing over the Ages
1953: DNA structure discovery
1977: Sanger DNA sequencing by chain-terminating inhibitors
1984: Epstein-Barr virus (170 Kb)
1987: ABI 370 sequencer
1995: Haemophilus influenzae (1.83 Mb)
2001: Homo sapiens (3.0 Gb)
The Next Generation (years marked: 2005, 2007, 2011, 2012, 2013): 454, Solexa, SOLiD, Illumina, Ion Torrent, PacBio, Illumina HiSeq X
2014: Pinus taeda (24 Gb), MinION
Slide credit: Aureliano Bombarely
![Page 5: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/5.jpg)
It's all about the $£€¥
http://www.genome.gov/sequencingcosts/
![Page 6: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/6.jpg)
First generation sequencing
![Page 7: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/7.jpg)
Sanger method
Frederick Sanger (13 Aug 1918 – 19 Nov 2013). Won the Nobel Prize for Chemistry in 1958 and 1980. Published the dideoxy chain termination method, or “Sanger method”, in 1977.
http://dailym.ai/1f1XeTB
![Page 8: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/8.jpg)
Sanger method
http://bit.ly/1g6Cudq
http://bit.ly/1lcQO4J
![Page 9: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/9.jpg)
First generation sequencing
• Very high quality sequences (99.999%)
• Very low throughput
| Platform | Run time | Read length | Reads / run | Total nucleotides sequenced | Cost / MB |
| --- | --- | --- | --- | --- | --- |
| Capillary sequencing (ABI3730xl) | 20m-3h | 400-900 bp | 96 or 386 | 1.9-84 Kb | $2400 |
http://bit.ly/1clLps3 http://1.usa.gov/1cLqIRd
![Page 10: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/10.jpg)
Name the specific technology used to generate the data
– Illumina HiSeq/MiSeq/NextSeq
– Pacific Biosciences RS I/RS II
– Ion Torrent Proton/PGM
– SOLiD
– 454
http://www.acgt.me/blog/2014/3/10/next-generation-sequencing-must-diepart-2
![Page 11: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/11.jpg)
454 Pyrosequencing
One purified DNA fragment, to one bead, to one read.
http://bit.ly/1ehwxWN
GS FLX Titanium
http://bit.ly/1ehAcEh
![Page 12: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/12.jpg)
Illumina
Output: 15 Gb | 120 GB | 1000 GB | 1800 GB
Number of reads: 25 million | 400 million | 4 billion | 6 billion
Read length: 2x300 bp | 2x150 bp | 2x125 bp (2x250 update mid-2014) | 2x150 bp
Cost: $99K | $250K | $740K | $10M
Source: Illumina
$1000 human genome??
![Page 13: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/13.jpg)
Illumina
http://1.usa.gov/1fP9ybl
![Page 14: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/14.jpg)
Illumina: Moleculo
http://bit.ly/1aEPOBn
![Page 15: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/15.jpg)
Pacific Biosciences SMRT sequencing
Single Molecule Real Time sequencing
http://bit.ly/1naxgTe
![Page 16: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/16.jpg)
Pacific Biosciences SMRT sequencing Error correction methods
Hierarchical genome-assembly process (HGAP)
PBJelly (English et al., PLoS ONE, 2012)
![Page 17: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/17.jpg)
Pacific Biosciences SMRT sequencing Read Lengths
![Page 18: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/18.jpg)
Oxford Nanopore
https://www.nanoporetech.com/
• No data yet??
• Error model
http://erlichya.tumblr.com/post/66376172948/hands-on-experience-with-oxford-nanopore-minion
![Page 19: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/19.jpg)
Others
• Ion Torrent Proton/PGM
• Nabsys
• SOLiD
![Page 20: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/20.jpg)
Comparison
![Page 21: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/21.jpg)
Next generation sequencing
| Platform | Run time | Read length | Quality | Total nucleotides sequenced | Cost / MB |
| --- | --- | --- | --- | --- | --- |
| 454 Pyrosequencing | 24h | 700 bp | Q20-Q30 | 0.7 GB | $10 |
| Illumina MiSeq | 27h | 2x250 bp | >Q30 | 15 GB | $0.15 |
| Illumina HiSeq 2500 | 11 days | 2x125 bp | >Q30 | 1000 GB | $0.05 |
| Ion Torrent | 2h | 400 bp | >Q20 | 50 MB-1 GB | $1 |
| Pacific Biosciences | 2h | 10-20 kb | >Q30 consensus, >Q10 single | 400-800 MB / SMRT cell | $0.33-$1 |
http://bit.ly/1clLps3 http://1.usa.gov/1cLqIRd
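To make the Cost / MB column concrete, here is a rough worked example in Python. It multiplies genome size, coverage and the per-megabase prices quoted in the two tables of this talk. The 30x coverage target and the function name `raw_sequencing_cost` are assumptions made for illustration, and the result covers raw sequence only (no library preparation, labour, storage or analysis).

```python
# Cost per megabase as quoted in the first- and next-generation tables (USD).
COST_PER_MB = {
    "Capillary (ABI3730xl)": 2400.0,
    "454 Pyrosequencing": 10.0,
    "Illumina MiSeq": 0.15,
    "Illumina HiSeq 2500": 0.05,
    "Ion Torrent": 1.0,
}


def raw_sequencing_cost(genome_size_mb, coverage, cost_per_mb):
    """Raw sequence cost = genome size (Mb) x fold coverage x price per Mb."""
    return genome_size_mb * coverage * cost_per_mb


HUMAN_GENOME_MB = 3000.0  # Homo sapiens, 3.0 Gb, as on the timeline slide
COVERAGE = 30             # hypothetical coverage target, not from the talk

for platform, price in COST_PER_MB.items():
    total = raw_sequencing_cost(HUMAN_GENOME_MB, COVERAGE, price)
    print(f"{platform:24s} ${total:>15,.2f}")
```

Under these assumptions the same 30x human genome differs by several orders of magnitude in raw cost between capillary sequencing and the HiSeq 2500.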
![Page 22: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/22.jpg)
http://omicsmaps.com/
Next Generation Genomics: World Map of High-throughput Sequencers
![Page 23: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/23.jpg)
http://bit.ly/18pfUId
![Page 24: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/24.jpg)
Real cost of Sequencing!!
Sboner, Genome Biology, 2011
![Page 25: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/25.jpg)
Library Types
Single end
Paired end (PE, 150-800 bp, Fwd:/1, Rev:/2)
Mate pair (MP, 2 Kb to 20 Kb)
(Diagram: orientation of forward (F) and reverse (R) reads for each library type; platform labels: 454/Roche, Illumina)
Slide credit: Aureliano Bombarely
![Page 26: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/26.jpg)
Implications of Choice of Library
Reads assemble into a consensus sequence (contig)
Pair read information links contigs into a scaffold (or supercontig), with gaps shown as NNNNN
Genetic information (markers) orders scaffolds into a pseudomolecule (or ultracontig)
Slide credit: Aureliano Bombarely
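The contig-to-scaffold step can be sketched in a few lines of Python. This is only an illustration: it assumes the contigs are already ordered and oriented and that gap sizes have been estimated from paired reads. The function name `build_scaffold` and the `min_gap` parameter are hypothetical and do not come from any assembler.

```python
def build_scaffold(contigs, gap_sizes, min_gap=1):
    """Join ordered, oriented contigs into one scaffold.

    gap_sizes[i] is the estimated distance between contigs[i] and
    contigs[i + 1]; each gap is written as a run of N characters.
    """
    if not contigs:
        raise ValueError("need at least one contig")
    if len(gap_sizes) != len(contigs) - 1:
        raise ValueError("need exactly one gap size between each pair of contigs")
    parts = [contigs[0]]
    for gap, contig in zip(gap_sizes, contigs[1:]):
        # A non-positive estimate still gets min_gap Ns so the join stays visible.
        parts.append("N" * max(gap, min_gap))
        parts.append(contig)
    return "".join(parts)


print(build_scaffold(["ACGT", "GGCC", "TTAA"], [5, 2]))
```

The Ns record that sequence is missing between contigs whose relative order is known, which is why longer-insert libraries (mate pairs) matter for scaffold length.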
![Page 27: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/27.jpg)
Quality control: Encoding
http://bit.ly/N28yUd
Phred score of a base is: Qphred = -10 log10 (e)
where e is the estimated probability of a base being incorrect
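The Phred formula can be checked with a short Python sketch. The conversion functions follow the definition above. The decoding helper assumes the Phred+33 (Sanger) ASCII offset, which is an assumption here because the slide text does not say which encoding a given file uses (older Illumina pipelines used an offset of 64).

```python
import math


def phred_from_error(e):
    """Q = -10 * log10(e), where e is the probability that the base is wrong."""
    if not 0.0 < e <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(e)


def error_from_phred(q):
    """Inverse of the Phred formula: e = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


def decode_quality_string(qual, offset=33):
    """Turn a FASTQ quality string into Phred scores.

    offset=33 is an assumption (Sanger / Phred+33 encoding).
    """
    return [ord(ch) - offset for ch in qual]


for e in (0.1, 0.01, 0.001):
    print(f"error probability {e} gives Q{phred_from_error(e):.0f}")
print(decode_quality_string("I5!"))
```

So Q20 corresponds to one expected error in 100 bases and Q30 to one in 1000, which is how the quality column in the comparison table should be read.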
![Page 28: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/28.jpg)
Genome Assembly
![Page 29: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/29.jpg)
Whole Genome Shotgun Sequencing
Slide credit: cbcb.umd.edu
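A toy simulation of whole genome shotgun sequencing, as a sketch only: reads are drawn from uniformly random start positions on a made-up genome, with no sequencing errors and a single strand. The names `shotgun_reads` and `expected_coverage` and all parameter values are hypothetical. The coverage formula (number of reads x read length / genome length) is the usual average depth estimate.

```python
import random


def shotgun_reads(genome, read_length, n_reads, seed=0):
    """Sample n_reads fixed-length reads from random positions in genome."""
    rng = random.Random(seed)
    last_start = len(genome) - read_length
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    return [genome[s:s + read_length] for s in starts]


def expected_coverage(n_reads, read_length, genome_length):
    """Average depth: total bases sequenced divided by genome length."""
    return n_reads * read_length / genome_length


# Hypothetical 1 kb genome, 50 bp reads, 200 reads.
rng = random.Random(1)
genome = "".join(rng.choice("ACGT") for _ in range(1000))
reads = shotgun_reads(genome, read_length=50, n_reads=200)
print(len(reads), "reads, expected coverage",
      expected_coverage(len(reads), 50, len(genome)), "x")
```

The assembler only ever sees the reads, not their positions, and recovering the layout from overlaps is the assembly problem.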
![Page 30: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/30.jpg)
Genome Sequencing Strategies
Slide credit: Aureliano Bombarely
![Page 31: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/31.jpg)
Genome Sequencing Strategies
International Human Genome Sequencing Consortium 2001
Overlap Layout Consensus
http://contig.wordpress.com/
cbcb.umd.edu
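The idea behind Overlap Layout Consensus can be illustrated with a toy greedy merger in Python. It is a deliberate simplification, not what production assemblers do: real OLC assemblers build an overlap graph, tolerate sequencing errors, and compute the consensus from a multiple alignment. The function names and the `min_overlap` threshold are made up for this sketch.

```python
def overlap_length(a, b, min_overlap):
    """Longest suffix of a that equals a prefix of b (0 if below min_overlap)."""
    for length in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of sequences with the longest exact overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        # Overlap step: score every ordered pair of sequences.
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    olen = overlap_length(a, b, min_overlap)
                    if olen > best_len:
                        best_len, best_i, best_j = olen, i, j
        if best_len == 0:
            return contigs
        # Layout and consensus collapsed into one step: join the best pair.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTA"]))
```

The all-against-all overlap step is quadratic in the number of reads, which is one reason this approach suits fewer, longer reads better than billions of short ones.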
![Page 32: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/32.jpg)
De Bruijn Graph
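A minimal de Bruijn graph construction in Python, as a sketch under simplifying assumptions: error-free reads, a single strand, and k-mer multiplicity ignored. Nodes are (k-1)-mers and each distinct k-mer adds one edge from its prefix to its suffix. The helper `walk_unambiguous` is a hypothetical name, and it only follows nodes with a single outgoing edge, which is cruder than the graph simplification real assemblers perform.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some k-mer."""
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def walk_unambiguous(graph, start):
    """Spell a contig by following edges while exactly one successor exists."""
    contig, node, seen = start, start, {start}
    while len(graph.get(node, [])) == 1:
        node = graph[node][0]
        if node in seen:  # stop on a cycle, e.g. inside a repeat
            break
        seen.add(node)
        contig += node[-1]
    return contig


graph = de_bruijn_graph(["ATGGC", "TGGCA"], k=3)
print(graph)
print(walk_unambiguous(graph, "AT"))
```

Unlike the overlap approach, no read-against-read comparison is needed: reads are only chopped into k-mers, which is why this representation scales to very large numbers of short reads.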
![Page 33: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/33.jpg)
Ingredients for a Good Assembly
Slide credit: Mike Schatz
![Page 34: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/34.jpg)
![Page 35: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/35.jpg)
Bird Snake
![Page 36: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/36.jpg)
The BEST?? Genome Assembler for YOU is the one that:
• You have the expertise to install and run
• You have the suitable infrastructure (CPU & RAM) to run
• You have sufficient time to run
• Is designed to work with the specific mix of NGS data that you have generated
• Best addresses what you want to get out of a genome assembly (bigger overall assembly, more genes, most accuracy, longer scaffolds, most resolution of haplotypes, most tolerant of repeats, etc.)
http://haldanessieve.org/2013/01/28/our-paper-making-pizzas-and-genome-assemblies/
![Page 37: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/37.jpg)
![Page 38: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/38.jpg)
Which technology to use??
• Microbial genomes
• Eukaryotic genomes
• Resequencing genomes
• RNAseq and other XXXseq methods
http://bit.ly/1ko9Kgh
![Page 39: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/39.jpg)
SOL Genomics Network
![Page 40: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/40.jpg)
![Page 41: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/41.jpg)
The SGN Team!!
Surya Saha, Tom Fisher-York, Hartmut Foerster, Suzy Strickler, Jeremy Edwards,
Noe Fernandez, Naama Menda, Aure Bombarely, Aimin Yan, Isaak Tecle
![Page 42: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/42.jpg)
SGN Website
http://solgenomics.net
![Page 43: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/43.jpg)
Main web page (front page):
WEB ICONS
TOOL BAR
![Page 44: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/44.jpg)
Main web page (front page):
TOOL BAR
(MENUS)
![Page 45: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/45.jpg)
But the DATA can also be edited
Locus Locus Editor Data
Community Data Curation
![Page 46: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/46.jpg)
You need:
• An SGN account
• Submitter / Locus Editor privileges activated by an SGN curator
Locus Locus Editor Data
![Page 47: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/47.jpg)
Tools
![Page 48: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/48.jpg)
Genome Browser: GBrowse
![Page 49: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/49.jpg)
Genome Browser: JBrowse
![Page 50: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/50.jpg)
![Page 51: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/51.jpg)
CassavaBase
http://cassavabase.org/
Slide credit: Jeremy Edwards
![Page 52: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/52.jpg)
NextGen Cassava Project
● Project: Adapt SGN database for Cassava Breeding
● Goal: Apply Genomic Selection to cassava breeding
● Predict breeding values from genotype information
● Shorten the breeding cycle
● Massive amounts of genotypic data (GBS)
● Phenotypic data
● Data management challenge
● Improve flowering
● http://nextgencassava.org
Slide credit: Jeremy Edwards
![Page 53: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/53.jpg)
SGN/Cassavabase behind the scenes
● Perl/Catalyst MVC Framework
● PostgreSQL Database
● Generic Model Organism Database (GMOD)
– Chado relational database schema
– GBrowse
– JBrowse
● R
– Experimental design
– QTL mapping
– Genomic selection
Slide credit: Jeremy Edwards
![Page 54: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/54.jpg)
Objectives
Provide cassava breeders and researchers access to data and tools in a centralized, user-friendly and reliable database.
– Improve partner breeding program information tracking
– Streamline management of genotypic and phenotypic data
– Pipeline genotypic and phenotypic data through Genomic Selection prediction analyses
Slide credit: Jeremy Edwards
![Page 55: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/55.jpg)
Genomic Selection
The 'training population' is genotyped and phenotyped to 'train' the genomic selection (GS) prediction model. Genotypic information from the breeding material is then fed into the model to calculate genomic estimated breeding values (GEBV) for these lines. From Heffner et al. 2009 Crop Sci. 49:1–12
Information from a majority of lines in the breeding population (the training set) is used to create the prediction model. The model is then used to predict the phenotypes of the remaining lines (the validation set), using genotypic information only. The results from the model are compared to the actual data to give the prediction accuracy. Image courtesy of Martha Hamblin, Cornell University
Flow diagram of a genomic selection breeding program. Breeding cycle time is shortened by removing phenotypic evaluation of lines before selection as parents for the next cycle. From Heffner et al. 2009 Crop Sci. 49:1–12
Slide credit: Jeremy Edwards
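The train / predict / validate loop described in these captions can be sketched in plain Python. Everything here is an assumption made for illustration: the data are simulated, marker genotypes are coded 0/1/2, and the prediction model is simple ridge regression on marker effects standing in for whatever model a real genomic selection pipeline uses. It is not the Cassavabase implementation.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def train_marker_effects(genotypes, phenotypes, ridge=1.0):
    """Ridge regression: solve (X'X + ridge * I) beta = X'y for marker effects."""
    p = len(genotypes[0])
    xtx = [[sum(row[i] * row[j] for row in genotypes) + (ridge if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(genotypes, phenotypes)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_gebv(genotypes, effects):
    """Genomic estimated breeding value = sum of marker score x marker effect."""
    return [sum(g * b for g, b in zip(row, effects)) for row in genotypes]


def pearson(x, y):
    """Pearson correlation, used here as the prediction accuracy."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


# Simulated breeding population: 50 lines, 5 markers (all values hypothetical).
rng = random.Random(42)
TRUE_EFFECTS = [1.0, -0.5, 0.8, 0.3, -1.2]
lines = [[rng.randint(0, 2) for _ in TRUE_EFFECTS] for _ in range(50)]
phenos = [sum(g * b for g, b in zip(row, TRUE_EFFECTS)) + rng.gauss(0, 0.1)
          for row in lines]

# Training set: genotyped and phenotyped. Validation set: predicted from genotype only.
train_x, valid_x = lines[:40], lines[40:]
train_y, valid_y = phenos[:40], phenos[40:]
effects = train_marker_effects(train_x, train_y)
accuracy = pearson(predict_gebv(valid_x, effects), valid_y)
print(f"prediction accuracy on the validation set: {accuracy:.3f}")
```

In a breeding program the validation phenotypes would not be available at selection time, and skipping that phenotyping step is what shortens the cycle.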
![Page 56: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/56.jpg)
Data collection in the field
● Android tablets
● Field book app
– Jesse Poland's group at
USDA-ARS / Kansas
State University
Slide credit: Jeremy Edwards
![Page 57: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/57.jpg)
Cassava Trait Ontology
Kulakow et al. 2011
● Standard terminology
● Facilitate the sharing of information
● Allow users to query keywords related to traits
Slide credit: Jeremy Edwards
![Page 58: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/58.jpg)
Position available at Solgenomics
Cassavabase project
Plant Breeding + Bioinformatician
● Familiar with breeding
● Programming in Perl, R, SQL, Hadoop
● Linux
● Africa
● Genius
http://www.cassavabase.org/forum/posts.pl?topic_id=9
![Page 59: Sequencing, Genome Assembly and the SGN Platform](https://reader035.vdocuments.net/reader035/viewer/2022062616/54b1d81e4a79595f7b8b45b2/html5/thumbnails/59.jpg)
Thank you!! Questions??