MAKER 2014 What It Is Where It’s Been Where It’s Going Daniel Ence Yandell Lab University of Utah

Upload: teigra

Post on 24-Feb-2016



TRANSCRIPT

Page 1: MAKER 2014 What It Is Where It’s Been Where It’s Going

MAKER 2014
What It Is
Where It’s Been
Where It’s Going

Daniel Ence
Yandell Lab
University of Utah


What Are Annotations?

• Annotations are descriptions of features of the genome
  • Structural: exons, introns, UTRs, splice forms, etc.
  • Coding & non-coding genes
• Annotations should include an evidence trail
  • Assists in quality control of genome annotations
• Examples of evidence supporting a structural annotation:
  • Ab initio gene predictions
  • ESTs
  • Protein homology
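One way to picture an annotation that carries its evidence trail is as a small record that stores the gene structure next to the evidence supporting it. The sketch below is illustrative only: the class names, field names, and coordinates are hypothetical and are not MAKER's data model. It counts how many exons of a gene are overlapped by at least one piece of evidence, which is the kind of check an evidence trail makes possible for quality control.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    kind: str    # e.g. "ab_initio", "est", "protein" (hypothetical labels)
    source: str  # hypothetical name of the alignment or prediction
    start: int   # inclusive genomic coordinates
    end: int


@dataclass
class Exon:
    start: int
    end: int


@dataclass
class GeneAnnotation:
    gene_id: str
    exons: List[Exon]
    evidence: List[Evidence] = field(default_factory=list)

    def supported_exons(self) -> int:
        """Count exons overlapped by at least one piece of evidence."""
        return sum(
            any(ev.start <= ex.end and ex.start <= ev.end for ev in self.evidence)
            for ex in self.exons
        )


# Made-up example: three exons, evidence overlapping only the first two.
gene = GeneAnnotation(
    gene_id="gene-0001",
    exons=[Exon(100, 250), Exon(400, 520), Exon(800, 950)],
    evidence=[
        Evidence("est", "est-1", 90, 260),
        Evidence("protein", "prot-1", 390, 500),
    ],
)

print(gene.gene_id, "exons with support:", gene.supported_exons(), "of", len(gene.exons))
```

An exon with no supporting evidence, like the third one here, is exactly what a curator would want flagged for review.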


Secondary Annotation

• Protein Domains and Families
  • InterPro
  • Pfam
• GO and other ontologies
• Pathways


Genome Project Overview


>Smg5
MEVTFSSGGSSNASSECAIDGGTNRCRGLEPNNGTCILSQEVKDLYRSLYTASKQLDDAKRNVQSVGQLFQHEIEEKRSLLVQLCKQIIFKDYQSVGKKVREVMWRRGYYEFIAFVSUCCESS



MAKER

An annotation pipeline and genome-database management tool for “next-generation” genome projects


MAKER

• User Requirements: Can be run by a single individual with little bioinformatics experience
• System Requirements: Can run on Linux or Mac OS X based systems
• Program Output: Output is compatible with popular annotation tools like Web-Apollo and JBrowse
• Availability: Free for the academic community (including source code)


Beyond de novo annotation

• mRNA-seq integration

• Integrating new evidence into existing databases

• Update/revise legacy annotation sets


Beyond de novo annotation

[Diagram: Legacy Annotation Set 1, Legacy Annotation Set 2, ... Legacy Annotation Set n; new data; current assembly]

• Identify legacy annotation most consistent with new data
• Automatically revise it in light of new data
• If no existing annotation, create new one
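The selection step in the three bullets above can be sketched in a few lines. This is a minimal illustration only, not MAKER's implementation. It scores each legacy gene model by the Jaccard overlap between its exon bases and the bases covered by new evidence, a stand-in measure chosen for this example. It keeps the best-scoring model, and falls back to a new model built from the evidence when nothing overlaps. The function names and the scoring are assumptions, and the automatic revision step is not modeled.

```python
from typing import List, Set, Tuple

Interval = Tuple[int, int]  # inclusive (start, end) coordinates


def bases(intervals: List[Interval]) -> Set[int]:
    """Return the set of base positions covered by a list of intervals."""
    covered: Set[int] = set()
    for start, end in intervals:
        covered.update(range(start, end + 1))
    return covered


def consistency(model: List[Interval], evidence: List[Interval]) -> float:
    """Jaccard overlap between model bases and evidence bases (stand-in score)."""
    m, e = bases(model), bases(evidence)
    union = m | e
    return len(m & e) / len(union) if union else 0.0


def choose_model(legacy_models: List[List[Interval]],
                 evidence: List[Interval]) -> Tuple[str, List[Interval]]:
    """Pick the legacy model most consistent with the evidence, else build a new one."""
    best = max(legacy_models, key=lambda m: consistency(m, evidence), default=None)
    if best is None or consistency(best, evidence) == 0.0:
        return "new", sorted(evidence)
    return "legacy", best


# Made-up example: two legacy models for one locus and new evidence.
set1 = [(100, 200), (300, 400)]
set2 = [(100, 200), (300, 450)]
new_evidence = [(100, 200), (300, 450)]

print(choose_model([set1, set2], new_evidence))
print(choose_model([set1, set2], [(1000, 1100)]))
```

Base-level sets keep the example short; for real genomes an interval-based overlap computation would be the practical choice.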


Distributed Parallelization

• Supports the Message Passing Interface (MPI), a message-passing standard for computer clusters that lets many computers work together on one job, as if they were a single powerful machine.
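To give a feel for what a distributed run has to do, the sketch below splits a set of contigs among workers so that each worker receives a similar number of bases. It is plain Python with no MPI calls, and it is not how MAKER schedules its work, which the slides do not describe. The contig names, lengths, and the longest-first greedy strategy are all assumptions made for illustration.

```python
import heapq
from typing import Dict, List


def assign_contigs(lengths: Dict[str, int], n_workers: int) -> List[List[str]]:
    """Greedy longest-first assignment of contigs to the least-loaded worker."""
    # Heap of (bases assigned so far, worker index).
    heap = [(0, worker) for worker in range(n_workers)]
    heapq.heapify(heap)
    bins: List[List[str]] = [[] for _ in range(n_workers)]
    # Visit contigs from longest to shortest.
    for name, length in sorted(lengths.items(), key=lambda kv: kv[1], reverse=True):
        load, worker = heapq.heappop(heap)
        bins[worker].append(name)
        heapq.heappush(heap, (load + length, worker))
    return bins


# Made-up contig lengths (in kb) split across two workers.
contigs = {"c1": 50, "c2": 30, "c3": 20, "c4": 10}
plan = assign_contigs(contigs, 2)
for worker, names in enumerate(plan):
    print("worker", worker, names, sum(contigs[n] for n in names), "kb")
```

In an MPI setting each worker would correspond to one rank, and the point of balancing is that no rank sits idle while another finishes a much larger share.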


Data throughput


What happened in 2013?

MAKER-P
• Plant
• Parallelized
• Publication


What happened in 2013?

Publication: MAKER-P: a tool-kit for the rapid creation, management, and quality control of plant genome annotations
Campbell, Law, Holt et al., Plant Phys. 2013


MAKER-P at iPlant

• Atmosphere
  • MPI enabled for parallel computation
  • Maximum instance size 16 CPU
  • http://www.iplantcollaborative.org
• TACC Lonestar Supercomputer with 22,656 CPU
  • MPI enabled for parallel computation
  • Can complete entire rice genome in ~2 hrs (1,152 cores)
  • 96 CPU per chromosome
• Currently being integrated into the iPlant Discovery Environment
  • http://www.iplantcollaborative.org
• XSEDE
  • https://www.xsede.org
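The core counts quoted for the rice run are consistent with each other, as a quick check shows. The only number below that does not come from the slide is the chromosome count of 12 for rice, which is added here as an assumption to connect "96 CPU per chromosome" with "1,152 cores".

```python
# Figures from the slide.
CPUS_PER_CHROMOSOME = 96
TOTAL_CORES_USED = 1152
LONESTAR_CPUS = 22656

# Assumption (not on the slide): rice has 12 chromosomes.
RICE_CHROMOSOMES = 12

cores_needed = RICE_CHROMOSOMES * CPUS_PER_CHROMOSOME
fraction_of_lonestar = TOTAL_CORES_USED / LONESTAR_CPUS

print("cores needed:", cores_needed)
print("share of Lonestar: {:.1%}".format(fraction_of_lonestar))
```

Under that assumption the run used one block of 96 CPUs per chromosome, and only a small share of the machine.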


Data throughput

Performance on the Zea mays (maize) genome (~2 Gb)


Pinus taeda

• 8,640 CPUs on TACC
• ~37 hours with queue (runtime 14 hours 37 minutes)
• Throughput of > 1 Gb/hour
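The figures on this slide can be related with a little arithmetic. The CPU count, runtime, and wall time with queue are from the slide. The assembly size of roughly 20 Gb is an assumption added here, since the slide does not state it, so the resulting throughput is only an estimate that can be compared with the quoted "> 1 Gb/hour".

```python
# Figures from the slide.
CPUS = 8640
RUNTIME_HOURS = 14 + 37 / 60      # 14 hours 37 minutes
WALL_HOURS_WITH_QUEUE = 37        # approximate, per the slide

# Assumption (not on the slide): assembly size of about 20 Gb.
ASSUMED_ASSEMBLY_GB = 20.0

cpu_hours = CPUS * RUNTIME_HOURS
queue_hours = WALL_HOURS_WITH_QUEUE - RUNTIME_HOURS
throughput_gb_per_hour = ASSUMED_ASSEMBLY_GB / RUNTIME_HOURS

# A throughput above 1 Gb/hour implies at least this much sequence was processed.
min_gb_implied = 1.0 * RUNTIME_HOURS

print("CPU-hours: {:.0f}".format(cpu_hours))
print("time waiting in queue: {:.1f} h".format(queue_hours))
print("estimated throughput: {:.2f} Gb/h".format(throughput_gb_per_hour))
print("sequence implied by > 1 Gb/h: more than {:.1f} Gb".format(min_gb_implied))
```

Most of the 37 hours was spent waiting in the queue rather than computing, which is why the slide reports the two times separately.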


Assembly & Annotation at iPlant


Added to MAKER-P

• non-coding RNA support
• better repeat annotation
• better pseudogene annotation


non-coding RNA annotation

• tRNAscan support
  • Will run from inside MAKER
  • Doesn’t install automatically
• snoScan support
  • Can supply data file for annotation
  • Will run from inside MAKER automatically
  • Doesn’t install automatically


Better Repeat Annotation

• In the past:
  • Custom repeat library
  • de novo generated
  • RepeatModeler
• Now:
  • RepeatModeler, but better
  • Step-by-step guide available at: http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction--Basic
  • To be automated in the future


What’s Coming in 2014?

• Expanded ncRNA support
• MAKER-EVM
• Expanded Augustus/BAM support
• Better integration with iPlant’s Discovery Environment


Expanded ncRNA annotation

• More of a feeling than a to-do list
• lncRNAs


MAKER Evidence Modeler

Haas et al., Genome Biology 2008


MAKER Evidence Modeler

Cantarel et al., 2008; Holt and Yandell, 2010

EVM


Better Augustus support

• MAKER gives Augustus hints
• Augustus can take better hints from a BAM file
• Users will be able to supply a BAM file in the MAKER control file
• BAM files open up a world of possibilities!


Assembly & Annotation at iPlant


Future Annotations

• Trichomonas vaginalis
• Pinus taeda
• Apis dorsata
• Cronartium quercuum
• Common Pigeon
• Cardiocondyla obscurior
• Southern right whale
• Tardigrade
• Spotted Gar
• Gibbon
• Turkey
• 9 spined stickleback
• Golden Eagle


Acknowledgements

• I’d like to thank and recognize all contributions from Mark Yandell at the University of Utah, as well as lab members Barry Moore, Michael Campbell, Daniel Ence, and former lab member Meiyee Law.
• Special thank you to Scott Cain, Robert Buels, and Amelia Ireland.
• I would also like to recognize collaborator Ian Korf at UC Davis.
• MAKER-P and integration into iPlant infrastructure:
  • Josh Stein (CSHL)
  • Kevin Childs (MSU)
  • Gaurav Moghe (MSU)
  • David Hufnagel (MSU)
  • Jikai Lei (MSU)
  • Rujira Achawanantakun (MSU)
  • Carolyn Lawrence (USDA-ARS CICGRU)
  • Doreen Ware (CSHL)
  • Shin-Han Shiu (MSU)
  • Yanni Sun (MSU)
  • Ning Jiang (MSU)
  • Matt Vaughn (TACC)
  • Dian Jiao (TACC)
  • Zhenyuan Lu (CSHL)
  • Nirav Merchant (U. Arizona)
• Pinus taeda genome project:
  • Jill Wegrzyn (UConn)
  • John Liechty (UC Davis)
  • Kristian Stevens (UC Davis)
  • Carol Loopstra (Texas A&M)
  • Hans Vasquez-Gross (UC Davis)
  • Brian Lin (UC Davis)
  • Matt Dougherty (UC Davis)
  • Jacob Zieve (UC Davis)
  • Pedro J Martinez-Garcia (UC Davis)
  • James A Yorke (U. Maryland)
  • Marc Crepeau (UC Davis)
  • Daniela Puiu (Johns Hopkins)
  • Steven L Salzberg (Johns Hopkins)
  • Pieter J. deJong (CHORI-BACPAC Resources Center)
  • Keithanne Mockaitis (Indiana University)
  • Dorrie Main (Washington State)
  • Chuck Langley (UC Davis)
  • David Neale (UC Davis)
• MAKER-devel community
• Funding from the NHGRI through an R01 grant entitled Software for the creation and quality control of genome annotations.


Get in Touch!

Mailing List: maker-devel at yandell-lab.org
Download: http://yandell-lab.org/software/maker.html
Email me: dence at genetics.utah.edu