Metagenomics and its applications

Metagenomics COMPUTATIONAL BIOLOGY

Upload: shamsadiq

Post on 16-Apr-2017

Category:

Science


TRANSCRIPT

Page 1: Metagenomics and it’s applications

Metagenomics COMPUTATIONAL BIOLOGY

Page 2: Metagenomics and it’s applications

Metagenomics

Metagenomics is the study of genetic material recovered directly from environmental samples.

While traditional microbiology, microbial genome sequencing and genomics rely upon cultivated clonal cultures, early environmental gene sequencing cloned specific genes to produce a profile of diversity in a natural sample.

Page 5: Metagenomics and it’s applications

TWO APPROACHES FOR METAGENOMICS

In the first approach, known as ‘sequence-driven metagenomics’, DNA from the environment of interest is sequenced and subjected to computational analysis.

The metagenomic sequences are compared to sequences deposited in publicly available databases such as GenBank.

The genes are then collected into groups of similar predicted function, and the distribution of various functions and types of proteins that conduct those functions can be assessed.
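The grouping step can be illustrated with a minimal Python sketch. It assumes each metagenomic gene has already been given a best database hit with a predicted function; the gene IDs, function labels and helper names below are hypothetical and do not come from the slides or from any particular tool.

```python
from collections import Counter, defaultdict

# Hypothetical best-hit annotations: (metagenomic gene ID, predicted function).
# In practice these would come from a similarity search against a public database.
best_hits = [
    ("gene_001", "ABC transporter"),
    ("gene_002", "lipase"),
    ("gene_003", "ABC transporter"),
    ("gene_004", None),  # no database match: function unknown
    ("gene_005", "amylase"),
]

def group_by_function(hits):
    """Collect gene IDs into groups that share a predicted function."""
    groups = defaultdict(list)
    for gene_id, function in hits:
        groups[function or "unknown"].append(gene_id)
    return dict(groups)

def function_distribution(hits):
    """Return the fraction of genes assigned to each predicted function."""
    counts = Counter(function or "unknown" for _, function in hits)
    total = sum(counts.values())
    return {function: n / total for function, n in counts.items()}

groups = group_by_function(best_hits)
distribution = function_distribution(best_hits)
for function, fraction in sorted(distribution.items()):
    print(f"{function}: {fraction:.0%} ({len(groups[function])} genes)")
```

Genes without a database match end up in an "unknown" group, which is the practical face of the sequence-driven approach's dependence on existing knowledge.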

Page 6: Metagenomics and it’s applications

Cont..

In the second approach, ‘function-driven metagenomics’, the DNA extracted from the environment is captured and stored in a surrogate host, but instead of sequencing it, scientists screen the captured fragments of DNA, or ‘clones’, for a certain function.

The function must be absent in the surrogate host so that acquisition of the function can be attributed to the metagenomic DNA.

Page 7: Metagenomics and it’s applications

LIMITATIONS OF THE TWO APPROACHES

The sequence-driven approach is limited by existing knowledge: if a metagenomic gene does not look like a gene of known function deposited in the databases, then little can be learned about the gene or its product from sequence alone.

The function-driven approach is limited because most genes from organisms in wild communities cannot be expressed easily by a given surrogate host.

Page 8: Metagenomics and it’s applications

How it is used in bioinformatics:

Sequence pre-filtering

The first step of metagenomic data analysis requires the execution of certain pre-filtering steps, including the removal of redundant sequences, low-quality sequences and sequences of probable eukaryotic origin.

The methods available for the removal of contaminating eukaryotic genomic DNA sequences include Eu-Detect and DeconSeq.
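As a rough illustration of the first two pre-filtering steps (redundancy and quality), here is a minimal Python sketch. It assumes reads are already parsed into (ID, sequence, quality string) tuples with Phred+33 quality encoding; the function names, the threshold of 20 and the example reads are assumptions made for illustration. It does not attempt eukaryotic contamination removal, which is what tools such as Eu-Detect and DeconSeq address.

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)

def prefilter(reads, min_mean_quality=20):
    """Drop exact duplicate sequences and reads whose mean quality is too low.

    `reads` is an iterable of (read_id, sequence, quality_string) tuples.
    """
    seen = set()
    kept = []
    for read_id, sequence, quality in reads:
        if not sequence or len(sequence) != len(quality):
            continue  # malformed record
        if sequence in seen:
            continue  # redundant read
        if mean_phred(quality) < min_mean_quality:
            continue  # low-quality read
        seen.add(sequence)
        kept.append((read_id, sequence, quality))
    return kept

# Hypothetical example reads.
reads = [
    ("r1", "ACGTACGT", "IIIIIIII"),  # high quality, kept
    ("r2", "ACGTACGT", "IIIIIIII"),  # exact duplicate of r1
    ("r3", "TTGGCCAA", "!!!!!!!!"),  # very low quality
    ("r4", "GGGGCCCC", "55555555"),  # mean quality exactly at the threshold
]
kept = prefilter(reads)
print([read_id for read_id, _, _ in kept])
```

Only exact duplicates are treated as redundant here; real pipelines often also collapse near-identical reads.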

Page 9: Metagenomics and it’s applications

Comparative metagenomics

Comparative analyses between metagenomes can provide additional insight into the function of complex microbial communities and their role in host health.

 Pairwise or multiple comparisons between metagenomes can be made at the level of sequence composition (comparing GC-content or genome size), taxonomic diversity, or functional complement.
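A comparison at the level of sequence composition can be sketched in a few lines of Python. The two "metagenomes" below are tiny made-up read sets, and `gc_content` is a hypothetical helper, so this is only meant to show the kind of calculation involved.

```python
def gc_content(sequences):
    """Fraction of G and C bases across all sequences, ignoring ambiguous bases."""
    gc = at = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        at += seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0

# Hypothetical reads from two samples.
sample_a = ["GGCCGGCC", "GCGCATAT"]
sample_b = ["ATATATAT", "AATTGGCC"]

gc_a = gc_content(sample_a)
gc_b = gc_content(sample_b)
print(f"GC content A: {gc_a:.2f}, B: {gc_b:.2f}, difference: {abs(gc_a - gc_b):.2f}")
```

Comparisons of taxonomic diversity or functional complement follow the same pattern: compute a per-sample summary, then compare the summaries across samples.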

Page 10: Metagenomics and it’s applications

Cont..

Consequently, metadata on the environmental context of the metagenomic sample is especially important in comparative analyses, as it provides researchers with the ability to study the effect of habitat upon community structure and function.

Page 11: Metagenomics and it’s applications

Metagenomics

Page 12: Metagenomics and it’s applications

Bioinformatics for Whole-Genome Shotgun Sequencing of Microbial Communities

AUTHORS:

1. KEVIN CHEN 2. LIOR PACHTER

PUBLISHED: JULY 12, 2005 CITATION: 126

Page 13: Metagenomics and it’s applications

Shotgun Sequencing

Shotgun sequencing involves randomly breaking up DNA sequences into lots of small pieces and then reassembling the sequence by looking for regions of overlap.

Large mammalian genomes are complex and difficult to clone. Clone-by-clone sequencing, although reliable and methodical, is time-consuming.

Shotgun sequencing was used by Fred Sanger and his colleagues to sequence small genomes such as those of viruses and bacteria.

Fragments are often of varying sizes, ranging from 2-20 kilobases to 200-300 kilobases.
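The idea of reassembling a sequence by looking for regions of overlap can be shown with a toy greedy assembler in Python. This is a deliberately simplified sketch: it assumes error-free fragments from a single strand, uses a hypothetical minimum overlap of 3 bases, and is not how production assemblers work.

```python
def overlap(a, b, min_length=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for length in range(min(len(a), len(b)), min_length - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0

def greedy_assemble(fragments, min_length=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    length = overlap(a, b, min_length)
                    if length > best[0]:
                        best = (length, i, j)
        length, i, j = best
        if length == 0:
            break  # no remaining overlaps; leave the contigs separate
        merged = contigs[i] + contigs[j][length:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]
    return contigs

# Hypothetical fragments sampled from one short sequence.
fragments = ["ATGGCGT", "GCGTACC", "TACCTGA"]
contigs = greedy_assemble(fragments)
print(contigs)
```

Repetitive sequence breaks this simple scheme, because a repeat creates overlaps between fragments that are not actually adjacent in the genome.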

Page 14: Metagenomics and it’s applications
Page 15: Metagenomics and it’s applications

Advantages of shotgun sequencing:

By removing the mapping stages, it is a much faster process than clone-by-clone sequencing.

Uses a fraction of the DNA that clone-by-clone sequencing needs.

Efficient if there is an existing reference sequence.

Easier to assemble the genome sequence by aligning it to an existing reference genome.

Faster and less expensive than methods requiring a genetic map.

Page 16: Metagenomics and it’s applications

Disadvantages of shotgun sequencing

Vast amounts of computing power and sophisticated software are required to assemble shotgun sequences together.

Errors in assembly are more likely to be made because a genetic map is not used. However, these errors are generally easier to resolve than in other methods, and they are minimized if a reference genome can be used.

Best carried out if a reference genome is already available; otherwise assembly is very difficult without an existing genome to match it to.

Repetitive genomes and sequences can be more difficult to assemble.

Page 17: Metagenomics and it’s applications

Assembling Communities

The assembly of communities has strong similarities to the assembly of highly polymorphic diploid eukaryotes, such as Ciona savignyi and Candida albicans, if we view prokaryotic strains as analogous to eukaryotic haplotypes.

The main difference is that in a microbial community, the number of strains is unknown and potentially large, and their relative abundance is also unknown and potentially skewed, while in most eukaryotes we know a priori the number of haplotypes and their relative abundance.

This disadvantage is mitigated somewhat by the small size and relative lack of repetitive sequence in prokaryotic and viral genomes, so that the issue of distinguishing alleles from paralogs and polymorphism from repetitive sequence is less acute.

Page 18: Metagenomics and it’s applications

We performed similar calculations for the three whale fall communities. In addition, we considered the problem of assembling all genomes in these communities.

Since the 16S survey indicated that three dominant species constitute approximately half the total abundance and all other species have roughly equal abundance, the Lander–Waterman model implies that the expected coverage should be distributed as the mixture of two Poissons with equal weight.

The results of these calculations are summarized. Similar results were obtained by Venter et al. and Breitbart et al., and bioinformaticians use different software for these analyses.
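The coverage reasoning can be made concrete with a small Python sketch of the Lander–Waterman expectation, c = N·L·a/G, and a two-component Poisson mixture. Every number below (read count, read length, genome size, the count of 30 rare species) is a made-up illustration rather than a value from the paper, and the sketch assumes all genomes have the same size so that a species' share of reads equals its relative abundance.

```python
import math

def expected_coverage(num_reads, read_length, genome_size, abundance=1.0):
    """Lander-Waterman expected coverage c = N * L * a / G for one genome."""
    return num_reads * read_length * abundance / genome_size

def poisson_pmf(k, rate):
    """Probability that a base is covered by exactly k reads at mean depth `rate`."""
    return math.exp(-rate) * rate ** k / math.factorial(k)

def mixture_pmf(k, rates, weights):
    """Coverage distribution for a mixture of Poisson components."""
    return sum(w * poisson_pmf(k, r) for r, w in zip(rates, weights))

# Hypothetical sequencing effort and community structure.
NUM_READS = 100_000
READ_LENGTH = 700
GENOME_SIZE = 2_000_000
NUM_RARE_SPECIES = 30  # assumption: the non-dominant half is split evenly

# Three dominant species share half the abundance; the rest share the other half.
c_dominant = expected_coverage(NUM_READS, READ_LENGTH, GENOME_SIZE, 0.5 / 3)
c_rare = expected_coverage(NUM_READS, READ_LENGTH, GENOME_SIZE, 0.5 / NUM_RARE_SPECIES)

rates, weights = [c_dominant, c_rare], [0.5, 0.5]
p_uncovered = mixture_pmf(0, rates, weights)
print(f"coverage (dominant): {c_dominant:.2f}x, (rare): {c_rare:.2f}x")
print(f"probability a sampled position has zero coverage: {p_uncovered:.3f}")
```

With skewed abundance, the dominant genomes reach useful depth long before the rare ones do, which is why coverage in a community cannot be summarized by a single Poisson rate.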

Page 19: Metagenomics and it’s applications
Page 20: Metagenomics and it’s applications

Whole genome shotgun sequencing guided by bioinformatics pipelines—an optimized approach for an established technique

Shotgun metagenomics sequencing allows researchers to comprehensively sample all genes in all organisms present in a given complex sample. The method enables microbiologists to evaluate bacterial diversity and detect the abundance of microbes in various environments. Shotgun metagenomics also provides a means to study unculturable microorganisms that are otherwise difficult or impossible to analyze.

Page 21: Metagenomics and it’s applications

Phylogeny and Community Diversity

With regard to community diversity, one of the advantages of the WGS approach is that it is less biased than PCR, which is known to suffer from a host of problems.

Community modeling based on analysis of assembly data within the Lander–Waterman model is beginning to show that species abundance curves are not lognormal as previously thought.

New methods that take into account these naturally occurring distributions are needed.

Page 22: Metagenomics and it’s applications

Conclusion

The number of new community shotgun sequencing projects continues to grow, promising to provide vast quantities of sequence data for analysis.

Samples are being drawn from macroscopic environments such as the sea and air, as well as from more contained communities such as the human mouth.

Exciting advances in our understanding of ecosystems, environments, and communities will require creative solutions to numerous new bioinformatics problems.

We have briefly mentioned some of these: assembly (can co-assembly techniques be used to assemble polymorphic genomes and complex communities?), binning (what is the best way to combine diverse sources of information to bin scaffolds?), gene finding (how should gene finding programs, which were designed for complete genes and genomes, be adapted for low-coverage sequence?), fingerprinting (which clustering techniques are best suited for discovering novel pathways and functional groups that allow communities to adapt to their environments?), and MSA and phylogeny (how can we best construct trees and alignments from fragmented data?).

Page 23: Metagenomics and it’s applications

Countless more challenges will likely emerge as WGS sequencing approaches are used to tackle increasingly complex communities.

The reward for computational biologists who work on these problems will be the satisfaction of contributing to the grand enterprise of understanding the total diversity of life on our planet. 

Page 24: Metagenomics and it’s applications

A5-miseq

A5-miseq produces high-quality microbial genome assemblies on a laptop computer without any parameter tuning. It does this by automating steps such as adapter trimming and quality filtering.

Page 25: Metagenomics and it’s applications

Orione

Orione is a Galaxy-based framework consisting of publicly available research software and specifically designed pipelines to build complex, reproducible workflows for next-generation sequencing microbiology data analysis.

Enabling microbiology researchers to conduct their own custom analysis and data manipulation without software installation or programming, Orione provides new opportunities for data-intensive computational analyses in microbiology and metagenomics.

Page 26: Metagenomics and it’s applications

METAGENOMICS APPLICATIONS

• Successful products

• Antibiotics

• Antibiotic resistance pathways

• Anti-cancer drugs

• Degradation pathways – lipases, amylases, nucleases, hemolytic

Page 27: Metagenomics and it’s applications

Cont..

• Transport proteins

• Ecology and Environment

• Energy

• Bioremediation

• Biotechnology

• Agriculture

• Biodefence

Page 28: Metagenomics and it’s applications

Applications

● Global Impacts. The role of microbes is critical in maintaining atmospheric balances, as they are the main photosynthetic agents, are responsible for the generation and consumption of greenhouse gases, and are involved at all levels in ecosystems and trophic chains.

Page 29: Metagenomics and it’s applications

Applications

Bioremediation. Cleaning up environmental contamination, such as

● the waste from water treatment facilities

● gasoline leaks on lands or oil spills in the oceans

● toxic chemicals

Page 30: Metagenomics and it’s applications

Applications

● Bioenergy. We are harnessing microbial power in order to produce

● ethanol (from cellulose), hydrogen, methane, butanol...

● Smart Farming. Microbes help our crops by

● the “suppressive soil” phenomenon (buffer effect against disease-causing organisms)

● soil enrichment and regeneration

Page 31: Metagenomics and it’s applications

Applications

The World Within. Studying the human microbiome may lead to valuable new tools and guidelines in

● Human and animal nutrition

● Better understanding of complex diseases (obesity, cancer, asthma...)

● Drug discovery

● Preventative medicine

Page 32: Metagenomics and it’s applications

Applications

Mapping the human microbiome

Page 33: Metagenomics and it’s applications

Tools..

Page 34: Metagenomics and it’s applications

QIIME

QIIME is an open-source bioinformatics pipeline for performing microbiome analysis from raw DNA sequencing data.

QIIME is designed to take users from raw sequencing data generated on the Illumina or other platforms through publication quality graphics and statistics.

QIIME has been applied to studies based on billions of sequences from tens of thousands of samples.

Page 35: Metagenomics and it’s applications

Mothur

The Mothur project aims to develop a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community. Typical uses include:

Screening, processing, aligning and clustering of Sanger, 454 or Illumina (16S rRNA) amplicons

Generating a high-quality, effectively ‘normalized’ shared file (i.e. counts of OTUs per sample)

Gaining general taxonomic information about the OTUs in your study system (RDP Taxonomic Classifier)
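To make "counts of OTUs per sample" concrete, here is a minimal Python sketch that builds such a table from read assignments and scales it to relative abundances. It is not Mothur's file format or its normalization procedure; the sample names, OTU labels and function names are hypothetical.

```python
from collections import Counter, defaultdict

def build_otu_table(assignments):
    """Count reads per OTU per sample from (sample, otu) pairs."""
    table = defaultdict(Counter)
    for sample, otu in assignments:
        table[sample][otu] += 1
    return table

def relative_abundance(table):
    """Scale each sample's counts to fractions so samples are comparable."""
    return {
        sample: {otu: n / sum(counts.values()) for otu, n in counts.items()}
        for sample, counts in table.items()
    }

# Hypothetical read-to-OTU assignments for two samples.
assignments = [
    ("soil", "Otu1"), ("soil", "Otu1"), ("soil", "Otu2"),
    ("gut", "Otu2"),
]
table = build_otu_table(assignments)
fractions = relative_abundance(table)
for sample, otus in fractions.items():
    print(sample, {otu: round(f, 2) for otu, f in otus.items()})
```

Dividing by the per-sample total is only one simple way to make samples of different sequencing depth comparable; subsampling to a common depth is another.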

Page 36: Metagenomics and it’s applications

MEGAN

 

In metagenomics, the aim is to understand the composition and operation of complex microbial consortia in environmental samples through sequencing and analysis of their DNA. 

Page 37: Metagenomics and it’s applications

MG-RAST

Page 38: Metagenomics and it’s applications

CONT..

Page 39: Metagenomics and it’s applications

FUTURE OF METAGENOMICS

• To identify new enzymes & antibiotics

• To assess the effects of age, diet, and pathologic states (e.g., inflammatory bowel diseases, obesity, and cancer) on the distal gut microbiome of humans living in different environments

• Study of more exotic habitats

• Study antibiotic resistance in soil microbes

• Improved bioinformatics will quicken analysis for library profiling

Page 40: Metagenomics and it’s applications

Cont..

• Investigating ancient DNA remnants

• Discoveries such as phylogenetic tags (rRNA genes, etc.) will give momentum to the growing field

• Learning novel pathways will yield knowledge about currently nonculturable bacteria, which may then make it possible to culture them

Page 41: Metagenomics and it’s applications