Identification and Characterization of Gene Functions Involved in Recalcitrant Compound Degradation using Metagenomic Data

Södertörn University | Department of Life Sciences

Bachelor Thesis 15 ECTS | Molecular Biology | Fall Term 2011

Molecular Biosciences

By: Tino Lawson




Supervisor: Ass. Prof. Sara Sjöling


Abstract

With the environmental problems caused by man-induced pollution by

persistent toxic compounds, the importance of finding remediation solutions

is immense. As an emerging field, microbial environmental biotechnology

may provide the tools to achieve novel solutions. Microbial communities in

the environment have biodegradation capacities which could be, and

historically have been, exploited for bioremediation. The novelty lies in

being able to access the capacity of the uncultured majority of the microbial

community. Every day, more and more knowledge is gained in the field and

thanks to new approaches such as metagenomics, along with the access to

databases and archives where scientists share information and data, the

quest becomes considerably facilitated. Microorganisms are highly diverse

in metabolic pathways and some have become highly developed during

evolution; detoxification and biotransformation of naturally occurring toxic

compounds are therefore not novel concepts. The environmental problem

occurs when synthetically manufactured compounds are less efficiently

biodegraded. However, improved knowledge about the degradation

potential in nature and the involved enzymes may help in developing

bioremediation procedures. For this reason, an enzyme involved in catabolic

pathways of chlorinated aromatic compounds, dienelactone hydrolase,

which has been less well studied, was selected as a target. This study

investigated the biogeographical distribution of the dienelactone hydrolase

gene identified in metagenomes sampled from different environments

globally in order to detect potential environmental patterns. Results may

cast light on its significance for degradation of chlorinated aromatic

compounds in nature. The results indicate a broad biogeographical

distribution of dienelactone hydrolase in varying microbial habitats in the

environment. The enzyme was found in environments ranging from water

and soil habitats to hypersaline, dechlorinating, hot-spring and other extremophilic habitats, in which the gene sequences shared high similarity within each group. A broad environmental distribution suggests that dienelactone hydrolase could be useful in bioremediation.


Key words: dienelactone hydrolase, recalcitrant, biodegradation, bioremediation,

metagenomics

“Perfection is achieved, not when there is nothing more to add, but when there is nothing left

to take away” -Antoine de Saint-Exupéry


Table of Contents

I. Introduction
II. Objectives
III. Background & Literature review
3.1 Biodegradation
3.2 Halocarbons
3.3 Dienelactone Hydrolase
3.4 Metagenomics
3.5 Bioremediation for biotechnological applications
IV. Methodology
4.1 Integrated Microbial Genomes (IMG)
4.2 Protein data bank (PDB)
4.3 Basic Local Alignment Search Tool (BLAST)
4.4 Clustal W
4.5 Construction of phylogenetic trees
4.6 Methodological strengths & limitations
V. Results
VI. Discussion
VII. Concluding remarks
VIII. Acknowledgments
IX. Appendix
X. References
Articles & Literature
Figures-references
Databases


I. Introduction

The production of anthropogenic synthetic compounds for various purposes is inevitable in our industrialized world. Although these compounds are needed and therefore manufactured by the chemical industry, they can pose a significant hazard both to the environment and to human health (Klöppfer, 1994). This is due to their properties (e.g. lipophilicity or chemical stability), particularly if the compounds are xenobiotic. For example, various substituents could be added to a naturally occurring aromatic compound, making it toxic or giving rise to toxic components during biodegradation (Atlas & Bartha 1998, p. 511). Consequently, humans and other animals at the highest trophic level become subject to bioaccumulation and biomagnification (Ibanez et al. 2007, p. 232). Organisms at this level can therefore be severely affected without being exposed directly to toxic compounds, even when the environmental concentrations of these compounds are below the level of toxicity for the ecosystem in that particular area.

The majority of the synthetic compounds share enough similarity with

natural compounds so as to undergo microbial metabolism/degradation, i.e.

bioremediation (Dagley 1975, Atlas & Bartha 1998, p. 512). On the other

hand, recalcitrant xenobiotics have structures (and chemical bonds) that are

not recognized by the enzymes responsible for the degradative stages of the

microbial metabolism (Singh & Dwivedi 2004, p.60) and therefore resist

biodegradation, or are incompletely degraded, which leads to the

accumulation of these compounds in the environment. The main reasons

why a compound is recalcitrant are the presence of unusual substitutions

such as halogens (chlorine being the best example), unusual bonds, highly condensed aromatic rings and excessive molecular size/weight (Juhasz & Naidu, 2000).

Microorganisms may use xenobiotic compounds as a source of energy,

carbon, nitrogen or sulfur and therefore degradation of many xenobiotic

chemicals requires microbial communities (Fetzner, 2001). Millions of


organic compounds are derived from natural biosynthetic pathways that are

present in animals, plants and microorganisms along with industrial

synthesis. Microorganisms not only generate, but they also degrade some of

these compounds as a part of their key role in the biogeochemical cycles of

the Earth (Madsen, 2011).

In addition to biodegradation, there are other processes that contribute to the

general features of microbial degradation of xenobiotics i.e.

biotransformation and co-metabolism. These processes are of great

ecological significance. Taking advantage of the microbial capabilities in

order to safely remove toxic and persistent compounds from the

environment is what biotechnological bioremediation encompasses.

From a historical perspective, the accumulation of recalcitrant compounds correlates with increased industrialization, implying that the Industrial Revolution is one of the main reasons that the world has experienced increased global pollution and a greater proportion of contaminated sites. In this text, the term ‘contaminated site’ is used to describe a specific biogeographical area that is polluted by a specific recalcitrant compound.

With continued demand for products from the industries, there is no

evidence that the rate at which environmental sites are contaminated will

decrease in the near future. On the contrary, an expanding global population

will put strain on the environment. Due to an expected increase in demand,

the industrial synthesis of toxic compounds will most likely continue.

Therefore, novel solutions are needed to tackle these environmental

problems.

The contaminated sites could potentially be ‘cleaned up’ by controlled and

contained bioremediation. When selecting a bioremediation technique, the

limiting factors of degradation must be carefully assessed for each

contaminated site because each site is unique.

A first step in doing this could be the exploration of the genetic potential for

a specific bioremediation capacity present among the microbial

communities in the environment. An alternative to the classical PCR based


genetic studies is the analysis of genetic data of the total microbial

community, known as metagenomics. Metagenomic data differs from

genomic data in that it gives genetic information about an entire microbial

community, as opposed to just one microbial organism. This is very

important because 16S rRNA gene sequence analysis has suggested that 26

microbial phyla have no cultured representatives (Rappe & Giovannoni

2003). Thus, in relation to this study, the information retrieved from

metagenomics is much more valuable as it is independent of previously

known sequence information.

Part of reaching these goals is to consider the natural biodegradation capabilities of microbial consortia and, with the aid of metagenomic databases, to acquire more knowledge about the diversity of a catabolic gene within and among many different biogeographical areas.

Juhasz & Naidu (2000) have shown that there are microorganisms which are able to degrade recalcitrant compounds, but which may not be prevalent in soils where remediation is necessary. Consequently, it is of great value to gain insight into the

biogeographical distribution of such catabolic genes in order for us to be

able to develop optimal remediation solutions.

In this study, the capacity of environmental habitats for biodegradation was

investigated by analyzing the presence of a selected catabolic gene

(involved in the degradation of recalcitrant compounds) to determine

whether it could be detected in any or many types of environments. In this

way we can gain more knowledge about the presence of catabolic enzymes

and recalcitrant compounds within a complex microbiome in the

environment. We thereby may also learn about the dynamics of the

bioremediation capacity in these environments in order to make realistic

interpretations of real-life conditions when developing solutions. This could

allow us to circumvent the limiting factors of bioremediation in a given

environment. Specifically, the presence and diversity of dienelactone hydrolase, an enzyme involved in the degradation of halocarbons, was investigated. This gene function was selected because it is important to find bioremediation solutions for sites contaminated with halocarbons, which pose a threat for the reasons mentioned earlier. Three central research questions were formulated: (i) whether there is a pattern in how dienelactone hydrolase is biogeographically distributed in the environment, (ii) how dienelactone hydrolases from different strains are related, and (iii) if and how dienelactone hydrolase can be applied in bioremediation techniques involving recalcitrant compounds.


II. Objectives

In order to answer the research questions mentioned above, the primary

objective of this study was to identify and characterize gene functions

involved in the degradation of recalcitrant compounds using available global

metagenomic data. Specifically, an enzyme involved in the degradation of

chlorinated aromatic compounds (i.e. dienelactone hydrolase) was

investigated. Dienelactone hydrolase operates exclusively in the catabolic

pathway of chlorocatechol. Hence, the study has focused on this catabolic

pathway, also known as the modified ortho-cleavage pathway. Through

phylogenetic analyses the aim was to reveal the evolutionary relationship of

the gene between different strains or species and to investigate if a

biogeographical distribution of the gene could be detected through the

comparison of metagenomes from different environments.


III. Background & Literature review

3.1 Biodegradation

The term biodegradation refers to the microbial process of breaking down

organic compounds into biomass, water, carbon dioxide, and the oxides or

mineral salts of other elements present. Mineralization refers to the

complete degradation of an organic compound into inorganic components.

However, under normal circumstances, mineralization also involves the

formation of biomass. Eventually, biomass will also undergo mineralization.

When an organic compound is broken down to a less complex organic

compound, the process is known as incomplete/partial degradation (Atlas &

Bartha, 1998).

Many factors affect the performance of the biodegradation process, such as

the physical and chemical properties of the environment. Major causes for

the persistence of many xenobiotics include sorption to soil and sediment and entrapment in micropores (Elzerman & Coates 1987). There are other

parameters that influence the extent and the rate at which biodegradation

takes place. For example, the bioavailability of the substrate plays a key role

in this. Chemical structure, concentration of substrate, environmental

conditions (pH, temperature, salinity, presence of inhibitory molecules,

availability of nutrients/oxygen/electron donors/electron acceptors),

composition and size of consortium (which microorganisms are present),

and the physicochemical properties of the environment all affect the

bioavailability of the substrate (Paul & Clark, 1989). Furthermore,

bioavailability is also controlled by the physical state, solubility and binding

affinity (to soil or sediment particles) of the xenobiotic compound.

Ultimately, as mentioned, sorption and entrapment in micropores are major

causes for the persistence of many xenobiotics (Elzerman & Coates 1987).

Numerous xenobiotics are toxic to organisms at high concentrations. This

includes the degradative microorganisms. However, it is important to keep

in mind that a minimum concentration of xenobiotic is required in order to

induce the synthesis of the catabolic genes responsible for the degradation

process (Fetzner 2001). It is also important to remember to examine the products formed after a biodegradation process, and not only the

disappearance of a compound. This implies that the disappearance of a

compound from its environment does not always mean that it has been

biodegraded. Moreover, biodegradation rates vary from compound to

compound, ranging from days to decades. A good example of a very persistent xenobiotic is the insecticide DDT (1,1,1-trichloro-2,2-bis[p-chlorophenyl]ethane).

Recalcitrant compounds such as the halocarbons in focus in this study have natural counterparts with similar structures, implying that there are catabolic enzymes that could degrade them. Dienelactone hydrolase is such an enzyme, so learning about it might help when developing new remediation strategies. This study provides information about the kinds of environmental habitats in which dienelactone hydrolase is present, casting light on the types of consortia in which the gene is active.

3.2 Halocarbons

The carbon-halogen bond is highly stable and a great amount of energy is needed to break it, which makes halocarbons chemically stable (Atlas & Bartha 1998, p. 514). Halocarbons are a large group comprising solvents, refrigerants, and haloaromatics, e.g. chlorobenzenes, chlorophenols, chlorobenzoates and polychlorinated biphenyls. The aerobic biodegradability of haloaromatics commonly decreases with the number of substituents. However, dechlorination of the same compounds under anaerobic conditions occurs relatively easily (U.S. Environmental Protection Agency 2000). Various Pseudomonas strains use dioxygenases to aerobically convert mono- and dichlorobenzenes to chlorocatechols, and they do so with ease (Potrawfke et al. 1998). The aerobic degradative pathways of haloaromatics converge at chlorosubstituted catechols (Fig. 1). Chlorosubstituted catechols are converted to 3-oxoadipate in a way similar to that seen in catechol metabolism. Catechol metabolism is also known as the ortho-cleavage pathway; chlorosubstituted catechols are converted to 3-oxoadipate via the modified ortho-cleavage pathway. The pathway is designated as modified because of differences in substrate specificities, and because the dienelactone hydrolases of this pathway do not convert 3-oxoadipate enol-lactone, which is an intermediate in catechol catabolism (Schlömann 1994).

3.3 Dienelactone Hydrolase

Dienelactone hydrolase (DLH) is an enzyme present in many prokaryotes

(bacteria and archaea) and in a few eukaryotes (fungi) (Pathak & Ollis,

1990). DLH catalyses the hydrolysis of dienelactone. Dienelactone is a

cyclic ester, and DLH converts it to maleylacetate following the ring

cleavage reaction. DLH is part of the β-ketoadipate catabolic pathway,

which functions as a biodegradative pathway for toxic aromatic compounds.

This pathway results in intermediates of the tricarboxylic acid cycle i.e.

acetyl CoA and succinyl CoA (Figure 1).

All DLHs contain the catalytic triad residues Cys-His-Asp, and DLH is special because it is the only known non-synthetic enzyme that contains this triad (Beveridge & Ollis, 1994). Inhibitor binding studies suggest that dienelactone is held in the active site by hydrophobic interactions around the lactone ring and by ion pairs between its carboxylate and Arg-81 and Arg-206. The catalysis probably involves the formation of a covalently bound acyl intermediate via a tetrahedral intermediate (Cheah et al., 1993).

There are three types of dienelactone hydrolase: type I, type II and type III. Type I hydrolyses trans-dienelactone faster than cis-dienelactone. Type II hydrolyses cis-dienelactone faster than trans-dienelactone. Type III hydrolyses both isomers (Schlömann et al. 1990). Three genes code for dienelactone hydrolase: clcD, tcbE and tfdE, the last of which is located on the plasmid pJP4 (Schlömann 1994).
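The type distinction above can be expressed as a small decision rule. The following Python sketch is only an illustration under stated assumptions: the function name `classify_dlh` and the parameter `similar_within` are hypothetical, and treating type III as "both isomers hydrolysed at comparable rates" with a 20% cut-off is an assumption made here for the example, not a threshold taken from the cited literature.

```python
def classify_dlh(rate_cis: float, rate_trans: float, similar_within: float = 0.2) -> str:
    """Assign a dienelactone hydrolase type from two measured hydrolysis rates.

    rate_cis / rate_trans: activity on cis- and trans-dienelactone (same units).
    similar_within: ASSUMED relative difference below which the two rates are
    treated as comparable (type III). This cut-off is illustrative only.
    """
    if rate_cis < 0 or rate_trans < 0:
        raise ValueError("rates must be non-negative")
    fastest = max(rate_cis, rate_trans)
    if fastest == 0:
        raise ValueError("no activity on either isomer")
    # Relative difference between the two rates, scaled by the faster one.
    if abs(rate_cis - rate_trans) / fastest <= similar_within:
        return "III"
    # Type I prefers the trans isomer, type II prefers the cis isomer.
    return "I" if rate_trans > rate_cis else "II"


# Made-up rates, used only to exercise the three branches.
print(classify_dlh(rate_cis=1.0, rate_trans=10.0))
print(classify_dlh(rate_cis=10.0, rate_trans=1.0))
print(classify_dlh(rate_cis=5.0, rate_trans=5.5))
```

A relative rather than absolute difference is used so that the rule does not depend on the units in which activity is reported.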

Pseudomonas sp. B13 is a suitable example because it contains both the ortho-cleavage pathway and the modified ortho-cleavage pathway. In Pseudomonas sp. B13, DLH is a monomeric protein with 236 amino acid residues and a molecular weight of 25,500 Da. It is made up of seven helices and eight β-strands and is therefore an α/β protein (Pathak & Ollis, 1990). Fig. 2 illustrates the structure of DLH, retrieved from the Protein Data Bank (PDB). Fig. 1 illustrates the modified ortho-cleavage pathway and the enzymes involved.

The Enzyme Commission number of dienelactone hydrolase is EC 3.1.1.45. The Clusters of Orthologous Groups number is COG0412. The GenBank locus of the sequence used in this study is NP_743344.

In a study conducted by Schreiber et al. (1980), it was found that in the catabolism of 4-fluorobenzoate by Pseudomonas sp. B13, chlorocatechol 1,2-dioxygenase and chloromuconate cycloisomerase showed no activity. This observation suggests that, in this particular strain, only dienelactone hydrolase and maleylacetate reductase are required in the catabolism of 4-fluorobenzoate (along with the enzymes for benzoate catabolism). This led to the conclusion that there are some bacteria which have both dienelactone hydrolase and maleylacetate reductase activities, but lack the other two enzymes that are necessary for complete chlorocatechol degradation (Schreiber et al. 1980). In another study, it was confirmed that the catechol and chlorocatechol catabolic pathways share genetically very similar enzymes with similar substrate specificities: catechol 1,2-dioxygenase and muconate cycloisomerase in the former, and chlorocatechol 1,2-dioxygenase and chloromuconate cycloisomerase in the latter. On the other hand, 3-oxoadipate enol-lactone hydrolases (from the catechol pathway) and dienelactone hydrolases (from the chlorocatechol pathway) are not similar genetically and cannot use the same substrate (Schlömann, 1994). The observations from these studies suggested that DLH was derived from a preexisting pathway.

3.4 Metagenomics

Metagenomics is a set of research techniques and a research field,

originating from experiences from genomics. In metagenomics, instead of

investigating genomes of individual organisms as in genomics, the genomes

of all the individuals in a microbial community of a particular environment

are analyzed simultaneously (Handelsman, 2004). This eliminates several

problems common in traditional clinical and environmental microbiology.


First, microbes exhibit great genomic diversity, and second, many microbes

cannot be cultured in the laboratory. It is known that merely 0.1-10% of all

microorganisms can be cultured in the laboratory (Amann et al., 1995). This

is why genomics of cultured isolates is insufficient.

Metagenomics includes the analyses of the total microbial community

genomes. Generally, there are two approaches, sequence based analyses or

activity based analyses. Sequence based screening provides sequence data

on the total genomic content of the microbial community analyzed, whereas

the activity based approach can detect novel gene products and enzymatic

activities expressed by genes that may otherwise not have been identifiable

based on prior sequence knowledge (Riesenfeld et al. 2004). This study will

draw on information from the sequence based approach. The first step in

metagenomics is to extract total community DNA directly from the

microbes living in a particular environment. The total community DNA, or

the metagenome, is sequenced using high-throughput sequencing.

The sequence based approach can entail complete sequencing of clones

containing phylogenetic anchors that indicate which taxonomic group the

DNA fragment probably belongs to. Another option is to perform random

sequencing, and once a gene of interest is identified, "phylogenetic anchors

can be sought in the flanking DNA to provide a link of phylogeny with the

functional gene" (Zeyaullah et al. 2009). The steps involved in a typical

sequence based metagenome project are sample processing, sequencing

technology, assembly, binning, annotation, statistical analysis and data

storage and sharing (Thomas et al., 2012) (Figure 3). Sample processing is

the first and most important step because extracted DNA should be

representative of all cells present in the sample. Sequencing technologies

include next-generation sequencing (NGS). Binning is the process of sorting

DNA sequences into groups that could characterize an individual genome or

genomes from closely related organisms. For genome annotation, there are

existing pipelines such as Integrated Microbial Genomes (IMG) (Markowitz

et al., 2009). In this study, metagenomic data from different habitats

provided by other studies was accessed from the IMG/M database. Dienelactone hydrolase or its homologs were searched for in different metagenomes and, when detected, the environments from which the metagenomes came, i.e. the microbiomes, were identified in order to conduct a biogeographical investigation based on phylogenetic analyses.
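The last step described above, attributing each detected homolog to the environment of its source metagenome, amounts to a simple tally. The Python sketch below is a minimal illustration and not the procedure used in the study: the function name `tally_hits_by_habitat`, the gene identifiers and the habitat labels are all hypothetical, and the example data are made up purely to show the shape of the computation.

```python
from collections import Counter


def tally_hits_by_habitat(hits):
    """Count detected homologs per habitat.

    hits: iterable of (gene_id, habitat) pairs, one per homolog found in a
    metagenome. Habitat labels are normalised so that "Soil" and "soil"
    are counted together.
    """
    counts = Counter()
    for _gene_id, habitat in hits:
        counts[habitat.strip().lower()] += 1
    return counts


# Hypothetical hits, for illustration only (not results from the thesis).
example_hits = [
    ("gene_001", "Soil"),
    ("gene_002", "soil"),
    ("gene_003", "Hot spring"),
    ("gene_004", "Hypersaline"),
]

summary = tally_hits_by_habitat(example_hits)
for habitat, n_hits in summary.most_common():
    print(f"{habitat}: {n_hits}")
```

A per-habitat count of this kind is only a starting point; comparing habitats fairly would also require accounting for how many metagenomes, and how much sequence, each habitat contributes.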

The activity based approach relies on cloning of the extracted DNA into

metagenomic libraries which are screened for expression of particular traits

(Stein et al., 1996). For example, an enzyme activity for degrading a

recalcitrant compound can be screened for (Lorenz et al., 2002). The

activity based metagenomics approach was less suitable for this study and

not used.

3.5 Bioremediation for biotechnological applications

In the field of biotechnology, the microbiological degradation processes

discussed in the introduction can be taken advantage of in order to remove

toxic compounds from the environment. These processes are collectively

known as bioremediation. Bioremediation can be performed in numerous ways: by single isolated microorganisms, by enzymes isolated from microorganisms, or by a natural consortium of soil microbial communities. This makes the method not only innovative but also more environmentally oriented, because the microorganisms used occur naturally in soil and groundwater environments (Reineke et al., 2011). Additionally, bioremediation under optimal conditions does not create wastes, and it does not require expensive equipment or heavy labour (U.S. Environmental Protection Agency, 2000).

Scientists working with bioremediation aim to optimize environmental factors in order to speed up the rates at which biodegradation takes place. Doing so allows the appropriate types of microorganisms to proliferate, which results in a copious and diverse distribution of these microorganisms (Philp et al., 2009).

Bioremediation techniques include intrinsic, in situ and ex situ bioremediation as well as phytoremediation. For easily degraded contaminants, biostimulation has been successful in both contaminated soil (Machackova et al., 2008) and marine environments (Harayama et al., 1999). Biostimulation involves the addition of suitable electron donors, electron acceptors or nutrients (Reineke et al., 2011). Bioaugmentation refers to the application of specific microorganisms to contaminated sites in order to enhance the biological activity of the indigenous populations (Pepper et al., 2002), and bioaugmentation studies have also been successful (Dybas et al., 2002). Bioaugmentation is used in cases where intrinsic bioremediation does not work, and is usually only applied to highly recalcitrant compounds.

Recalcitrant compounds can contaminate many different types of sites, some of which would not directly affect human health. On the other hand, once sites with which humans are in close association (e.g. groundwater) have been contaminated, it is vital that efficient methods are used to rapidly clean up the site. Previous studies of contaminated groundwater in the U.S. have investigated an interception technology known as Permeable Reactive Biobarriers (PRB), which has proven to be efficient for contaminated groundwater (Puls et al., 1999). This is positive because groundwater is one of the environments which could, if contaminated, potentially threaten an entire population.

Each of the methods described above has advantages and disadvantages

depending on the type of contaminated site. This means that a careful

assessment needs to be made before implementation of bioremediation

strategies and techniques on a given contaminated site. Methods that have

previously been used successfully and have proved to be effective tools for

e.g. oil spills involve the application of nutrients along with well-adapted

microorganisms to a particular environment. In one study specifically,

bioremediation of oil-spilled sites in the open environment proved to be

successful through seeding of naturally adapted Pseudomonas putida strains

and the addition of fertilizer. The results of this study indicated an increased

viable count and degradation capacity of the inoculums (Raghavan &

Vivekanandan, 1999).


IV. Methodology

Sequences obtained from cultured organisms do not provide sufficient information for the overall analysis, since the number of sequenced DLHs from cultured organisms of different habitats is limited. Therefore, sequences were retrieved from a metagenomic database. The data had to be retrieved in the form of amino acid sequences so that a phylogenetic analysis could be performed. As a first step in this comparative analysis of sequences, a query amino acid sequence in FASTA format corresponding to dienelactone hydrolase (DLH) was retrieved from GenBank at NCBI. The sequence was selected from the group of organisms in which the presence of both a normal and a modified ortho-cleavage pathway was first discovered, which was in a Pseudomonas strain (sp. B13).

Several databases were available for retrieving DLH amino acid sequences in FASTA format from metagenomic data, including Integrated Microbial Genomes with Microbiome Samples, or IMG/M (img.jgi.doe.gov/cgi-bin/m/main.cgi), and MG-RAST (http://metagenomics.anl.gov/). These two databases were the obvious options. In this study, IMG/M was the metagenomic database of choice for the reasons given under 4.6.

Additionally, other databases and tools were used: the Protein Data Bank (www.pdb.org), the Basic Local Alignment Search Tool or BLAST (blast.ncbi.nlm.nih.gov/Blast.cgi), and Clustal W (accessed from two websites cited in the reference list), which is used for multiple sequence alignments and to create phylogenetic trees.
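Both alignment-based comparison and distance-based tree building rest on a pairwise measure of how similar two aligned sequences are. The Python sketch below shows one common way to compute percent identity and the corresponding p-distance from two rows of an alignment. It is an illustration under stated assumptions, not the calculation performed internally by BLAST or Clustal W: the function names are hypothetical, the example sequences are made up, and gapped columns are skipped here, whereas other tools may divide by the full alignment length instead.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns in which either sequence has a gap ('-') are ignored.
    This is ONE possible definition; tools differ in the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Proportion of differing residues, a simple distance for tree building."""
    return 1.0 - percent_identity(aligned_a, aligned_b) / 100.0


# Made-up aligned fragments, for illustration only.
row_1 = "MKT-LA"
row_2 = "MRTALA"
print(percent_identity(row_1, row_2))
```

In the example, five columns are compared (the gapped column is skipped) and four of them match.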

4.1 Integrated Microbial Genomes (IMG)

IMG is a database and tool for the analysis and annotation of genome and metagenome datasets. In this study, it was used for comparative analysis of protein sequences. The database contains microbial genome and microbial metagenome data, providing information about gene content and functional capabilities (Technical report 2008). Using this database, dienelactone hydrolase was searched for within all available microbiomes using both the COG (Clusters of Orthologous Groups) number and the name of the gene product, in this case dienelactone hydrolase. Using the amino acid FASTA sequence of Pseudomonas as a query sequence, a BLASTp search (Basic Local Alignment Search Tool for proteins, described in section 4.3) was performed against selected microbiomes. The query sequence was the 265 amino acid DLH from Pseudomonas putida KT2440, with GenBank accession number NP_743344.

Using the gene product name and COG number (dienelactone hydrolase and COG0412, respectively), homologs were searched for within the selected microbiomes. All microbiomes available at the time were included in the search. This generated a list of the microbiomes and the number of matching genes found within each, and gave access to further details on the individual sequences. From here, the corresponding FASTA amino acid sequences could be retrieved for use in Clustal W.
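As an illustration of the kind of data passed on to Clustal W, the following minimal Python sketch reads FASTA-formatted text into a dictionary. It is not part of the workflow of this study, in which the sequences were retrieved through the IMG web interface; the function name parse_fasta and the two short example records are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into {header: sequence}.

    Hypothetical helper for illustration: header lines start with '>' and
    a sequence may be wrapped over several lines.
    """
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Made-up example records (not real DLH sequences).
example = """>query_example
MKTAYIAKQR
QISFVKSHFS
>hit_example
MKSAYLARQR
"""

sequences = parse_fasta(example)
for name, seq in sequences.items():
    print(name, len(seq))
```

Keeping the header as the dictionary key makes it straightforward to rename sequences later, as was done for the dendrogram labels.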

4.2 Protein data bank (PDB)

PDB is a database and archive used to obtain basic information about biological macromolecules (proteins and nucleic acids), such as structural data and taxonomy information; it is also useful for generating 3D representations of protein structures. The database contains structures determined by nuclear magnetic resonance (NMR) and X-ray crystallography (Berman et al. 2000). The 3D crystal structure of DLH from Pseudomonas putida KT2440 was retrieved from PDB and is shown in Figure 2.

4.3 Basic Local Alignment Search Tool (BLAST)

BLAST is used in molecular biology to compare a query sequence with other sequences found in protein or nucleotide databases. In doing so, information about similarities and differences between the sequences is obtained. It is one of the most widely used tools for homology analysis (Altschul et al. 1990). In this study, once the amino acid FASTA sequence from Pseudomonas had been collected from GenBank at NCBI, it was submitted to the IMG database as a BLASTp (BLAST for proteins) search against selected microbiomes. This generated a list of homologs from the selected microbiomes with varying percent identity to the query sequence. From this list, FASTA sequences were selected according to the following criteria: the selected sequences (i) shared at least 30% amino acid identity with the query sequence, and (ii) were at least 200 amino acids long.
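The two selection criteria can be expressed as a simple filter. The sketch below is only an illustration in Python: in this study the selection was made from the hit list in IMG, and the Hit class, the hit names and the identity values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str        # hypothetical label for a homolog
    identity: float  # percent amino acid identity to the query
    sequence: str    # amino acid sequence of the hit


MIN_IDENTITY = 30.0  # criterion (i) from the text
MIN_LENGTH = 200     # criterion (ii) from the text


def passes_criteria(hit):
    """True if the hit meets both the identity and the length criterion."""
    return hit.identity >= MIN_IDENTITY and len(hit.sequence) >= MIN_LENGTH


def select_hits(hits):
    """Keep only the hits that satisfy both criteria, in their original order."""
    return [h for h in hits if passes_criteria(h)]


# Made-up hits; the sequences are placeholders of the stated length.
hits = [
    Hit("hit_A", 45.2, "M" * 236),  # passes both criteria
    Hit("hit_B", 28.0, "M" * 240),  # identity too low
    Hit("hit_C", 61.0, "M" * 150),  # too short
    Hit("hit_D", 30.0, "M" * 200),  # exactly on both thresholds
]

selected = select_hits(hits)
print([h.name for h in selected])
```

Both thresholds are inclusive here, matching the wording "at least" in the criteria.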

4.4 Clustal W

Clustal W is a multiple sequence alignment tool. Multiple alignments are used to detect patterns and homology in order to characterize protein families. They can also serve, for example, as a first step in predicting the secondary structure of new sequences (Thompson et al. 1994). After FASTA sequences had been selected on the basis of sufficient amino acid identity and appropriate sequence length, they were submitted to Clustal W in order to carry out a phylogenetic analysis.

4.5 Construction of phylogenetic trees

Clustal W was used at two different websites (cited in the reference list) to align the selected sequences according to sequence similarity before creating the dendrograms. Several methods can be used to analyze similarities and to construct trees showing relationships, or evolutionary distances as in a phylogenetic tree (Saitou & Imanishi 1989). Two widely used families are distance-based and character-based methods, also known as phenetic and cladistic methods, respectively. In phenetic methods, the pairwise dissimilarity of genes is measured, whereas in cladistic methods, several possible trees are evaluated and the one that best explains the evolutionary relationships is chosen (Duncan et al. 1980). In this study, the phylogenetic analysis was based solely on distance-based methods, which include the unweighted pair group method with arithmetic mean (UPGMA) and the neighbour-joining (NJ) method.
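To make the idea of a distance-based method concrete, the following Python sketch computes pairwise p-distances (the fraction of differing aligned positions) and clusters the sequences with UPGMA. It is a simplified illustration, not the implementation used by Clustal W, which applies its own distance calculation and corrections; the function names and the four short aligned sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing positions, ignoring columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(names, dist):
    """Cluster with UPGMA; dist maps frozenset({a, b}) to a distance.

    Returns the tree topology as nested tuples (branch lengths omitted).
    """
    clusters = {name: 1 for name in names}  # cluster label -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Join the two clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        new = (a, b)
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        for c in clusters:
            # Size-weighted mean, i.e. the average over all leaf pairs.
            d[frozenset((new, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[new] = size_a + size_b
    return next(iter(clusters))


# Made-up aligned sequences of equal length (not real DLH data).
aligned = {
    "A": "MKTAYIAK",
    "B": "MKTAYIAR",
    "C": "MKSAYLAR",
    "D": "MRSGYLGR",
}
distances = {
    frozenset((x, y)): p_distance(aligned[x], aligned[y])
    for x, y in combinations(aligned, 2)
}
tree = upgma(list(aligned), distances)
print(tree)
```

The nested tuples show the order of joining: the most similar pair is merged first, and more distant sequences attach closer to the root.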

There are several ways of constructing phylogenetic trees, each differing slightly in method. For this reason, it is important to understand how the types of trees differ when making interpretations. Both the UPGMA and the NJ methods can be applied to construct either a rooted or an unrooted tree. The difference between rooted and unrooted trees is that rooted trees show the common ancestor of the species under study, i.e. the evolutionary path, whereas unrooted trees only show the relationships among the species (Page & Holmes 1998). When constructing the five phylogenetic trees in this study, all the methods above were applied in an attempt to increase the validity of the interpretations in the final analysis.

The trees were constructed at the following websites:

- http://pir.georgetown.edu/pirwww/search/multialn.shtml

- http://www.genome.jp/tools/clustalw/

These were the only websites found that could generate dendrograms with a scale and a value for each branch length, so they were sufficient for this study.

4.6 Methodological strengths & limitations

The methods described above have limitations. Although they can be efficient and suitable when applied to appropriate studies, this section describes their general limitations as well as those that could ultimately affect the results of this study.

Having access to a variety of databases is advantageous because they provide the tools for scientists and researchers to assemble and utilize otherwise inaccessible data. The metagenomic databases IMG/M and MG-RAST are both useful tools, with similarities and differences. For example, they are similar in binning methods but differ in feature prediction. IMG/M and MG-RAST use similarity-based binning algorithms (as opposed to composition-based binning algorithms). A similarity-based algorithm matches an unknown fragment of DNA against a reference database in order to classify (and hence bin) the sequence (Thomas et al. 2012). Composition-based binning is not reliable for short reads, as is the case with the 236 amino acid long DLH sequence.

Feature prediction is the process of labeling sequences as genes or genomic elements and identifying protein coding sequences (CDS). It varies slightly between IMG/M and MG-RAST. Both provide a standardized pipeline, but that of IMG/M is described as having "higher" sensitivity. According to Thomas et al. (2012), IMG/M is also the only system that integrates all datasets into a single protein-level abstraction. For these reasons, IMG/M was used in this study.

The other limitations of this study concern the choice of methods for constructing the phylogenetic trees, because different methods could lead to different interpretations. Moreover, certain methods (e.g. character-based methods) were not feasible in this study because of time and labour constraints.

Generally, distance-based methods are faster than character-based methods and are less time- and labour-consuming (Page & Holmes 1998). However, distance-based methods discard supplementary information (i.e. individual substitutions among sequences) needed to produce an accurate tree that predicts the most probable evolutionary relationship (Bruno et al. 2000).

As mentioned in 4.5, distance-based methods include the UPGMA and the NJ method. The UPGMA algorithm does not necessarily mirror evolutionary descent because it gives equal weight to all distances and assumes a constant molecular clock, i.e. an equal rate of change along all branches (Backeljau et al. 1996). The NJ method is rapid and, depending on the application, gives better results than the UPGMA method because it does not assume equal rates among branches. The disadvantages of the NJ method are that it produces just one tree, neglecting other possible trees, and that it can produce a biased tree because errors in distance estimates become exponentially larger for longer distances (Saitou & Nei 1987).
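The difference between the two methods can be seen in how the first pair to be joined is chosen. UPGMA joins the pair with the smallest distance, whereas NJ joins the pair that minimizes the criterion Q(i, j) = (n - 2) d(i, j) - r(i) - r(j), where r(i) is the sum of the distances from i to all other sequences. The Python sketch below illustrates this first step on a small made-up distance matrix; the function name and the values are hypothetical and are not taken from this study.

```python
from itertools import combinations


def nj_first_join(names, d):
    """Return the first pair joined by NJ, its Q value and the two branch lengths.

    d maps frozenset({a, b}) to the distance between a and b.
    """
    n = len(names)
    # r[i]: sum of distances from i to every other sequence.
    r = {i: sum(d[frozenset((i, k))] for k in names if k != i) for i in names}
    q = {
        (i, j): (n - 2) * d[frozenset((i, j))] - r[i] - r[j]
        for i, j in combinations(names, 2)
    }
    i, j = min(q, key=q.get)
    dij = d[frozenset((i, j))]
    # Branch lengths from i and j to the new internal node may differ,
    # which is how NJ accommodates unequal rates.
    length_i = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
    return (i, j), q[(i, j)], (length_i, dij - length_i)


# Made-up distances between five sequences a-e.
names = ["a", "b", "c", "d", "e"]
raw = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
d = {frozenset(pair): value for pair, value in raw.items()}

closest = min(raw, key=raw.get)          # pair UPGMA would join first
pair, q_value, lengths = nj_first_join(names, d)
print("smallest distance:", closest)
print("NJ joins:", pair, "Q =", q_value, "branch lengths =", lengths)
```

In this toy matrix the smallest distance is between d and e, but NJ joins a and b first and assigns them unequal branch lengths, which shows why trees built with the two methods can differ.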

Only the detection of clusters, and their association with specific environments, is of interest in this study, so the methods chosen and described above are appropriate and sufficient for the final cluster analysis. However, it would be preferable and more reliable to use all available methods, which take different parameters into consideration when constructing the dendrograms.

V. Results

Fifty protein sequences and 23 genomes containing genes encoding proteins of COG0412 were matched with dienelactone hydrolase and related enzymes in IMG. DLH was found to be present in 8 phyla: Euryarchaeota, Crenarchaeota, Ascomycota, Aquificae, Cyanobacteria, Actinobacteria, Firmicutes and Proteobacteria. The proteins found span 13 classes and 15 orders.

The sequence from Pseudomonas putida KT2440 was used as the query sequence, and the search produced hits in almost all of the 90 microbiomes available in IMG/M. Along with the query sequence, an additional 3 sequences representing the 3 different types of DLH were retrieved from Pseudomonas aeruginosa using the COG database, to ensure that all three types of DLH were taken into consideration in the phylogenetic analysis. All the retrieved sequences (65 sequences) showed enough similarity to the query sequence to undergo phylogenetic analysis: all were at least 200 amino acids long and had a minimum of 30% amino acid identity with the query sequence. The retrieved sequences came from 23 of the 90 available microbiomes.

A phylogenetic analysis was performed on the environmental sequence fragments retrieved from IMG in order to examine the evolutionary relationships between them. This analysis indicated whether phylogenetically similar DLH sequences are biogeographically close or share a habitat. To this end, Clustal W was used for a multiple sequence alignment, and the dendrograms were generated from the Clustal W alignment (see Appendix). In the alignment, three positions were observed to be highly conserved, corresponding to the catalytic triad Cys-His-Asp.

In the dendrograms, the microbiomes listed in Table 1 (as well as the query

sequence and three sequences retrieved from Pseudomonas aeruginosa)

were used.

The results are demonstrated in the dendrograms (see Appendix). The

names of the sequences are modified in the dendrograms. Table 2 shows

which names correspond to which microbiomes.

The results demonstrate a broad biogeographical distribution. The dendrograms show that the environments in which DLH was present include groundwater and soil environments, where homologs of the query sequence were abundant. Extreme environments also contained the protein, but no homologs were found in animal microflora. The dendrograms also show that sequences from groundwater communities do not form clusters distinct from the other sequences; they are spread throughout the dendrogram and are part of many clusters, showing a close relationship to all of the microbiomes. The dechlorinating communities form small separate clusters in the dendrograms, whereas the hot-spring communities form one cluster. One of the three types of DLH is less common than the other two.

As noted above, three positions in the alignment were highly conserved, corresponding to the catalytic triad Cys-His-Asp. These are highlighted in the multiple sequence alignment shown in the Appendix: histidine (His) is highlighted in red at position 119, aspartate (Asp) in blue at position 154, and cysteine (Cys) in green at position 251.
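Conservation of a given alignment column can also be checked programmatically. The Python sketch below is an illustration only: it uses a short made-up alignment with made-up positions rather than the DLH alignment in the Appendix, and the function names are hypothetical.

```python
def column(alignment, position):
    """Residues found at a 1-based alignment position across all sequences."""
    return [seq[position - 1] for seq in alignment.values()]


def is_conserved(alignment, position, residue):
    """True if every sequence carries the given residue at that position."""
    return all(r == residue for r in column(alignment, position))


# Made-up alignment of equal-length sequences ('-' marks a gap).
toy_alignment = {
    "seq1": "MA-HKLDGTC",
    "seq2": "MS-HRLDATC",
    "seq3": "MAGHKIDGSC",
}

# Hypothetical positions of the conserved residues in the toy alignment.
for position, residue in [(4, "H"), (7, "D"), (10, "C")]:
    print(position, residue, is_conserved(toy_alignment, position, residue))
```

Applied to a real alignment, the same check would be run at the alignment positions reported for His, Asp and Cys.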

VI. Discussion

In this study, a broad biogeographical distribution of dienelactone hydrolase was observed. When constructing the phylogenetic trees, amino acid sequences were taken from samples of genes from all environments in which the protein was found (and in which the sequences had at least 30% amino acid identity with the query sequence). As indicated by the results, homologs of DLH were found in, for example, groundwater, soil and extreme environments. These extreme environments include saline, dechlorinating and thermophilic microbiomes. The sequences most similar to the amino acid sequence of Pseudomonas putida KT2440 were found in groundwater. Since the initial search for DLH gave hits in almost all the available microbiomes in IMG/M, the gene appears to be abundant in the environment, present in many microorganisms found in many different habitats. This suggests that DLH was probably derived from another pre-existing pathway, because it must have taken a long time for the gene to be acquired by so many microorganisms in such differing habitats.

The dienelactone hydrolase from Pseudomonas putida KT2440 was placed within a group of related sequences from metagenomes of saline environments. The analyses also show a close relationship with groundwater and dechlorinating communities. This could mean that the gene was transferred horizontally from contaminated groundwater to other microbiomes. Moreover, DLH sequences taken from extremophiles tend to cluster, meaning that they are phylogenetically related; a likely explanation is that, as expected, they are found only in a specific extreme environment and are therefore genetically very similar to each other and different from strains of other environments.

Sequences taken from water and soil samples agree with our expectations. The protein variant is spread out in the phylogenetic tree, meaning that there are many types of the gene in many environments. This makes sense, because microorganisms that live in soil and water are exposed to conditions that are ubiquitous on Earth.

The gene was also found in various other extreme-condition environments, e.g. Uranium Contaminated Groundwater FW106, the Acromyrmex echinator fungus garden, the oxygen-depleted microbiome and more. This shows that the gene is present, and suggests that it may be active, even in extreme-condition environments. These results therefore suggest that the dynamics within many microbial consortia, including extremophilic microbiomes, are in one way or another affected by the activity of the gene.

During the course of this study, several problems arose that directly affected the interpretation of the results. When the three different types of DLH were selected from Pseudomonas aeruginosa, it was not known which type each corresponded to (DLH I, DLH II or DLH III), because this information was not provided by the COG database. Consequently, it cannot be concluded from this study that one type of DLH is predominant in one specific environment.

The phylogenetic trees show great genetic variation in the DLH gene. The most probable reason is that the genes are found in different environments and have undergone adaptation as a result. The spread of the gene across different habitats may reflect different variants that have each adapted to a specific habitat. The sequences taken from extremophilic environments and other groups might have been derived directly from groundwater consortia, possibly through horizontal gene transfer. From an evolutionary perspective, this would mean that, at some point, DLH underwent a minor alteration in order to retain its activity in a new environment. This alteration might not have been large enough to change the conformation, and ultimately the function, of the protein, but was sufficient to maintain its catalytic activity.

From this analysis, it cannot be concluded which types of DLH are found in

which type of environments. However, it is still clear that microbiomes from

certain extreme environments display specific conserved regions that are

unique to them.

Dienelactone hydrolase from Pseudomonas putida KT2440 clusters in such a group, which suggests that this organism also has most if not all of the genes needed to survive in a wide range of environments. This agrees with our initial expectation: DLH samples from Pseudomonas strains have been shown to be of the three different types of DLH, hence the close phylogenetic relation to a variety of microbiomes.

In the alignment, it is clear that three positions in the sequences are highly conserved. DLH is characterized by the catalytic triad Cys-His-Asp, so this triad was expected to be conserved throughout the entire alignment; this was the case. Apart from this, it was difficult to detect which additional regions (unique to the sequences within that cluster) were conserved among the hot spring/boiling water sequences.

The results suggest that DLH could be used in developing remediation techniques because of its widespread biogeographical distribution. If the extent of DLH activity within a microbiome could be measured, the interactions between the microorganisms in that habitat could be investigated further. This could help in assembling appropriate mixed cultures for bioremediation, suited to cleaning up a given contaminated site. It is therefore important to investigate what induces the activation of these genes. If methods could be developed to determine the activity of catabolic genes in a given environment, this would facilitate the detection of microbial consortia capable of degrading recalcitrant compounds. It would enable us to access these microbiomes directly instead of first looking for the gene before determining its activity.

Dienelactone hydrolase can be used in the biotechnological field of

bioremediation by developing enzyme-assays that enable the detection of

the activity of this gene. In doing so, more knowledge can be gained about

the conditions of the environment in which the gene is active, thereby

assisting scientists as they attempt to develop novel methods.

Contaminated sites have proved to be a direct threat to human populations. For example, the contamination of groundwater is hard to detect because it is a hidden resource. This makes it easy to forget that it is a serious problem. According to Morris et al. (2003), approximately two billion people depend on aquifers for drinking water, and 40% of the world's food is produced by irrigated agriculture that relies largely on groundwater. Despite the need for effective bioremediation methods, such methods are still under development.

It is reasonable to expect a correlation between contaminated sites and clusters of certain genetic variants of DLH. Studies show the presence of thermophiles in oil reservoirs (Fardeau et al., 2004). This is relevant to this study because it suggests that the DLH variant present in our thermophile cluster could be utilized in bioremediation in extreme environments.

A novel option in the development of bioremediation techniques could be to consider genetically modified organisms (GMOs). Genetically modified strains or consortia (with the desired DLH catalytic capabilities) could be developed as approaches that overcome the limiting factors of contaminant biodegradation. These strains or consortia could then be introduced directly into contaminated groundwater (along with nutrients if necessary), or Permeable Reactive Biobarriers (PRB) could be developed. However, genetic modification of microorganisms is controversial for the simple reason that little is known about whether we can control the fate of such organisms. Also, since the genes from groundwater were spread out in the dendrograms, DLH could be used in relatively cheap bioremediation methods such as in situ bioremediation.

Since the studies by Dybas et al. (2002) mentioned previously showed bioaugmentation to be a successful technique for the bioremediation of recalcitrant compounds, it could be used when natural populations are few or poorly adapted, and when intrinsic bioremediation does not work. Bioaugmentation would therefore be a relevant method if bioremediation techniques involving DLH were to be developed.

Several other methods would be feasible, each having advantages and disadvantages depending on the specific bioremediation site. According to Philp et al. (2009), some compounds are more rapidly degraded anaerobically. This is especially true for the dechlorination of halocarbons. This suggests that sequential anaerobic and aerobic treatment would be the best alternative for the bioremediation of highly chlorinated compounds.

Also, since the studies performed by Schreiber et al. (1980) and Schlömann (1994) indicated that DLH was derived from a pre-existing pathway, and since the results of this study reveal that DLH has a widespread biogeographical distribution, it can be inferred that the gene is present in a great number of microorganisms adapted to a variety of environments. The gene has therefore had a long period of time to be passed on to other microorganisms through horizontal gene transfer. The more time passes, the more probable it is for a variety of microorganisms to acquire the gene. Hence, instead of turning to controversial methods such as genetic modification, it should be possible to isolate adapted strains from a variety of habitats and grow them in the laboratory. Thereafter, mixed inocula could be produced (e.g. adapted-strain/fertilizer inocula) and used for the bioremediation of certain contaminated sites, as suggested by Raghavan & Vivekanandan (1999).

VII. Concluding remarks

Dienelactone hydrolase was found to be present in many environments, ranging from soil and groundwater environments to extreme environments. The evolutionary relationships between the sequences could be seen in the dendrograms. Groundwater and soil samples were scattered across the entire dendrogram, whilst extremophiles exhibited a closer relationship, as shown by the clusters. There therefore seems to be an adaptation in the extremophiles that is exclusive to their extreme environments.

This observation is based on the rooted dendrogram, which displays two

distinct clusters that seem to comprise samples that share a very close

genetic relationship, hence the absence of sequences retrieved from other

environments within these clusters. These two distinct clusters are the

dechlorinating and hot spring/boiling water microbiomes. The difference

between the two, well-conserved clusters is that the genes retrieved from hot

spring/boiling water environments are only found in a certain (rare) type of

geographical area, whereas the dechlorinating communities are

geographically more widespread.

Our expectations were confirmed, because microorganisms from soil and groundwater should be found in a wide range of environments, owing to their abundance. Their abundance correlates with the fact that the conditions present in these environments are widespread on our planet.

From the results of this study it can be concluded that DLH could be useful in bioremediation in the future, but further research on the matter is warranted. The best results are likely to come from bioremediation techniques that combine pre-assessed effective methods tailored to the environment under decontamination, and that circumvent obstacles such as limiting factors and other parameters. Enzyme assays will have to be developed in order to detect the activity of catabolic genes in a given environment.

VIII. Acknowledgments

I would foremost like to thank Associate Professor Sara Sjöling & Karin Hjort for supervising and guiding me through the process. Second, I want to thank the Department of Life Sciences at Södertörn University for making this thesis possible. Last but not least, I would like to thank my examiner for reviewing and assessing this report.

I am blessed to be expecting a daughter in August, so, any work that I do

from now on is strictly dedicated to her, and my family. Mutatis Mutandis.

IX. Appendix

Fig. 1. Illustration of the modified ortho-cleavage pathway with related enzymes. TCC = Tricarboxylic acid cycle. Degradative pathways for 3-chlorocatechol, 4-chlorocatechol and 3,5-dichlorocatechol in Rhodococcus opacus 1CP (Moiseeva et al., 2002).

Fig. 2. 3D crystal structure of dienelactone hydrolase (DLH) from Pseudomonas putida KT2440 retrieved from the Protein Data Bank (PDB). The coloured molecules are glycerol and sulphate ions. Image of 1ZI6 (Kim, H.K., Liu, J.W., Carr, P.D., Ollis, D.L. (2005) Following directed evolution with crystallography: structural changes observed in changing the substrate specificity of dienelactone hydrolase. Acta Crystallogr., Sect. D 61: 920-931) created with J.mol (J.L. Moreland, A. Gramada, O.V. Buzko, Q. Zhang, P.E. Bourne (2005) The Molecular Biology Toolkit (MBT): a modular platform for developing molecular visualization applications. BMC Bioinformatics 6:21).

Fig. 3. Flow diagram of a typical metagenome project. Dashed arrows indicate steps that can be omitted (Thomas et al., 2012).

- 1 sequence from ANAS dechlorinating bioreactor
- 2 sequences from Acid mine drainage
- 1 sequence from Acromyrmex echinator fungus garden
- 2 sequences from Air microbial communities Singapore indoor air filters
- 2 sequences from Aquatic dechlorinating community (KB-1)
- 1 sequence from Bath Hot Springs, filamentous community
- 1 sequence from Bath Hot Springs, planktonic community
- 5 sequences from Bison Hot Spring Pool, Yellowstone
- 6 sequences from Fossil microbial community
- 1 sequence from Marine microbial community from Deepwater Horizon Oil Spill
- 1 sequence from Hot Spring microbial communities from Yellowstone National Park
- 18 sequences from Oak Ridge Pristine Groundwater
- 1 sequence from Saline water microbial communities from Great Salt Lake
- 1 sequence from Sediment and water microbial communities from Great Boiling Spring
- 2 sequences from Uranium Contaminated Groundwater FW106
- 1 sequence from Marine planktonic communities from Hawaii Ocean Times Series Station (oxygen minimum layer)
- 1 sequence from Bacterial pyrene-degrading mixed culture
- 2 sequences from Groundwater dechlorinating community (KB-1) from synthetic mineral medium in Toronto, ON, sample from site contaminated with chlorinated ethenes
- 2 sequences from Hypersaline Mat
- 2 sequences from Marine Trichodesmium cyanobacterial communities the Bermuda Atlantic Time-Series
- 2 sequences from Lake Washington Formaldehyde enrichment
- 1 sequence from PCE-dechlorinating mixed culture
- 2 sequences from Soil microbial communities from Puerto Rico rain forest, that decompose switchgrass
- 3 sequences from Thermal compost enrichment from Puerto Rico rainforest

Table 1. The table shows the number of sequences taken from the microbiomes used in the dendrograms.

NAME OF MICROBIOME (IMG/M) | MODIFIED NAME (Dendrogram)
ANAS dechlorinating bioreactor | ANAS
Acid mine drainage | Acid
Acromyrmex echinator fungus garden | Acromyrmex
Air microbial communities Singapore indoor air filters | Air
Aquatic dechlorinating community (KB-1) | Aquatic
Bath Hot Springs (filamentous & planktonic communities) | BathHotS
Bison Hot spring pool Yellowstone | Bison
Fossil microbial community | Fossil
Marine microbial community from Deepwater Horizon Oil Spill | Oil
Hot Spring microbial communities from Yellowstone National Park | Hot
Oak Ridge Pristine Groundwater | Oak
Saline water microbial communities from Great Salt Lake | Saline
Sediment and water microbial communities from Great Boiling Spring | Sediment_water_boiling
Uranium Contaminated Groundwater FW106 | Uranium
Marine planktonic communities from Hawaii Ocean Times Series Station (oxygen minimum layer) | O2-minimum
Bacterial pyrene-degrading mixed culture | Pyrene-degrading
Groundwater dechlorinating community (KB-1) from synthetic mineral medium in Toronto, ON, sample from site contaminated with chlorinated ethenes | Groundwater-dechlorinating
Hypersaline Mat | Hypersaline
Marine Trichodesmium cyanobacterial communities the Bermuda Atlantic Time-Series | Cyanobacterial-community
Lake Washington Formaldehyde enrichment | Formaldehyde-enrichment
PCE-dechlorinating mixed culture | PCE-dechlorinating
Soil microbial communities from Puerto Rico rain forest, that decompose switchgrass | Rainforest-soil
Thermal compost enrichment from Puerto Rico rainforest | Rainforest-thermophile

Table 2. Modified names (inserted in dendrogram) and corresponding names as in IMG/M

Multiple Sequence Alignment

Fossil4 --------------------------------------------------

Oak9 --------------------------------------------------

Air1 --------------------------------------------------

Fossil1 --------------------------------------------------

Cyanobacterial-community --------------------------------------------------

PA2682 --------------------------------------------------

Uranium1 --------------------------------------------------

Rainforest-Thermophiles1 --------------------------------------------------

Acid1 ---------------------------------------------MKKTF

Acid2 --------------------------------------------------

Hypersaline --------------------------------------------------

Cyanobacterial-community2 --------------------------------------------------

O2-minimum --------------------------------------------------

Bison5 -------------------PKGMHMKDLVQDGDSLVAKTAFEDGVDRRVF

Uranium2 --------------------------------------------------

Acromyrmex --------------------------------------------------

Formaldehyde-enrichment2 ------------------------------SQDDHFNSLVPETPIDRRGF

Aquatic_dechlorinating_ --------------------------------------------------

Groundwater-dechlorinating --------------------------------------------------

ANAS --------------------------------------------------

Oak11 -----------------------------------------------MSI

Oak1 --------------------------------------------------

Pseudomonas --------------------------------------------------

PA1166 --------------------------------------------------

Hypersaline2 --------------------------------------------------

Oak14 --------------------------------------------------

PA1597 --------------------------------------------------

Oak18 --------------------------------------------------

Saline1 -----------------------MRVAGFVLILCTLPLLAGCGSDSGSEA

Oak3 --------------------------------------------------

Bison2 --------------------------------------------------

Bison3 --------------------------------------------------

BathHotS1 --------------------------------------------------

BathHotS2 --------------------------------------------------

Bison1 --------------------------------------------------

Bison4 --------------------------------------------------

Sediment_Water_Boiling --------------------------------------------------

Rainforest-Thermophiles2 --------------------------------------------------

Oak17 --------------------------------------------------

Pyrene-degrading --------------------------------------------------

Fossil2 --------------------------------------------------

Rainforest-Thermophiles MADIKKEDIK------QEVFDLYDDYAHNRIDRREFVQKLSLYAVGGLTV

Fossil5 MTPIRKRASD----FHPHILEIFDGYVHGAISKRDFIKQAGKFAAAGVTG

Oak12 MT--RLTAKD----FAPELLELYDGYAHGKINRREFLDRAALFTLGGLTA

Marine_Oil MLMNNQQEENNGHLIPLEAFNWYDEYAHGLIDRRTFIARLSMLVTATLTL

Oak10 ----------------------------------------KRLALTGVAL

Oak7 --------------MDQKFITLFDRFTHGGMNRRTFMEKLTILAGSATAA

Oak4 ---------------------------------------------MKRIG

Oak5 ---------------------------------------------MKRIG

Fossil6 --------------------------------------------------

Fossil3 --------------------------------------------------

Oak2 --------------------------------------------------

Oak13 ------------------------------MCDQDHFDVDKLEFETKGLV

Oak15 -----------------------------------------GLAKSARFR

Aquatic_dechlorinating_2 --------------------------------------------------

Groundwater-dechlorinating1 --------------------------------------------------

Hot --------------------------------------------------

Formaldehyde-enrichment --------------------------------------------------

Rainforest-Soil --------------------------------------------------

Rainforest-Soil2 --------------------------------------------------

PCE-dechlorinating --------------------------------------------------

Oak6 ---------------------------------MTKGLSKGCQKDFPSGA

Oak16 --------------------------------------------------

Air2 --------------------------------------------------

Oak8 --------------------------------------------------

Saline2 --------------------------------------------------

Fossil4 -----------------------LRAPWWPGHRNRIDAGISRLPSPAAFG

Oak9 -----------------------IRVPVTGG---EMSAYMS-LPK---KG

Air1 -----------------------MSATNTTIPALDSEGEIPAYVARPDAD

Fossil1 --------------------------MTNSLTVTTPDGKFDAYVAMPAKL

Cyanobacterial-community --------------------------MKITISSN-YDETFTADLKIPTST

PA2682 ------------------------MGQYVSIAASDGSGRFDAYLALPASG

Uranium1 -------------------------MGQQINIPTSGTQCIGAYMAQASGK

Rainforest-Thermophiles1 -------------------------MGQWTELETP-AGSVAAWQADPPGT

Acid1 YIDVEQGGILMTEKQQMPEFFDRSAISGIAAEECRITSSLGGAFARPKGS

Acid2 ----------------MGEWIER----GLAFKEF------------PAGE

Hypersaline ---------------------------------------------MPVGE

Cyanobacterial-community2 --------------------MTQLKIHTTHIQVPNGDLQIDSYLAQPLEA

O2-minimum -------------------------------------------LRGTTTG

Bison5 LKAAVGSGFAAATLPVMAQSMIQTDTSGLSAGDHIIVINGQDVPVYRAQP

Uranium2 --------------------------------------------MSRA--

Acromyrmex ----------------------------VSAPFPVILVVHEIFGINDY--

Formaldehyde-enrichment2 IAAALAAGFAVTAGPVLG-QAIKTPMDGLEGGDISIGDIPAYYAVPKAG-

Aquatic_dechlorinating_ -----------MAQTKKMHS--------ETVNYKDGETELQGYLVYDENL

Groundwater-dechlorinating -----------MAQTKKMHS--------ETVNYKDGETELQGYLVYDENL

ANAS MKMLVAGLLFCVAMVFPMASGAGAAVRMETYPYGKGEVRLLGQLAWDDAV

Oak11 MTHRGSAILFWLVLIGCLPS-AQAAIQGQAVEYRDGDTVLEGYVAYDDAH

Oak1 --------------------------------------------------

Pseudomonas ---MNMRALLALTLMCSAALAQAAVVTREIPYQDDDGNRLVGYYAYDDAL

PA1166 -----MRLLCALLLIACAASVQAAIQTREMPYRSADGTRMVGYFAYDDSK

Hypersaline2 -----------------------------ISYED-EGVPLTGHLYWDDAI

Oak14 ------------------------------VVYQIDGQSYESRLAFDASH

PA1597 ----------------------MSEIRVEPVAYDIDGQPYEGQLVYDASH

Oak18 --------------------------------------------------

Saline1 ERMAEEHEGDTPTATEAAQAPKIPVEGRTVTYGQQNGTARTGYLAAPADV

Oak3 -----MKRIAIFLALLAFAASAFAAEGRTVTYKSGNDTISAVLYAPAKTM

Bison2 --------------------------MGQRISFNVNGVEVSGYLAEPENL

Bison3 --------------------------MGQRISFSVNGVEVSGYLAEPENL

BathHotS1 --------------------------MGQRISFNVNGVEVSGYLAEPENL

BathHotS2 --------------------------MGQRISFNVNGVEVSGYLAEPENL

Bison1 --------------------------MGQRISFNVNGVEVSGYLAEPENM

Bison4 --------------------------MGQRISFSVNGVEVSGYLAEPENM

Sediment_Water_Boiling --------------------------MGQRISFSINGIEVSGYLAEPENM

Rainforest-Thermophiles2 -------------------------MKTETLQFETANGATTAYVAMPDNA

Oak17 ---------------------------------ASNGEQARGYLSLPSGG

Pyrene-degrading -------------------------MGHLLDFKRPDNTNCRGYLAT-AGQ

Fossil2 -MLLSRLSPDYGMPEQVSFNDPDILASYEKYDSPNGNGEIEGYLVKPTAA

Rainforest-Thermophiles SSLMSFLMPDYKNRTQVKADDPRIQAEYITYASPKGAGTMKGLLCMPSDV

Fossil5 AMILDQLQPNYAWAAQVEPDDPSILSERISYDSPEGHGKIIGLMAKPVGA

Oak12 SALLAALSPNYALAEQVKFTDPDIVADYITYPSPKGNGTVRGYLVRPAKA

Marine_Oil SVLTSALIPNYAKAEQVSFNDQDIIAKYSTFSSPEGHGEGRGYLVLPAYI

Oak10 TAITEGLMPNYALGQQVRKDDERIKATYETVQSPMGNGSIKGYFVRPTSA

Oak7 NALLPLLENNYARADILPEGDPRIVSQTLEYKG--GAG----YFVKPSAE

Oak4 LVLVLLIAPGISAQDWARVKLEKSPRHREWVTVKHEGRAVETFVAYPESK

Oak5 LVLVLLIAPGISAQDWARVKLEKSPRHREWVTVKHEGRAVETFVAYPESK

Fossil6 --------------------------------------TMHNFVAYPERS

Fossil3 ---------------------------------------MSTYHVAP-AE

Oak2 -------------------------MIEKEVRVTSRNGVIPSFAVCPEGP

Oak13 TRRQFGVLLGAGMAMLLPRVVNAVAVTDGEVTITTPDGTCDAYFVHP-AS

Oak15 DDEAFIELLADVDAVIGAAAQPTPRVTTTKIEIATADGKCPSYVFRPEGT

Aquatic_dechlorinating_2 ---TSREPAQEPAVGDEPSPPFPYTAEDVDFGDARAGIRLAGTLTVPEGK

Groundwater-dechlorinating1 ---TSREPAQEPAVGDEPSPPFPYTAEDVDFGDARAGIRLAGTLTVPEGK

Hot -------------------------MEEKVRYKSFDGKEVEAFLVKGGDK

Formaldehyde-enrichment ----------------------MQNPTSEQIEISTADGLMPAVLAHPVVA

Rainforest-Soil ---------------------------MSIQCETVRYGDQVAFFAAPERH

Rainforest-Soil2 ---------------------------------------MPAWLAVPKSS

PCE-dechlorinating ----------------------QAFEEREVVLNAGTDWELPGTLALPVKR

Oak6 NSPILLRTQKPAIDPIHKSEPLMTIKDNEIVEVPTPTGPMRTYVFRPTAE

Oak16 ---------------------------------------VRRLADQEQAA

Air2 --------------------------MGEWITLDTHYGPVRAWQATPEGK

Oak8 MTNRIAGLPGKSKGDRVGAEQRLLPAPGRNRRGRVGEGDADHTLLRDGEQ

Saline2 -----MPGPARTGPLHPEARSRTDRRRFLASAGALGTALLAGCLGDDTES

Fossil4 NR----------PGVLVLQ--------EISGSRLDGRHP-DWLAGEGFTA

Oak9 KG----------PGIVVLQ--------EIFGVNESMRKVCDFLASRQFTA

Air1 SP----------RAIIVIP--------EIFGVNAGIRKKCDDWAAEGYLV

Fossil1 PA----------PVVVVIQ--------EIFGVNPVMRGIADDYAKQGYIA

Cyanobacterial-community PA----------PGLIIIQ--------EIFGVNEVMRNIADRYAQLGYVA

PA2682 KG----------PGVVIGQ--------EIFGVNANMRAVADLYAEEGYVA

Uranium1 PK----------GGLLVIQ--------EIYGVNAHMRSVVDRFARLGYTA

Rainforest-Thermophiles1 PR----------GGLVVIQ--------EIFGVNPHIRAVADGYAAEGYVV

Acid1 G---------PHPAMIVFM--------EAFGLNGFIKDFLRLLAAEGFFA

Acid2 K---------KVPGIILLM--------EAYGVNEHFRRLAARLAGWGYAV

Hypersaline E---------SLPGVVVLQ--------EIFGVNDHIRDVTQRIAQEGYVA

Cyanobacterial-community2 G---------LFPAVVVFQ--------EIFGVNNHIREVTENIAKEGYVA

O2-minimum S---------PGPAGLVIM--------EAFGVDAHIMDVARRLATEGYVT

Bison5 EGR------SNLPVVLVIS--------EIFGVHEHIKDVARRFAKAGYLA

Uranium2 -----------------------------------------RFAKQGYLA

Acromyrmex -----------------------------------IRDICRRLAEAGYLA

Formaldehyde-enrichment2 ---------GRRPVLLVVT--------EIWGLHEYIKDTCRRLAKAGYFA

Aquatic_dechlorinating_ TS--------PAPGVLVVH--------EWMGLNDYAKHRADMLAELGYVA

Groundwater-dechlorinating TS--------PAPGVLVVH--------EWMGLNDYAKHRADMLAELGYVA

ANAS KG--------PRPAVLVVH--------EWWGLNNYARERASALASMGYIA

Oak11 TQ--------PRPGVLVVH--------EWKGLNEYAKRRARQLAELGYIA

Oak1 --------------VLVCH--------EGSGLDRHAKGRAERLAGLGYAA

Pseudomonas DG--------KRPGIVVVH--------EWWGLNDYAKRRARDLAALGYKA

PA1166 PG--------IRPGVIVVH--------EWWGLNDYAKRRARDLAELGYSA

Hypersaline2 AD--------ERPGILVIH--------EWWGLNDYAKKRARMLAELGYVA

Oak14 KG--------PLPGLLMAP--------NWRGVSAGAEEIAKRVAAKGYVV

PA1597 AG--------PRPGLLMAP--------NWMGVSAAALDIARQVAGRGHVV

Oak18 -----------LPGLVVIH--------EWWGLNDDIKAVTRRLAAEGYVA

Saline1 DSVRSARGGDALPGIVVIH--------EWWGLNDNVRAATRRLAGEGYRA

Oak3 KG--------KLPAIVIIH--------EWWGLNDWVQEQASKWADQGYVT

Bison2 QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFYA

Bison3 QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFYA

BathHotS1 QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFYA

BathHotS2 QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFYA

Bison1 QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFYA

Bison4 QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFCA

Sediment_Water_Boiling QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFYA

Rainforest-Thermophiles2 DA--------SKAVILI-H--------EWWGLNDHIKDIANRYAAEGFIA

Oak17 TG----------PGVIVVQ--------EWWGLNPQIKGVADRLASEGFVA

Pyrene-degrading DR----------PGVVVIQ--------EWWGLNDQICGVADRFARAGFNA

Fossil2 TG--------KIPAVLVIH--------ENRGLNPYVKDVAHRVAKAGFLA

Rainforest-Thermophiles KE--------KLAGVIVVH--------ENRGLNPYIEDVARRTALAGFVA

Fossil5 TG--------KLPAVLVIH--------ENRGLNPYIEDVARRMAKAGYLA

Oak12 AG--------KLPAVVVVH--------ENRGLNPYIEDVARRLAKAGFIA

Marine_Oil AN--------KAPVVLVVH--------ENRGLNPYIKDVARRLAKAGFIA

Oak10 DSREAMPT--KLPGVIVVH--------ENRGLNPHIEDVARRFAMENFMA

Oak7 G---------KYPGVIVIH--------ENRGLNPHIKDVARRLAVEGFAV

Oak4 DK---------TPVVLVIH--------EIFGMTDWVEDLADQVAEAGYIA

Oak5 DK---------TPVVLVIH--------EIFGMTDWVEDLADQVAEAGYIA

Fossil6 DK---------APVFIVIH--------ENRGLNEWARSFTDQLAKKGFIA

Fossil3 GKY---------PPVILYM--------DAPGIREELRDFARRIAAQGYFV

Oak2 GAF---------PGIILYM--------DAPGIREELRNLARRIAKHGYFC

Oak13 GTA---------PGVLLWP--------DIFGRRPAMHQMAKRLAESGYSV

Oak15 GPF---------PAVLVFM--------DGLGIRPAILEIGERLSTYGYFV

Aquatic_dechlorinating_2 RP---------FPGVVLVS--------G----SGPQNRDEEILGHRPFLV

Groundwater-dechlorinating1 RP---------FPGVVLVS--------G----SGPQNRDEEILGHRPFLV

Hot AG------------IIVIS--------EIWGLTQFIQNVARRLASLGYTS

Formaldehyde-enrichment RG-----------AVVVVQ--------EAFGVTSHLESICHRLASVGWLA

Rainforest-Soil RES--------LPAVIVVQ--------EVFGLGGHIEDVARRIAAAGYAA

Rainforest-Soil2 TP---------VPGVVVLH--------DVFGMSRDHRNQANWLADAGFLA

PCE-dechlorinating GAS---------PAVVLVHGSGANDRDETIGPNKPFRDLAWGLASQGIVV

Oak6 GR---------YPGILLYS--------EIFQVTGPIRRTAAMLAGHGFVV

Oak16 QD---------QDQVLAAD-----------GVAEDREQRVGQAHHPGDRQ

Air2 PR----------GGLVVIQ--------EIFGANPHIRAVADDYAAKGYAV

Oak8 MDT-------EHADMRAVGDASEADTVAACALDYFSDGPCGRLVRQAVAV

Saline2 TDD--------DGGTDDGGDGG-----TDGGGDDNGGTDDGGDDGDGIDD

Fossil4 LCPDLFWRIEPGIQITDK--------------------TEAELNRAFELF

Oak9 VCPDLFWRAEPGVELK-----------------------ETEFERARALR

Air1 IAPDIFWRFAPGVELNPD--------------------VEAELQEAFGYF

Fossil1 VCPDLFWRIEPGINITDQ--------------------TEEEWKQAFGYY

Cyanobacterial-community IIPDLFWRQEPGIELSAQ--------------------SEDDWKKAFELY

PA2682 LVPDLFWRLQPGVDLG-Y--------------------DEAAFAKAIELF

Uranium1 IAPAFFDHLETGVELD-Y--------------------DRAGTHKGKQLV

Rainforest-Thermophiles1 LAPAFFDPVERGVELG-Y--------------------GEDGFARGRALV

Acid1 VAPDLY-----EGKIYEY--------------------SDFSGAIG--HL

Acid2 LVPDLYRRFPEERRVVAY--------------------SDRETAMG--NL

Hypersaline IAPALYQR-VAPGFETGY--------------------TEADLKIGKEYK

Cyanobacterial-community2 IAPSIYQR-QAPGFEVGY--------------------TEEDIILGRKYK

O2-minimum AAPDLFHR-GGR-LASAP--------------------YDKLAEYRDQLR

Bison5 IAPDLFVR----QGDPTK--------------------IANIADLMKDII

Uranium2 LAPELFVR----QGDAHN--------------------ASSIADLMTNIV

Acromyrmex IAPDLFFR----HSDPAS--------------------FSSPQQLKNELV

Formaldehyde-enrichment2 VANDPYYR----LGELWK--------------------LTQIKEVLAKAN

Aquatic_dechlorinating_ FAVDIYGVNNLPNDM-------------------------QGAAAMAGKF

Groundwater-dechlorinating FAVDIYGVNNLPNDM-------------------------QGAAAMAGKF

ANAS LAADIYGEGFATTDP-------------------------SKARELAGKF

Oak11 FAADMYGKGVLAKDH-------------------------DEAAKLSGVF

Oak1 FALDYHGDG-KPLGR-------------------------DEMMDRLGQL

Pseudomonas LAIDMYGDGKHTEHP-------------------------QDAQAFMAAA

PA1166 LAIDMYGEGKHTEHP-------------------------QDAMAFMQAA

Hypersaline2 FAADMYGNDQVTDQP-------------------------SQAREWMQEV

Oak14 LIADLYGQKVRPSNG-------------------------DEAGAAMMPL

PA1597 LVADLYGRDVRPQNG-------------------------DEAGAAMMPL

Oak18 LAVDLYGGKTAATP---------------------------DAAEALTND

Saline1 LAVDLYGGAVAETP---------------------------DSAQALMGQ

Oak3 LAVDLYRGKVATDR---------------------------DMAHELMRG

Bison2 FAIDLYKGKTADNP--------------------------EDAGKLMMDL

Bison3 FAIDLYKGKTADNP--------------------------EDAGKLMMDL

BathHotS1 FAIDLYKGKTADNP--------------------------EDAGKLMMDL

BathHotS2 FAIDLYKGKTADNP--------------------------EDAGKLMMDL

Bison1 FAIDLYKGKTADNP--------------------------EDAGKLMMDL

Bison4 FAIDLYKGKTADNP--------------------------EDAGKLMMDL

Sediment_Water_Boiling FAIDLYKGKTADNP--------------------------EDAGKLMMDL

Rainforest-Thermophiles2 IAPDLYRGTIATDP--------------------------QEASKLMHGL

Oak17 LAPDLYRGELAGHDE-------------------------MDRAGELMSK

Pyrene-degrading LAPDLYHGRIT--QD-------------------------ANEASHMMNG

Fossil2 FAPDGLSSVGGYPG---------------------------NDAEGKALQ

Rainforest-Thermophiles LAPDALTPLGGYPG---------------------------NDDEGRALQ

Fossil5 LAPDGLSPLGGYPG---------------------------NDDEGRTMQ

Oak12 LAPDGLTSVGGYPG---------------------------NDEKGVELQ

Marine_Oil FAPDILHTLGGYPG---------------------------NDDEGRKMQ

Oak10 FAPDGLTSVGGFPG---------------------------NDFQGGQLF

Oak7 LAPDYLSGLGGTPE---------------------------DADKARDMI

Oak4 VAPDLLSGMGPNGGRSSD--------------------F-AQG-KTMEAV

Oak5 VAPDLLSGMGPNGGRSSD--------------------F-AQG-KTMEAV

Fossil6 VAPDLISNTVEGFEKTTD--------------------F-ENSDAARSAI

Fossil3 LLPDMYYRQGELRFDLSKG---------------KEE----MKR-MFGAM

Oak2 LLPDMYYRLGQLRFDFVRR---------------AEG----MRATMFAAM

Oak13 LVVNPFYRVKKAPTADAGA---------------ATP----IQQLMP-LA

Oak15 LLPDLYYRFGPYAPMDARA---------------IFTDPEKIKELRERFF

Aquatic_dechlorinating_2 LADYLTRRGIAVLRYDDR-----------------------GVGDSKGSF

Groundwater-dechlorinating1 LADYLTRRGIAVLRYDDR-----------------------GVGDSKGSF

Hot LAPNLYSREGDLFSPENISGVMRRFFSIPPEKRGDQEFISRIVAELNERE

Formaldehyde-enrichment VAPALYHRQGSPVFAY------------------------DDLAGVMPVI

Rainforest-Soil IAPDLYAVDGVRPPHMTSARIERAFGVTRSLAPELAEDPVAKAAALARLP

Rainforest-Soil2 LAPDLYYHGGRLICIR-----------------------------HVIRD

PCE-dechlorinating LRYDKRTR--------------------------------VHASQMAGLR

Oak6 VAPEIYHEFEPAGTVLAYD--------------------QAGADRGNALK

Oak16 QQPDARAHRQAQADLP--------------------------GPRPLVLG

Air2 LAPSFFDLAESAPDS--------------------------TEPPELPYD

Oak8 VDQAHSGAVREHLRP-------------------------RGAVGAAVLQ

Saline2 DPVTLERRAREYMQLQGEG---------------------SFEAAFERFA

Fossil4 GLFNQDTGLQDIRTSLSALRAL-------------DACS---GKAGAIGY

Oak9 GKMNDDQVTDDIASAIAFLRKH-------------PACD---GTVGVVGY

Air1 GQYDADDGVKDIEAAIRWLHAQ-------------GA-----GKVGAVGF

Fossil1 QAFNVDKGVEDIAATMAQARKI-------------DGAN---GKVGVVGF

Cyanobacterial-community QGFDENKGVDDLISTMKTLKKL-------------PECS---GQVGTIGF

PA2682 QRIDLDAAVDDIAACIEHLRQR-------------EEVVH--AGIGFVGF

Uranium1 TELGLERALEDVASAAEAIAS---------------AGR-----IGTVGY

Rainforest-Thermophiles1 QELGTARALAILQAAAERLRAD-------------LAARQAPTAVGTVGY

Acid1 SRLKDDVVMEQTRQTLEWLE---------------KRPDVQKDRTGALGF

Acid2 SRLKDEEAKEDISRCLDILR---------------NDPRVDRDRIGVVGF

Hypersaline AQTKAEELLGDIQGAIDYLR---------------EQTPVKSNAIGCIGF

Cyanobacterial-community2 EQTKASELLGDIQATINYLK---------------TLPTVKSDKFGCIGF

O2-minimum VGFSDQTVLSDVEAAVRQLQ---------------IDPGVKG-PIGIVGF

Bison5 SKTPDAQVMSDLDTVVNWAR---------------QRG-GDIERLGITGF

Uranium2 AKVPDAQVMGDLDACVAWAR---------------ENG-GDTSRLGITGF

Acromyrmex GKVADREVLADLDHAANWAA---------------THG-GDLRRLGLTGF

Formaldehyde-enrichment2 S-LADEQAFSDLDAVVAWAG---------------THKRANVARLGITGF

Aquatic_dechlorinating_ KS-DRKLMRQRISLGLDELK---------------KQPNVNVNKIAAIGY

Groundwater-dechlorinating KS-DRKLMRQRISLGLDELK---------------KQPNVNVNKIAAIGY

ANAS RAGDRALLRERVNSALAALK---------------THPLADKGRVAAIGY

Oak11 RN-DRQLMRRRAKAGLEALS---------------KHPLTDPSRLAAIGY

Oak1 MG-DPDRIRAIGRAGLDVLL---------------AQPEVDPGRVAAIGY

Pseudomonas MK-DPAAAAARFDAGLELLK---------------KQPNVNKHQLGAVGY

PA1166 TR-DADAAKARFLAGLELLK---------------RQPQTDPSQIAAIGY

Hypersaline2 TV-DPELWRQRADAGLAQLK---------------AAADVDDAQIAAIGY

Oak14 KN-DRPLLNKRMQAALEQLQGQ-------------AEAAVDTSKLATFGF

PA1597 KN-DRALLRKRMQAALAALRGQ-------------ALAAVDTTRQAAFGF

Oak18 VYADPDGTRRNLQQAYDYLE---------------KYAFAP--RIATIGW

Saline1 AMREPSRLVENVRDGRAYLS---------------SEADAP--RTALLGW

Oak3 LDQE--RAVADMRAGIVYLK---------------SLPNVDGARIGSIGW

Bison2 MQNRLQEAENMVRASLEYFKKENI----------GYVPRVGEFMFGATGY

Bison3 MQNRLQEAENMVRASLEYFKKENI----------GYVPRVGEFMFGATGY

BathHotS1 MQNRLQEAESMVRASLEYFKKENI----------GYVPRIGEFMFGATGY

BathHotS2 MQNRLQEAENMVRASLEYFKKEDI----------GYVPRVGEFMFGATGY

Bison1 MQNRLQEAEDMVRAALEYFKKNDI----------GYVPRVGEFMCGAIGY

Bison4 MQNRLQEAEDMVRAALEYFRKNDI----------GYVPRVGEFMCGATGY

Sediment_Water_Boiling MQNRLDEAEKMVQASLEYFKRENI----------GYVPRVGEFMFGATGY

Rainforest-Thermophiles2 A---IEDGLDTIKNAMDAAR-----------------AKYGITHFGITGY

Oak17 MP-MERAAR-DMSGAVDFLAAH---------------PAVTGAGIGAIGF

Pyrene-degrading LD-FPGATHQDIHGAVTHLQR-------------------ISSQVGVMGF

Fossil2 ATVDGTKLMNDFFAGFEHLM----------------GHEASTGKVGAVGF

Rainforest-Thermophiles AKRNREEILEDFIAGYDYLK----------------KHPKCTGRIGVVGF

Fossil5 RTLDGAKLMEDFFAAFEFLR----------------DHDGSTGKVGAVGF

Oak12 QKVDPTKLMNDFFAAIEWLM----------------HHDSSTGKVGITGF

Marine_Oil SSMDRTKIEADFIAAAKFIK----------------SHPQCSGKLGAVGF

Oak10 MKVDGNKMREDMVAAANWLR----------------SRPDCNGKICATGF

Oak7 GTLTPEGIDSSSSAALAAVK----------------ANPACNGKAGAVGF

Oak4 SHLPPDQITADLNAVVDYAL----------------KLPASNGKLYVTGF

Oak5 SHLPPDQITADLNAVVDYAL----------------KLPASNGKLYVTGF

Fossil6 YGLDPDNVTQDLNAVLKYAK----------------SIKAGNGEIYVVGF

Fossil3 GTLNNALVMDDTRGMLDYLASA--------------PLAKAGP-RGCIGY

Oak2 NSLTNALVMEDTSAWLGFLEAQ--------------DKVKTGP-VGCVGY

Oak13 QALNETTHMSDAKAFVAWLDQQ--------------PSVAKNRKIGTQGY

Oak15 PHASPEKILADTGAFLVWLDSQ--------------PDVKPGG-IGTTGY

Aquatic_dechlorinating_2 QSATTFDFVDDARAALDFLAE---------------QPEVDARRVGVIGH

Groundwater-dechlorinating1 QSATTFDFVDDARAALDFLAE---------------QPEVDARRVGVIGH

Hot KRIYETLVVNRASTEDRMLKDLEHGY--------NYLKSMGISKYGVIGF

Formaldehyde-enrichment QTLTAAGIETDIDSALGYLH----------------ARGFEAHHCATLGF

Rainforest-Soil DGDAIAETEAGIQAAFAGMAGFTASLRKAFRYVVGERPETKGQKVACVGF

Rainforest-Soil2 LMARTGPAFDDVEAGRTWLLS----------------RRECSGRVGVIGF

PCE-dechlorinating DMTVEDEVIHDAVAAVELLRN---------------TDEVDPDRVFVVGH

Oak6 TTKELGSYDSDARAALAYLK----------------ALPVCTGKLGVMGI

Oak16 QLAGEDREEDDVVDAEDDLQTG-----------------EGGQGDEVLGR

Air2 GDDIAGWVGRAREQGSAVFEAK----------------DAQNHRRDYAGA

Oak8 VTAVKRHARQAVAGQSLLLRAY--------------QMLRRGFSHRRVGA

Saline2 ESVAEQVSVADIESGWEQVVQT-------------TGSFESVLAVEFQGI

*

Fossil4 CLGGLLAYRTACHTD--SDASVGYYGVS--------IEDRLAEAAG----

Oak9 CWGGMLAYLTAVRHK--PDAAVGYYGVG--------IEQRLDLAKN----

Air1 CLGGRLAYMTAARTD--IDASVGYYGVM--------IDQMLNESHA----

Fossil1 CLGGLLTFLSATRTD--GDAFAVYYGGG--------MDNYVGEADN----

Cyanobacterial-community CLGGKLAYLMATRSI--AECNVSYYGVG--------IEKNLDEASN----

PA2682 CMGGKLAYLAATRTD--VSCSVGYYGMG--------IEALLDEAKQ----

Uranium1 CWGGTVALLAALRLG--LPS-VSYYGAR--------NLPFLHEV------

Rainforest-Thermophiles1 CWGGSMALLAALRLG--LPS-VSYYGAR--------NLALLDECEREDPR

Acid1 CMGGRLTFLSLTTFPEKLKAGVSYYGGSIGHEGLDGLGRKEV-LSG----

Acid2 CMGGRLAFLSAGWFGEKIKAAVPFYGGGIGAPKGFFPGHTEVPLSL----

Hypersaline CFGGHVAYLAAT-LPDIKATASFYG----AGIATMTPGGNEPTISR----

Cyanobacterial-community2 CFGGHVTYLAAT-LPEIQAAASFYG----AGIATGTPGGGNPTITL----

O2-minimum CLGGRVSFVSAANVPGLAAAAVYYPG--NLVPAADAPTGTIRALEE----

Bison5 CWGGRITWLYAAHNPKVKAGVAWYG----RLTGDATANSPKHPVDV----

Uranium2 CWGGRIAWLYCAHNPAVKAGVVWYG----RLVGDKTALTPLQPLDI----

Acromyrmex CWGGRIAWLYATHNPQLQAAVAWYG----HLHPQITLRQPVTPVDA----

Formaldehyde-enrichment2 CRAGRTIWMYTAHSKRVKAGVAWYG----SLMPFGPNATG--PLDV----

Aquatic_dechlorinating_ CFGGTVVLELARSGA-DIAGVVSFHGG---------LDT--PMPED----

Groundwater-dechlorinating CFGGTVVLELARSGA-DIAGVVSFHGG---------LDT--PMPED----

ANAS CFGGTAVLELARSGA-ELDGVVSFHGG---------LGT--QVPAT----

Oak11 CFGGMTVLELARNGE-PLRGIVTFHGA---------LST--PHPED----

Oak1 CFGGTMALELARSGA-DLGAVVGFHSG---------LGT--QRPAQ----

Pseudomonas CFGGKVVLDAARRGE-KLDGVVSFHGA---------LAT--QTPAK----

PA1166 CFGGKIVLDMARQGL-PLAGVASFHGA---------LGT--ATPAS----

Hypersaline2 CFGGGTVLQMAYGGS-DIDGVVSFHGS---------LPA--APEEV----

Oak14 CFGGCCSLELARTGA-PLKAAVSFHGT---------LDT--PNPAD----

PA1597 CFGGCCALELARDGA-ELKAFVSFHGT---------LDT--PDPAH----

Oak18 DLGGEWSLQTALQYPGALDAAVMYYGR---------GVF--MDRDR----

Saline1 CFGGGMTYRTLAEEASAFDAAVAYYGT---------PDP--LAGEA----

Oak3 CMGGGMSFRLAVGEP-TLKAAVINYG----------GVT--SDPAV----

Bison2 CCGGTCVWYFGSRIE-DFKALVPYYG----------LYK--LAEID----

Bison3 CCGGTCTWYFGSRME-DFKALVPYYG----------LYK--LAEID----

BathHotS1 CCGGTCVWYFGSKIE-DFKALVPYYG----------LYK--LAEID----

BathHotS2 CCGGTCVWYFGSRIE-DFKALVPYYG----------LYR--LAEID----

Bison1 CCGGTCVWYFGSRLE-DFKALVPYYG----------LYK--LAEID----

Bison4 CCGGTCTWYFGSRME-DFKALVPYYG----------LYK--LAEID----

Sediment_Water_Boiling CCGGTCTWYFGSKFE-EFKALVPYYG----------LYK--LANID----

Rainforest-Thermophiles2 CMGGTFSLRAACELE-GVSAAAPFYG----------DIP--DEEV-----

Oak17 CMGGGLVLVLGCLRADKISAVVPFYGV---------LGFDDDNAPD----

Pyrene-degrading CMGGALTIAA-AVHVPALSAAVCFYGI---------P---PQEFAD----

Fossil2 CYGGGVCNALAVAYPE-MGASVPFYGR---------QAS---AAD-----

Rainforest-Thermophiles CFGGWVANMMAVRVPD-LGAAVPFYGG---------QPN---DED-----

Fossil5 CYGGGVC-------------------L---------EP------------

Oak12 CYGGGVANAAAVAYPE-LGAAVSFYGR---------QPE---AKD-----

Marine_Oil CFGGYIVN------------------------------------------

Oak10 CFGGGVANFLGVRLGENLAATAPFYGG---------NPA---LPD-----

Oak7 CWGGGAVNSLAVIDQG-LGAGVAYYGS---------QPA---AAD-----

Oak4 CWGGGQSFRFATNRGD-LAAAFVFYGP----------PP---KSDD----

Oak5 CWGGGQSFRFATNRGD-LAAAFVFYGP----------PP---KSDD----

Fossil6 CWGGSQSFRFATSAGDEIEAAMVFYGT---------GPQ---EASA----

Fossil3 CMSGQYVVSAAGTFPNDFTASASLYGVG--------IVTDQPDTPH---H

Oak2 CMSGRYVTTAAARFGNRFAASASLYGVG--------IVTDAEDSPH---L

Oak13 CMGGPIAFRTAAAVPDRVGAVGSFHGGG--------LVTTTPNSPH---L

Oak15 CMGGMLSLLAAGTYPDRVVAAASFHGAR--------LATDAPDSPH---L

Aquatic_dechlorinating_2 SEGAIVASILAARGAEDGQAADENSKAG--------ARAAFIVLLG----

Groundwater-dechlorinating1 SEGAIVASILAARGAEDGQAADENSKAG--------ARAAFIVLLG----

Hot CMGGGLSFQLSTQLP--FDATVVYYGR---------NPRSIEDISR----

Formaldehyde-enrichment CMGGIVSMYAATRTA--LGAAVTFYGGG--------VATGRFGFPP-LID

Rainforest-Soil CMGGGLSALLACEEE-GLSGAAIYYG----------MPPDPGAAVS----

Rainforest-Soil2 CMGGGFALLLASGHG--FSAASVNYGG-----------PLAKDVED----

PCE-dechlorinating SLGGYLAPRIAAEAG-HVAGVVILAGH-------------VRPLQD----

Oak6 CIGGHLSFRAAMNPEALAGVCFYATDIHKRGLG---KGTHDNTLDR----

Oak16 EQGGEEVGHGAGSSDAPVHVQCRCPGAAR------------PPAAG----

Air2 ALAGGDLYALLSAPAPGLLSWARLNPIG----------------------

Oak8 RFFESVSRQAQQFAPLFLHNIHSKGGTG--------MGKFIELKAS----

Saline2 ESGVAVVRVETAHTLARNTWQVSLNDEG-------ILGSVTTGQEP---Y

Fossil4 ----ISAPLMLHIAGADQFVPAAAQARLH---DGLGSNPHVTLHDYPGKE

Oak9 ----LSCPLMLHYAELDQYASPEVAAKVR---ATYQGDPRVTVWEYPKVG

Air1 ----IAHPLMLHIPTEDHLVDHDAQKKIH---EGLDPHPKVTLHDYQGLD

Fossil1 ----IKQPVIIHLAGNDEYIPAEAQDVIK---DALADHSLTELHFYPGRD

Cyanobacterial-community ----IEHPLMLHIAEKDDYVSPEIQSQLK---VELRNYSLVEIHSYPNVN

PA2682 ----IKGRLVLHFAEQDAYCPQQARDAIL---PCLRNLPKTELYLYPGVD

Uranium1 ----PKAPVLFHFGEKDQHITPEMVQKHR---DALP---QMDVYTYP-AD

Rainforest-Thermophiles1 LVAEPKAMVMFHFGEQDPSIPAEAVAAHR---QRLP---QMPLFVYP-AG

Acid1 -AGRLKSPILLLYGAKDDSIPSEEHGRIAKTLSALDKT--YLLSVYPDAP

Acid2 -VPGIRADLLLLYGGKDDFIPEEERNAVAKALSAANRS--FRMETFPDAG

Hypersaline -TQDITGTIHLFFGLDDASIPAEQVNQIEAELKKHQIA--HQIFRYEGAD

Cyanobacterial-community2 -TPKISGTIYCFFGTEDPLIPIEQVDQIEAELQKHQIK--HRVFRYP-AN

O2-minimum -CGKLDIPIIGFFGNNDANPTPEIVGQLDAELTKLGKQ--HDFNAYDDAG

Bison5 -AQGLKVPVLGLYGGKDTGIPLESVERMKTELAKGNSR--SEIVVFQPSG

Uranium2 -AYTLKTPVLGLYGAQDSSISQDSIDLMWQTLIHAGNH--SMFVVYPDAG

Acromyrmex -AASLTAPVLGLYGALDPMITAENVALMQQALRAANSD--SEIITYPDAG

Formaldehyde-enrichment2 -TDRLNAPVLGLYGGADAGIPLAHVERMRAGLFAFGKDKQSPIHVYPDAG

Aquatic_dechlorinating_ -AKNIKCKVLVCTGGDDPNVPPKQVEAFEKEMRDANVD--WQVKSYGGAV

Groundwater-dechlorinating -AKNIKCKVLVCTGGDDPNVPPKQVEAFEKEMRDANVD--WQVKSYGGAV

ANAS -AGGIKAKVLVLHGADDPSVPPAEVQAFQDEMRKSGAD--WQMVAYGGAV

Oak11 -ARQIKGKVLVLHGAHDTFVGSDEVAVFEADMKAAGVD--YRIIRYPDAV

Oak1 -PGEVKAAILVCIGADDPLVPAEQRAAFEAEMRVAQVD--WRMNLYGGAM

Pseudomonas -PGVVRADILVEHGAADSMVTPQQVEAFKAEMDAAKVN--YQFVSIEGAK

PA1166 -KGSVKAKILVEHGSADSLVPAKDLDALKQELSAAGAD--YRVVIQDGAK

Hypersaline2 -YGKIKPEILVLHGQADSFVAPEVVTNFQDKLEAAGAN--WEMDIYGGAR

Oak14 -AKNIKGSVLVLHGASDPLVPKEQLPAFEDEMNAAGVD--WQLLSYGGAV

PA1597 -ARNIKGAVLVLDGASDPLVPREQLPAFAREMTDAGVD--WQLTSYGGAV

Oak18 -LAPLNVPVLGFYGGDDKSIPVRQVQEFRARLLELGKN--AEVLIVPHAD

Saline1 -LQALETPILAHFGTQDQAVPIDAARKFRDRMEDAGT---SLAYHEYDAG

Oak3 -LGKIHASILGIFGGKDRGIPLDDVTAMAAE-------------------

Bison2 -FSKIKAPVLAVHAGMDAFVPLSEVMEAIQKCNENKVN--AQFLIYAGVD

Bison3 -FSKIKAPVLAVHAGMDAFVPLSEVMEAIQKCNENKVN--AQFLIYAGVD

BathHotS1 -FSKIKAPVLAVHAGMDAFVPLSEVMEAIQKCNENKVD--AQFLIYAGVD

BathHotS2 -FSKIKAPVLAVHAGMDAFIPLSEVMEAIQKCNENKVN--AQFLIYAGVD

Bison1 -FSKIKAPVLAVHAGMDAFVPLSEVMEAIQKCNENKVN--AQFLIYAGVD

Bison4 -FSKIKAPVLAVHAGMDAFVPLSEVMEAIQKCNENKVN--AQFLIYAGVD

Sediment_Water_Boiling -FSKIKAPVLAVHAGMDAFIPLSEVMEAIQKCNENKLN--AQFLIYAGVD

Rainforest-Thermophiles2 -LKKLNVPTVFISGTRDQWINTEKVGQLEDIAERNELP--LESLKYD-AD

Oak17 -WSKLAARVEGHYAETDGFFPVEKVQALEAGLKGLGKQ--VSLHVYPGTG

Pyrene-degrading -PANIRIPFQGHFATQDDWCVPSMVDALEATLQKTGVS--AEIHRYEAA-

Fossil2 -VPKIQAPLMLQYGGLDERVN-AGWPEYEAALKANNKE--YIAHFYEGAN

Rainforest-Thermophiles -VPKIKAPLLIHYAELDTRVN-EGWPAYESALKAHNKE--YTMYMYPKAN

Fossil5 ---------ILTYGG-----------------------------------

Oak12 -VRRIKAPIMLHYGELDTRIN-EGWPAY----------------------

Marine_Oil --------------------------------------------------

Oak10 -VPKIKAAVLVHHGELDTRLA-MAWPAYQAELNKAKIP--NEGHIYPGAV

Oak7 -VPKITAPLLLHYGSLDERID-AGIPAFEAALKAASKT--YEIHMYEGAN

Oak4 -MPRIKAPVYGFYAGNDARIG-ATIPDAEKQMKTADIP------------

Oak5 -MPRIKAPVYGFYAGNDARIG-ATIPDAEKQMKTADIP------------

Fossil6 -YASIKVPVHGFYGGADNRVN-ATIEGSEKAMNTYDKT--FTYEIYDGAG

Fossil3 LASKIKGELYLGFAETDMYVPDNVIPELRAALDQHKVD--YRLDTWPGTE

Oak2 LVDKIKGEMYYGFAEIDEHVPEKVIPTLRQGLDKAGVR--YGLDVFAGAR

Oak13 QAAKTKAQF-----------------------------------------

Oak15 LAPKMKARVYVAGAMEDMFFTDDMKARLEEALTKAGVD--HVVETYP-AR

Aquatic_dechlorinating_2 -GPGVRGDELLLMQSAALGRAMGVSEEQISEANRLNREL-YSIAMSDGDV

Groundwater-dechlorinating1 -GPGVRGDELLLMQSAALGRAMGVSEEQISEANRLNREL-YSIAMSDGDV

Hot ----LKGPVLGIYAGEDSAINNGVPDMVRAMFQYGKEL---EMKIYPKTY

Formaldehyde-enrichment IAPMLQTPWLGLYGDLDKGIPFDDVELLRTAASNAPVP--TTVVRYADAD

Rainforest-Soil ----IRCPVIGFYGAEDGRVN-AGLPAFEAALAVAKAS--FEKRMYPGAG

Rainforest-Soil2 -FLQTACPVVGSYGGLAQWERGVADQLAAALERALVAH---DVKEYPDAG

PCE-dechlorinating -LIVAQTEYLLGLDGELSDEDRAQLEQIEQIKGAIDHLDAAAPGMYFGAP

Oak6 -VKEIQGEMLMIWGRQDPHVPREGRALIYNAMTDAGLN--FTWHEFNGQH

Oak16 LAPRPGYHLAMIISDNETVDLATPAGPMRTHVFRPAAPGRYPGLILYSEI

Air2 -------TLLLPLGAWLTAFAAVMLLSERIFIRWLDYLERVAAIYARGRF

Oak8 -DGHKFAAYLAESSGKARGGVVVIPEIFGVNSHIKQTSDGYAADGYRVVS

Saline2 EWDPPAYAEESAFEEMAVTLDAPGSCELGGTLSVPTGEASVPGVVIVHGS

Fossil4 HAFARPDGL------------HYAADSANLANQRTLDFLHRNLD------

Oak9 HAFARPGGG------------HFDARAADLADMRTLAFLVQKMVGHDKRG

Air1 HGFAATMGN------------RRNEEGAQLADGRTRAFFAEHLA------

Fossil1 HAFARKGGA------------HYDKNDAETANARTAEFFRANLV------

Cyanobacterial-community HGFARIGGK------------DYDLEAANLAHSRTMEFFQKYLG------

PA2682 HAFARVGGM------------HFDKPAYLMAHERSIAALKREIGPNFDLS

Uranium1 HAFNRDGSS------------PYHEASAKLALQRTLAFFEQHLDGA----

Rainforest-Thermophiles1 HAFNRDVDPR-----------AYHEPSARLARERSLEFLAARLGAAA---

Acid1 HGFSCWQRS------------SYREEAAMPAWKLARFFLENTLKNAGK--

Acid2 HGFFCEDRP------------SYHKESADRSEILLREFLDRHLKSAPSRN

Hypersaline HGFFCDQRA------------SYNPQAAKDAWEKVKTLFQQEL-------

Cyanobacterial-community2 HGFFCDHRS------------SYNAQAATDAWIKTKELFDEQL-------

O2-minimum HGFFCDARD------------SYRADAAKDAWAKTLAFFDRHLGGGSATS

Bison5 HAFHADYRP------------SYNEADAKDGWKRALNWFAKHGVAF----

Uranium2 HAFYADYRP------------SYVEADAKDGWRRALAWFKGHGVV-----

Acromyrmex HAFHADYRP------------NYHAESAQDGWQRMLEWFGRYGVAPAHPG

Formaldehyde-enrichment2 HGFHADYRP------------SYRKQDA----------------------

Aquatic_dechlorinating_ HSFTNPASG-----NDNSKGAAYNEKADKRSWEDMKLFFNEIFK------

Groundwater-dechlorinating HSFTNPASG-----NDNSKGAAYNEKADKRSWEDMKLFFNEIFK------

ANAS HTFTNPAAG-----NDPSRGSAYNEKAALRSWEHMKAFFAEIFR------

Oak11 HSFTVPEAG-----DDPSKGMAYNPDADRQSWEAM---------------

Oak1 HSFTNPDAT-----VSDFPGVAYHQPTDERSWRAMLDFFEEVF-------

Pseudomonas HGFTNPDADRLSHGEHGGPDIGYNKAADERSWADMQAFFKKVFK------

PA1166 HGFTNPDAD--AHKGHG-LDIGYDRQADRRSWADLQAFLKDIFGQG----

Hypersaline2 HGFTNPDAG-----DYGIDNLKHDPQADARSWARMQSFFNELFAD-----

Oak14 HSFTDPHAN-------VPGMMMYDAKTAARAFQSMHNLLDEVFKG-----

PA1597 HSFTDPNAK-------LPGKMHYDARTSRRAFQAMDDLLAEVFA------

Oak18 HSFANPSSA------------TYNAQAANEAWTATLAFLERHLKLDTPTR

Saline1 HAFANPSGE------------SYEPAAAEQAWTRTTDFLQTHLTR-----

Oak3 --------------------------------------------------

Bison2 HAFFNDTRP-----------EVYHEEYAKDVWLKTIEFFRTHLL------

Bison3 HAFFNDTRP-----------EVYHEEYAKDVWLKTIEFFRTHLL------

BathHotS1 HAFFNDTRP-----------EVYHEEYAKDVWLKTIEFFRTHLL------

BathHotS2 HAFFNDTRP-----------EVYHEEYAKDVWQKTIEFFRTHLL------

Bison1 HAFFNDTRP-----------EVYHEEYAKDVWQKTIQFFRTHLL------

Bison4 HAFFNDTRP-----------EVYHEEYAKDVWLKTIEFFRTHLL------

Sediment_Water_Boiling HAFFNDTRP-----------EVYHEEYARDVWQKTIEFFRTHLT------

Rainforest-Thermophiles2 HAFFNNTRP-----------EVYNETAARDAWAKVIGFFNDKL-------

Oak17 HAFANETNALG----------TYDADAAQLAWERSVTFLHDNLG------

Pyrene-degrading HGFFNERRGD-----------VYNANAASQAWERMIAFFTRHLD------

Fossil2 HGFHNDS-TPRYD-----------EAQAALAWQRTIDFFGEKLA------

Rainforest-Thermophiles HGFHNDT-TPRYD-----------KESAELAWKRTIDFFNQKLK------

Fossil5 --------------------------------------------------

Oak12 --------------------------------------------------

Marine_Oil --------------------------------------------------

Oak10 HGFNCDA-TPERY-----------NK------------------------

Oak7 HAFNNDTNAARYN-----------KDAAELGWSRTVAFLKKHVS------

Oak4 --------------------------------------------------

Oak5 --------------------------------------------------

Fossil6 HAFMRSGDDPNAE-----------IGNPNVAARNASWERLLNIIKGNQPE

Fossil3 HGFCFPERA------------AYVEAAAEG-------VWKLGL-------

Oak2 HGFQFPERD------------VYDTHAAEASWAKIVAMWDRNLK------

Oak13 --------------------------------------------------

Oak15 HGWVPRDTP------------VHD--------------------------

Aquatic_dechlorinating_2 PTLREKVVR-----------------VMEDPIDSTSDLTEQRRCL-----

Groundwater-dechlorinating1 PTLREKVVR-----------------VMEDPIDSTSDLTEQRRCL-----

Hot HAFATEGGP------------VYNEAAAKDAWDRTVRFFTRILG------

Formaldehyde-enrichment HGFNCDDRP-----------AVYNAVAADDAWQRTLAW------------

Rainforest-Soil HGFFCDDRP------------SYHEGAARDSYWRLLQFFSRVLSG-----

Rainforest-Soil2 HSFMNRHRGYGFLRIVQLRSIGYNEPATMDARRRIVAFFNLHLKEQYSNA

PCE-dechlorinating PAYWVDLRKYDPVKTARALDKPLLILQGERDYQVTMDDFARWQEGLDGQD

Oak6 AFLRDEGYR-------------YDPALAHLGYTLVLQLFRRKLGEG----

Oak16 FQVTGPIRR-------------SAAMLAGHGFVVAVPEGYHELEAPG---

Air2 SVRPLQAMN---------------APAEIRTMARTLDEMA----------

Oak8 PAMFDRAQRN------YATGYSQPEIEAGRAIMQKLDWKQAILDVQAA--

Saline2 GPVDRDGTYGSNKPYKELAWGLASRGIAVLRYDKRTDACDVALSDLTI--

Fossil4 --------------------------------------------------

Oak9 LL------------------------------------------------

Air1 --------------------------------------------------

Fossil1 --------------------------------------------------

Cyanobacterial-community --------------------------------------------------

PA2682 ALWDEHVRHEFDTRDVAATMATMVAEPYVNHVPTLTGGVGQRELSRFYRH

Uranium1 --------------------------------------------------

Rainforest-Thermophiles1 --------------------------------------------------

Acid1 --------------------------------------------------

Acid2 G-------------------------------------------------

Hypersaline --------------------------------------------------

Cyanobacterial-community2 --------------------------------------------------

O2-minimum SR------------------------------------------------

Bison5 --------------------------------------------------

Uranium2 --------------------------------------------------

Acromyrmex S-------------------------------------------------

Formaldehyde-enrichment2 --------------------------------------------------

Aquatic_dechlorinating_ --------------------------------------------------

Groundwater-dechlorinating --------------------------------------------------

ANAS --------------------------------------------------

Oak11 --------------------------------------------------

Oak1 --------------------------------------------------

Pseudomonas --------------------------------------------------

PA1166 --------------------------------------------------

Hypersaline2 --------------------------------------------------

Oak14 --------------------------------------------------

PA1597 --------------------------------------------------

Oak18 PQ------------------------------------------------

Saline1 --------------------------------------------------

Oak3 --------------------------------------------------

Bison2 --------------------------------------------------

Bison3 --------------------------------------------------

BathHotS1 --------------------------------------------------

BathHotS2 --------------------------------------------------

Bison1 --------------------------------------------------

Bison4 --------------------------------------------------

Sediment_Water_Boiling --------------------------------------------------

Rainforest-Thermophiles2 --------------------------------------------------

Oak17 --------------------------------------------------

Pyrene-degrading --------------------------------------------------

Fossil2 --------------------------------------------------

Rainforest-Thermophiles --------------------------------------------------

Fossil5 --------------------------------------------------

Oak12 --------------------------------------------------

Marine_Oil --------------------------------------------------

Oak10 --------------------------------------------------

Oak7 --------------------------------------------------

Oak4 --------------------------------------------------

Oak5 --------------------------------------------------

Fossil6 R-------------------------------------------------

Fossil3 --------------------------------------------------

Oak2 --------------------------------------------------

Oak13 --------------------------------------------------

Oak15 --------------------------------------------------

Aquatic_dechlorinating_2 --------------------------------------------------

Groundwater-dechlorinating1 --------------------------------------------------

Hot --------------------------------------------------

Formaldehyde-enrichment --------------------------------------------------

Rainforest-Soil --------------------------------------------------

Rainforest-Soil2 PTGA----------------------------------------------

PCE-dechlorinating DVTFIRY-------------------------------------------

Oak6 --------------------------------------------------

Oak16 --------------------------------------------------

Air2 --------------------------------------------------

Oak8 --------------------------------------------------

Saline2 --------------------------------------------------

Fossil4 --------------------------------------------------

Oak9 --------------------------------------------------

Air1 --------------------------------------------------

Fossil1 --------------------------------------------------

Cyanobacterial-community --------------------------------------------------

PA2682 HFIHGNPPDMTLTPISRTVGALQVVDEFVMRFTHSCEIDWLLPGVPPTGR

Uranium1 --------------------------------------------------

Rainforest-Thermophiles1 --------------------------------------------------

Acid1 --------------------------------------------------

Acid2 --------------------------------------------------

Hypersaline --------------------------------------------------

Cyanobacterial-community2 --------------------------------------------------

O2-minimum --------------------------------------------------

Bison5 --------------------------------------------------

Uranium2 --------------------------------------------------

Acromyrmex --------------------------------------------------

Formaldehyde-enrichment2 --------------------------------------------------

Aquatic_dechlorinating_ --------------------------------------------------

Groundwater-dechlorinating --------------------------------------------------

ANAS --------------------------------------------------

Oak11 --------------------------------------------------

Oak1 --------------------------------------------------

Pseudomonas --------------------------------------------------

PA1166 --------------------------------------------------

Hypersaline2 --------------------------------------------------

Oak14 --------------------------------------------------

PA1597 --------------------------------------------------

Oak18 --------------------------------------------------

Saline1 --------------------------------------------------

Oak3 --------------------------------------------------

Bison2 --------------------------------------------------

Bison3 --------------------------------------------------

BathHotS1 --------------------------------------------------

BathHotS2 --------------------------------------------------

Bison1 --------------------------------------------------

Bison4 --------------------------------------------------

Sediment_Water_Boiling --------------------------------------------------

Rainforest-Thermophiles2 --------------------------------------------------

Oak17 --------------------------------------------------

Pyrene-degrading --------------------------------------------------

Fossil2 --------------------------------------------------

Rainforest-Thermophiles --------------------------------------------------

Fossil5 --------------------------------------------------

Oak12 --------------------------------------------------

Marine_Oil --------------------------------------------------

Oak10 --------------------------------------------------

Oak7 --------------------------------------------------

Oak4 --------------------------------------------------

Oak5 --------------------------------------------------

Fossil6 --------------------------------------------------

Fossil3 --------------------------------------------------

Oak2 --------------------------------------------------

Oak13 --------------------------------------------------

Oak15 --------------------------------------------------

Aquatic_dechlorinating_2 --------------------------------------------------

Groundwater-dechlorinating1 --------------------------------------------------

Hot --------------------------------------------------

Formaldehyde-enrichment --------------------------------------------------

Rainforest-Soil --------------------------------------------------

Rainforest-Soil2 --------------------------------------------------

PCE-dechlorinating --------------------------------------------------

Oak6 --------------------------------------------------

Oak16 --------------------------------------------------

Air2 --------------------------------------------------

Oak8 --------------------------------------------------

Saline2 --------------------------------------------------

Fossil4 --------------------------------------------------

Oak9 --------------------------------------------------

Air1 --------------------------------------------------

Fossil1 --------------------------------------------------

Cyanobacterial-community --------------------------------------------------

PA2682 FVEIPMLGVVRFRGDRLYHEHIYWDQAGVLVQIGLLDPQGLPVAGVESAR

Uranium1 --------------------------------------------------

Rainforest-Thermophiles1 --------------------------------------------------

Acid1 --------------------------------------------------

Acid2 --------------------------------------------------

Hypersaline --------------------------------------------------

Cyanobacterial-community2 --------------------------------------------------

O2-minimum --------------------------------------------------

Bison5 --------------------------------------------------

Uranium2 --------------------------------------------------

Acromyrmex --------------------------------------------------

Formaldehyde-enrichment2 --------------------------------------------------

Aquatic_dechlorinating_ --------------------------------------------------

Groundwater-dechlorinating --------------------------------------------------

ANAS --------------------------------------------------

Oak11 --------------------------------------------------

Oak1 --------------------------------------------------

Pseudomonas --------------------------------------------------

PA1166 --------------------------------------------------

Hypersaline2 --------------------------------------------------

Oak14 --------------------------------------------------

PA1597 --------------------------------------------------

Oak18 --------------------------------------------------

Saline1 --------------------------------------------------

Oak3 --------------------------------------------------

Bison2 --------------------------------------------------

Bison3 --------------------------------------------------

BathHotS1 --------------------------------------------------

BathHotS2 --------------------------------------------------

Bison1 --------------------------------------------------

Bison4 --------------------------------------------------

Sediment_Water_Boiling --------------------------------------------------

Rainforest-Thermophiles2 --------------------------------------------------

Oak17 --------------------------------------------------

Pyrene-degrading --------------------------------------------------

Fossil2 --------------------------------------------------

Rainforest-Thermophiles --------------------------------------------------

Fossil5 --------------------------------------------------

Oak12 --------------------------------------------------

Marine_Oil --------------------------------------------------

Oak10 --------------------------------------------------

Oak7 --------------------------------------------------

Oak4 --------------------------------------------------

Oak5 --------------------------------------------------

Fossil6 --------------------------------------------------

Fossil3 --------------------------------------------------

Oak2 --------------------------------------------------

Oak13 --------------------------------------------------

Oak15 --------------------------------------------------

Aquatic_dechlorinating_2 --------------------------------------------------

Groundwater-dechlorinating1 --------------------------------------------------

Hot --------------------------------------------------

Formaldehyde-enrichment --------------------------------------------------

Rainforest-Soil --------------------------------------------------

Rainforest-Soil2 --------------------------------------------------

PCE-dechlorinating --------------------------------------------------

Oak6 --------------------------------------------------

Oak16 --------------------------------------------------

Air2 --------------------------------------------------

Oak8 --------------------------------------------------

Saline2 --------------------------------------------------

Fossil4 ------------------------

Oak9 ------------------------

Air1 ------------------------

Fossil1 ------------------------

Cyanobacterial-community ------------------------

PA2682 KLLDESLPSNRLMARWAASEGLGL

Uranium1 ------------------------

Rainforest-Thermophiles1 ------------------------

Acid1 ------------------------

Acid2 ------------------------

Hypersaline ------------------------

Cyanobacterial-community2 ------------------------

O2-minimum ------------------------

Bison5 ------------------------

Uranium2 ------------------------

Acromyrmex ------------------------

Formaldehyde-enrichment2 ------------------------

Aquatic_dechlorinating_ ------------------------

Groundwater-dechlorinating ------------------------

ANAS ------------------------

Oak11 ------------------------

Oak1 ------------------------

Pseudomonas ------------------------

PA1166 ------------------------

Hypersaline2 ------------------------

Oak14 ------------------------

PA1597 ------------------------

Oak18 ------------------------

Saline1 ------------------------

Oak3 ------------------------

Bison2 ------------------------

Bison3 ------------------------

BathHotS1 ------------------------

BathHotS2 ------------------------

Bison1 ------------------------

Bison4 ------------------------

Sediment_Water_Boiling ------------------------

Rainforest-Thermophiles2 ------------------------

Oak17 ------------------------

Pyrene-degrading ------------------------

Fossil2 ------------------------

Rainforest-Thermophiles ------------------------

Fossil5 ------------------------

Oak12 ------------------------

Marine_Oil ------------------------

Oak10 ------------------------

Oak7 ------------------------

Oak4 ------------------------

Oak5 ------------------------

Fossil6 ------------------------

Fossil3 ------------------------

Oak2 ------------------------

Oak13 ------------------------

Oak15 ------------------------

Aquatic_dechlorinating_2 ------------------------

Groundwater-dechlorinating1 ------------------------

Hot ------------------------

Formaldehyde-enrichment ------------------------

Rainforest-Soil ------------------------

Rainforest-Soil2 ------------------------

PCE-dechlorinating ------------------------

Oak6 ------------------------

Oak16 ------------------------

Air2 ------------------------

Oak8 ------------------------

Saline2 ------------------------

CLUSTAL W (1.82) multiple sequence alignment

X. References

Articles & Literature

1. Altschul S. F., Gish W., Miller W., Myers E. W. & Lipman D. J., (1990). Basic

Local Alignment Search Tool, J. Mol. Biol. 215. 403-410

2. Amann, R. I., Ludwig, W. & Schleifer, K. H., (1995). Phylogenetic Identification

and In situ Detection of Individual Microbial Cells Without Cultivation, Microbiol. Rev.

59: 143-169

3. Anzai Y., Kim H., Park J-Y., Wakabayashi H. & Oyaizu H., (2000).

Phylogenetic Affiliation of the Pseudomonads Based on 16S rRNA Sequence,

International Journal of Systematic & Evolutionary Microbiology, 50, 1563-1589

4. Atlas, Ronald M. & Bartha, Richard, (1998). Microbial Ecology, Fundamentals

and Applications, 4th Ed. Benjamin/Cummings Publishing Company, Inc. Citing:

Dagley, S. (1975) Essays in Biochemistry. Vol.2 pp. 81-130

5. Backeljau, T., De Bruyn, L., De Wolf, K. & Jordaens, K. (1996). Multiple

UPGMA and Neighbor-joining Trees and the Performance of Some Computer

Packages. Mol. Biol. Evol. 13(2):309-313

6. Berman H. M., Westbrook J., Feng Z., Gilliland G., Bhat T. N., Weissig H.,

Shindyalov I. N. & Bourne P. E., (2000). The Protein Data Bank, Oxford University

Press, Nucleic Acids Research, Vol. 28, No.1 235-242

7. Beveridge A. J. & Ollis D. L., (1994). A Theoretical Study of Substrate-induced

Activation of Dienelactone Hydrolase, Oxford Journals, Oxford University Press

8. Bruno, W. J., Socci, N. D. & Halpern, A. L., (2000). Weighted Neighbor Joining: A

Likelihood-Based Approach to Distance-Based Phylogeny Reconstruction, Mol. Biol.

Evol. 17(1):189-19

9. Cheah, E., Ashley, G. W., Gary, J. & Ollis, D., (1993). Catalysis by Dienelactone

Hydrolase: A Variation on The Protease Mechanism, J. Mol. Biol., PubMed 16(1):64-

78

10. Duncan, T., Phillips, R. B. & Wagner Jr., W. H., (1980). A Comparison of

Branching Diagrams Derived by Various Phenetic and Cladistic Methods, Systematic

Botany, 5(3): pp.264-293

11. Dybas, M. J., Hyndman, D. W., Heine, R. et al., (2002). Development, Operation

and Long-term Performance of a Full-scale Biocurtain Utilizing Bioaugmentation,

Environmental Science and Technology 36:3635-3644

12. Elzerman, A.W. & Coates, J.T., (1987). Hydrophobic Organic Compounds on

Sediments, Equilibria and Kinetics of Sorption, Sources and Fates of Aquatic Pollutants,

pp. 263-317

13. Fardeau, M. L., Salinas, M. B., L’Haridon, S. et al. (2004). Isolation from oil

reservoirs of novel thermophilic anaerobes phylogenetically related to

Thermoanaerobacter subterraneus: reassignment of T. subterraneus,

Thermoanaerobacter yonseiensis, Thermoanaerobacter tengcongensis and

Carboxydibrachium pacificum to Caldanaerobacter subterraneus gen. nov., sp. nov.,

comb. nov., as four novel subspecies. International Journal of Systematic and

Evolutionary Microbiology, 54:467-474

14. Fetzner S., (2001). Biodegradation of Xenobiotics, Biotechnology, Vol X,

Encyclopedia of Life Support Systems (EOLSS), Department of Microbiology,

University of Oldenburg, D-26111 Oldenburg, Germany

15. Handelsman, J., (2004). Metagenomics: Application of Genomics to Uncultured

Microorganisms, Microbiol. Mol. Biol. Rev., Vol. 68, No.4, p.669-685

16. Harayama, S., Kishira, H., Kasai, Y. & Shutsubo, K., (1999). Petroleum

biodegradation in marine environments. J. Mol. Microbiol. and Biotechnol. 1:63-70

17. Ibanez, J.G., Hernandez-Esparza, M. & Doria-Serrano, C., (2007).

Environmental Chemistry, Fundamentals, Springer Science Business Media, LLC. Page

232 Bioconcentration, bioaccumulation and biomagnification

18. Juhasz, A. L., Naidu, R., (2000). Bioremediation of High Molecular Weight

Polycyclic Aromatic Hydrocarbons: A Review of the Microbial Degradation of

Benzo[a]pyrene, Elsevier, International Biodeterioration & Biodegradation 45 (2000)

57-88

19. Klöpffer, W., (1994). Environmental Hazard Assessment of Chemicals and

Products, Environ. Sci. & Pollut. Res. 1 (2)108-116

20. Lorenz, P., Liebeton, K., Niehaus, F. & Eck, J., (2002). Screening for Novel

Enzymes for Biocatalytic Processes: Accessing the Metagenome as a Resource of Novel

Functional Sequence Space, Curr. Opin. Biotechnol. 13: 572-577

21. Machackova, J., Wittlingerova, Z., Vlk, K., Zima, J. & Linka, A., (2008).

Comparison of two methods for assessment of in situ jet-fuel remediation efficiency,

Water, Air and Soil Pollution 187:181-194

22. Madsen E. L., (2011). Microorganisms and their roles in fundamental

biogeochemical cycles, Elsevier, Cornell University Vol. 22, Issue 3, pages 456-464

23. Markowitz, V. M., Mavromatis, K., Ivanova, N. N., Chen, I. M., Chu, K.,

Kyrpides, N. C., (2009). IMG ER: A System For Microbial Genome Annotation Expert

Review and Curation, Bioinformatics 2009, 25(17):2271-2278

24. Morris, B. L., Lawrence, A. R. L., Chilton, P. J. C. et al. (2003). Groundwater

and its susceptibility to degradation: a global assessment of the problem and options for

management. Early Warning and Assessment Report Series RS.03-3, United Nations

Environment Programme, Nairobi, Kenya.

25. National Research Council, (2007). The New Science of Metagenomics, The

National Academies Press

26. Page, R. & Holmes, E., (1998). Molecular Evolution: A Phylogenetic Approach,

Blackwell Science, Chapter 2, ISBN 0-86542-889-1

27. Pathak, D. & Ollis, D., (1990). Refined Structure of Dienelactone Hydrolase at

1.8Å, J. Mol. Biol., Elsevier, Vol.214, Issue 2, pages 497-525

28. Paul, E. A. & Clark, F. E., (1989). Soil Microbiology and Biochemistry, pp.11-31,

Academic Press, New York

29. Pepper, I. L., Gentry, T. J., Newby, D. B., Roane, T. M. & Josephson, K. L.,

(2002). The Role of Cell Bioaugmentation and Gene Bioaugmentation in the

Remediation of Co-contaminated Soils, Application of Technology to Chemical-mixture

Research, Environ. Health Perspect.110(suppl.6):943-946

30. Philp, J. C., Atlas, R. M. & Cunningham, C. J., (2009). Bioremediation,

Encyclopedia of Life Sciences, John Wiley & Sons, Ltd. Online posting date: 15th

March 2009

31. Potrawfke, T., Timmis, K. N. & Wittich, R-M., (1998). Degradation of 1,2,3,4-

Tetrachlorobenzene by Pseudomonas chlororaphis RW71, Applied and Environmental

Microbiology, Vol. 64, No. 10, p. 3798-3806

32. Puls, R. W., Paul, C. J. & Powell, R. M., (1999). The Application of In-Situ

Permeable Reactive (Zero-Valent Iron) Barrier Technology For the Remediation of

Chromate-contaminated Groundwater: A Field Test, Elsevier, Vol. 14, Issue 8, pp. 989-1000

33. Raghavan, P. U. M. & Vivekanandan, M., (1999). Bioremediation of oil-spilled

sites through seeding of naturally adapted Pseudomonas putida, Elsevier, International

Biodeterioration & Biodegradation 44 (1999) 29-32

34. Rappé, M. S. & Giovannoni S. J., (2003). The Uncultured Microbial Majority.

Annu. Rev. Microbiol. 57:369-94

35. Reineke, W., Mandt, C., Kaschabek, S. R. & Pieper, D. H., (2011). Chlorinated

Hydrocarbon Metabolism, In: eLS. John Wiley & Sons, Ltd: Chichester. DOI:

10.1002/9780470015902.a0000472.pub3

36. Riesenfeld, C. S., Schloss, P. D. & Handelsman, J., (2004). Metagenomics:

Genomic Analysis of Microbial Communities, Annu. Rev. Genet. 2004. 38:525-52

37. Rothmel, R. K. & Chakrabarty, A. M., (1990). Microbial Degradation of

Synthetic Recalcitrant Compounds, Pure & Appl. Chem., Vol. 62, No. 4, pp. 769-779

38. Saitou, N. & Nei, M., (1987). The Neighbor-joining Method: A New Method for

Reconstructing Phylogenetic Trees. Mol. Biol. Evol. 4(4):406-425

39. Saitou, N. & Imanishi, T., (1989). Relative Efficiencies of the Fitch-Margoliash,

Maximum-Parsimony, Maximum-Likelihood, Minimum-Evolution, and Neighbor-joining

Methods of Phylogenetic Tree Construction in Obtaining the Correct Tree.

Mol. Biol. Evol. 6(5):514-525

40. Schlömann, M., (1994). Evolution of chlorocatechol catabolic pathways,

Biodegradation Kluwer Academic Publishers 5: 301-321

41. Schlömann, Schmidt & Knackmuss, (1990) Different Types of Dienelactone

Hydrolase in 4-Fluorobenzoate Utilizing Bacteria, J. Bacteriol., 172(9):5112

42. Schreiber, Hellwig, Dorn, Reineke & Knackmuss, (1980). Critical reactions in

fluorobenzoic acid degradation by Pseudomonas sp. B13, Applied and Environmental

Microbiology, Vol. 39, No. 1, p. 58-67

43. Singh, D. P. & Dwivedi, S. K. (2004) Environmental Microbiology and

Biotechnology, First Edition, New Age International (P) Ltd, Publishers. Page 60

44. Stein, J. L., Marsh, T. L., Wu, K. Y., Shizuya, H. & DeLong, E. F., (1996).

Characterization of Uncultivated Prokaryotes: Isolation and Analysis of a 40-Kilobase-Pair

Genome Fragment from a Planktonic Marine Archaeon, J. Bacteriol. 178: 591-599

45. Technical Report LBNL-63614, (2008). Using IMG-M, Comparative Analysis with

the IMG/M System, Addendum to Using IMG, Department of Energy Joint Genome

Institute, Lawrence Berkeley National Laboratory

46. Thomas, T., Gilbert, J. & Meyer, F. (2012). Metagenomics - A Guide From

Sampling to Data Analysis, Microbial Informatics and Experimentation 2012, 2:3

47. Thompson J. D., Higgins D. G. & Gibson T. J., (1994). CLUSTAL W: improving

the sensitivity of progressive multiple sequence alignments through sequence weighting,

position-specific gap penalties and weight matrix choice, Oxford University Press,

Nucleic Acids Research, Vol.22, No.22, p. 4673-4680

48. U.S. Environmental Protection Agency, (2000). Engineered Approaches to In Situ

Bioremediation of Chlorinated Solvents: Fundamentals and Field Applications, Office

of Solid Waste and Emergency Response, Technology Innovation Office, Washington

49. Zeyaullah, Md, Kamli, M.R., Islam, B., Atif, M., Benkhayal, F.A., Nehal, M.,

Rizvi, M.A. & Ali, A. (2009). Metagenomics - An Advanced Approach For Non-

Cultivable Microorganisms, Biotechnology and Molecular Biology Reviews Vol. 4 (3),

pp. 049-054

Figures-references

Fig. 1:

50. Moiseeva, Solyanikova, Kaschabek, Gröning, Thiel, Golovleva & Schlömann,

(2002). A New Modified ortho Cleavage Pathway of 3-Chlorocatechol Degradation by

Rhodococcus opacus 1CP: Genetic and Biochemical Evidence, Journal of Bacteriology,

Vol. 184, No. 19, p. 5282-5292

Fig. 2:

51. Image of 1ZI6 (Following directed evolution with crystallography: structural

changes observed in changing the substrate specificity of dienelactone hydrolase.

(2005) Acta Crystallogr., Sect. D 61: 920-931, Kim, H.K., Liu, J.W., Carr, P.D., Ollis,

D.L.) created with J.mol (J.L. Moreland, A. Gramada, O.V. Buzko, Q. Zhang, P.E.

Bourne (2005) The Molecular Biology Toolkit (MBT): a modular platform for

developing molecular visualization applications. BMC Bioinformatics 6:21)

Fig. 3:

52. Thomas, T., Gilbert, J. & Meyer, F. (2012). Metagenomics - A Guide From

Sampling to Data Analysis, Microbial Informatics and Experimentation 2012, 2:3

Databases

1) Protein Data Bank (PDB)

http://www.pdb.org

RCSB Protein Data Bank H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N.

Bhat, H. Weissig, I. N. Shindyalov, P. E. Bourne, (2000). The Protein Data Bank,

Nucleic Acids Research, 28: 235-242

2) IMG/M

http://img.jgi.doe.gov/m/

IMG: the integrated microbial genomes database and comparative analysis system

Victor M. Markowitz, I-Min A. Chen, Krishna Palaniappan, Ken Chu, Ernest

Szeto, Yuri Grechkin, Anna Ratner, Biju Jacob, Jinghua Huang, Peter Williams,

Marcel Huntemann, Iain Anderson, Konstantinos Mavromatis, Natalia N. Ivanova

and Nikos C. Kyrpides

3) Clustal W

Websites for Clustal W multiple alignments:

Dendrogram 1 & 3: http://pir.georgetown.edu/pirwww/search/multialn.shtml

Dendrogram 2: http://www.genome.jp/tools/clustalw/