leslie a. knapp department of biological anthropology university of cambridge · 2008-11-14 ·...


Molecular Methods in Anthropology Module

Leslie A. Knapp Department of Biological Anthropology

University of Cambridge


Cengage Learning


Table of Contents

Introduction and Aims
Molecular Anthropology and the Human Genome
Nuclear DNA
Mitochondrial DNA
Sources of DNA and Biological Sample Collection
Invasive versus non-invasive sampling
Ancient/archival DNA
DNA Extraction
In The Lab 1: DNA Extraction
Methods and Application of Molecular Hybridization
In The Lab 2: DNA-DNA hybridization
Restriction Fragment Length Polymorphisms (RFLPs) and DNA Fingerprinting
In The Lab 3: Southern Hybridization
Principles and Applications of the Polymerase Chain Reaction (PCR)
In The Lab 4: PCR
Allele-specific PCR
Advantages and disadvantages of PCR
Gel electrophoresis
In The Lab 5: Gel Electrophoresis
DNA sequencing
In The Lab 6: DNA Sequencing
Repetitive DNA Sequences
Dispersed repeats
Clustered repeats
Identifying individuals with microsatellites
DNA-based Trees and Evolution
Species trees versus gene trees
How are we related to the Neandertals?
Protein Structure and Function
Protein structure
The functional diversity of proteins
mRNA studies
BOX 1: MHC genes, immune response and evolution
Recombinant DNA Technology and Human Evolution
In The Lab 7: DNA cloning
Acknowledgements
Suggested Discussion Questions
Bibliography and Suggested Readings
Glossary

Preface

This supplement is intended to accompany your Wadsworth Anthropology textbook. The enclosed printed copy is provided as a courtesy with the purchase of your book. To best utilize this supplement, please go to the online version at the following web address:

http://www.wadsworth.com/anthropology_d/special_features/ext/molecular_methods/
user name: nucleus
password: cytoplasm

The online version includes color photographs, hot-linked chapter topics, and live weblinks. Plus, you can answer the questions for thought at the end of the chapter online and email your responses to your instructor.


Introduction and Aims

Genetic methods have always played an important part in physical anthropology. Historical figures such as Galton in the 1800s and Landsteiner in the early 1900s represent some of the earliest pioneers of anthropological genetics. Laboratory methods for identifying human blood types such as ABO have been available since 1901, when Landsteiner described this well-known blood group system. Not long after this, laboratory methods for identifying disease-causing genes became available to medical and biological scientists. In Norway, biochemical techniques were developed in the 1930s to determine if newborns had an inherited disease called phenylketonuria (PKU), which could seriously damage the developing nervous system. Eventually, however, biochemical approaches were replaced by techniques that could be used to examine DNA directly, at the molecular level.

The same molecular biology techniques used in medicine can be used to study human variation and, as a consequence, anthropologists now use molecular biology techniques to study humans and to explore evolutionary relationships in humans and their primate relatives. In some cases, molecular methods have been used to supplement fossil evidence. In other cases, molecular biology techniques have changed the way in which anthropologists study modern humans and nonhuman primates.

This module explores how molecular anthropologists use genetic methods and applications to study genetic variation and evolution in humans and nonhuman primates. Accordingly, you will learn about some of the common laboratory methods being used to explore these topics in ways that would have been impossible even 10 years ago. Detailed laboratory protocols can be found in many manuals and papers, including references cited at the end of this module. Specific examples will be drawn from up-to-date research on human evolutionary origins and comparative primate genomics[1] to demonstrate that scientific research is an ongoing process with theories frequently being questioned and re-evaluated.

Molecular Anthropology and the Human Genome

Molecular and biochemical studies of human variation began in the early 1900s, but the practical application of biochemistry and genetics to the field of anthropology did not begin until the 1960s. “Molecular anthropology,” as it was originally named by Emile Zuckerkandl in 1962, described the use of biochemistry to understand human evolution. Since that time, molecular studies of human diversity and evolution have expanded to include molecular genetic investigations of human variation and evolutionary relationships within and between humans and nonhuman primates, as well as the application of molecular genetics to the study of human and nonhuman primate behavior.

[1] See glossary at end of module for definitions of all terms in bold face.


In the 1980s the Human Genome Project set out to obtain a complete description of the human genome by determining the precise DNA sequence of all 46 human chromosomes. The rationale was that fundamental information concerning our genetic make-up would help us understand the role of genes in health and disease, while furthering our scientific knowledge of human genetics in general. Although the Human Genome Project began through the support of the U.S. Department of Health and Human Services, it has now become a major international project, and complementary research programs have been established in the United Kingdom (England, Scotland, Wales and Northern Ireland), France, Japan and many other nations. Coordination of these international efforts has been undertaken by the Human Genome Organization (HUGO), which facilitates discussions of the ethical, legal and social issues of human genome research.

Another of the major goals of the Human Genome Project, and later HUGO, is to study the genomes of nonhuman organisms to determine similarities that may help in understanding health, disease and even evolution. Not surprisingly, most of the effort on nonhumans has been focused on model organisms, such as mice, fruit flies and yeast. However, some researchers have argued that detailed studies of our closest relatives, the primates, are also needed. Currently, efforts are being directed toward particular nonhuman primates such as chimpanzees and rhesus macaques. There is still a great deal to learn about our own genome, as well as that of our close relatives.

Nuclear DNA

Although the precise order of nucleotides and the relative importance of some regions of DNA in humans have yet to be determined, a great deal is known about what makes up the human genome. Based on the hard work of many scientists throughout the world, we now have a blueprint of the human genome. Generally, we know that the human genome is made up of two basic components (Figure 1). The nuclear genome (the genetic material contained in chromosomes) contains DNA inherited from both parents. Nuclear DNA is found only in the cell nucleus and, as a rule, each cell contains just one copy of the nuclear genome, which is made up of approximately 3 billion nucleotide pairs, or base pairs (bp), organized into chromosomes. Surprisingly, only about 20% of the nuclear genome consists of genes and gene-related sequences. We know that genes contain protein-coding segments called exons. But they also contain non-coding regulatory regions and introns. As a consequence, more than 90% of the gene and gene-related sequences are considered non-coding DNA. Non-coding sequences also include pseudogenes, which are genes with deletions, insertions or mis-sense mutations that interfere with the gene’s function. Gene fragments may also arise from unequal crossing-over during meiosis. An even larger part of the nuclear genome consists of what is known as extragenic DNA: repetitive or unique sequences that do not, at present, seem to contain protein-coding information. The repetitive sequences are composed of repeated strings of nucleotides, some of which are dispersed throughout the genome while others are clustered together. (The repetitive sequences are especially numerous and alone make up more than 40% of the nuclear genome; we will discuss them in greater detail later.) Some of these repetitive sequences have been used to study evolution and to determine paternity and relatedness in humans and other primates.

The Human Genome
    mtDNA: 37 genes
    Nuclear DNA: 3 billion bp
        Genes and gene-related sequences (20-30%)
            Coding (<10%)
            Non-coding (>90%)
        Extragenic DNA (70-80%)
            Repetitive (20%)
            Unique sequences (80%)

Figure 1. The Human Genome is composed of the mitochondrial and nuclear genomes.
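The proportions in Figure 1 can be turned into rough base-pair counts. The short Python sketch below is only an illustration: it uses the lower figure of 20% for genes and gene-related sequences, treats the 10% coding share as an upper bound, and uses function and variable names of our own invention.

```python
# Rough arithmetic from the proportions quoted in the text and Figure 1.
# All names here are illustrative; the percentages are approximations.
NUCLEAR_GENOME_BP = 3_000_000_000   # approximately 3 billion base pairs
GENE_RELATED_PERCENT = 20           # text: "about 20%" (Figure 1: 20-30%)
CODING_PERCENT_OF_GENES = 10        # Figure 1: "Coding (<10%)", an upper bound


def coding_upper_bound(genome_bp, gene_percent, coding_percent):
    """Return (gene-related bp, upper bound on protein-coding bp)."""
    gene_bp = genome_bp * gene_percent // 100
    coding_bp = gene_bp * coding_percent // 100
    return gene_bp, coding_bp


gene_bp, coding_bp = coding_upper_bound(
    NUCLEAR_GENOME_BP, GENE_RELATED_PERCENT, CODING_PERCENT_OF_GENES
)
coding_share = 100 * coding_bp / NUCLEAR_GENOME_BP

print(f"Genes and gene-related sequences: about {gene_bp:,} bp")
print(f"Protein-coding DNA: at most about {coding_bp:,} bp")
print(f"Coding share of the nuclear genome: at most about {coding_share:.0f}%")
```

Under these assumptions, no more than about 2% of the nuclear genome codes for protein, which is why so much attention in molecular anthropology goes to non-coding and repetitive DNA.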

Mitochondrial DNA (mtDNA)

The second basic component of the human genome is found in the mitochondria, the small energy-producing organelles in the cytoplasm of a cell. Mitochondria contain their own DNA (also called the mitochondrial genome), which is very similar in humans and nonhuman primates. Mitochondrial DNA (mtDNA) is double-stranded, like nuclear DNA. However, it differs from the nuclear genome in that it forms a closed ring instead of being organized into chromosomes. Almost 90% of this circular molecule, comprising approximately 17,000 base pairs, is made up of protein coding sequences. The mitochondrial genome includes 24 genes that code for RNA molecules (2 ribosomal RNAs and 22 transfer RNAs) and 13 that code for proteins required for energy production. There are no introns, few repetitive sequences and just one non-coding sequence, which initiates transcription and replication of the mitochondrial genome. Other critical differences between the nuclear and mitochondrial genomes concern inheritance and copy number: mtDNA is inherited only from the mother, in the egg’s cytoplasm; recombination does not occur in mtDNA, because it is inherited as an identical copy of the mother’s; and every cell contains multiple copies of mtDNA, since there are many mitochondria in each cell.


The differences in the structure and composition of genes, patterns of inheritance and number of copies of the nuclear and mitochondrial genome in each cell make it possible for scientists to study genetic relationships between distantly, or closely, related organisms. For related species, such as Old World monkeys, apes and humans, molecular geneticists often study regions of the genome that evolve relatively slowly. These regions are usually protein coding DNA sequences that are shared by different species due to common ancestry. Genes like these are considered homologous, or shared due to inheritance from a common ancestor.

Most species, if they are not too closely related, can also be studied using parts of the genome that accumulate mutations at a fairly constant “clock-like” rate. Pseudogenes and non-coding DNA sequences are said to accumulate neutral mutations and, as a consequence, more closely related species should have fewer mutations than more distantly related species. (In other words, pseudogenes and non-coding sequences should be more similar in closely related species than they are in more distantly related ones.) However, based on detailed studies of homologous DNA sequences in different species, it is now clear that rates of mutation differ between lineages and at different times in an organism’s history. The rates of change can be influenced by natural selection operating on the genes themselves or on nearby regions of the genome. Rates of change are also affected by generation length, rates of DNA repair and even the nucleotide sequence itself.

Although “molecular clock” studies of modern humans suggest that “mitochondrial Eve,” our common female ancestor, lived approximately 200,000 years ago in Africa, recent studies in other species indicate that molecular clocks do not work perfectly. Based on comparative studies in different species, it is now clear that molecular clocks tick at different rates in different species and at different times. For example, one of the genes involved in hemoglobin production, called alpha globin, has evolved 10 times faster in baboons than in rhesus macaques (Shaw et al., 1989). Another example comes from langurs: when compared to other Old World monkeys, they show a 2.5-fold increase in the rate of nucleotide substitutions in the gene for a digestive enzyme (Messier and Stewart, 1997).
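To see why unequal rates matter, consider the simplest form of a molecular clock, in which the time since two lineages split is estimated as T = d / (2r), where d is the proportion of sites that differ and r is the substitution rate per site per year along each lineage. The Python sketch below is a toy illustration under stated assumptions: the rate and split date are invented numbers, the function names are our own, and only the 10-fold rate difference is borrowed from the baboon example above.

```python
def divergence_time(distance, rate):
    """Strict-clock estimate: T = d / (2 * r), assuming both lineages share rate r."""
    return distance / (2.0 * rate)


def expected_distance(time_years, rate_a, rate_b):
    """Substitutions per site accumulated along both lineages since the split."""
    return time_years * (rate_a + rate_b)


# Hypothetical values, chosen only to make the arithmetic easy to follow.
ASSUMED_RATE = 1e-9        # substitutions per site per year
TRUE_SPLIT = 10_000_000    # years since the two lineages diverged

# Case 1: both lineages really do evolve at the assumed rate.
d_equal = expected_distance(TRUE_SPLIT, ASSUMED_RATE, ASSUMED_RATE)

# Case 2: one lineage evolves 10 times faster than the other.
d_fast = expected_distance(TRUE_SPLIT, ASSUMED_RATE, 10 * ASSUMED_RATE)

estimate_equal = divergence_time(d_equal, ASSUMED_RATE)
estimate_fast = divergence_time(d_fast, ASSUMED_RATE)

print(f"Equal rates: estimated split {estimate_equal:,.0f} years ago")
print(f"One fast lineage: estimated split {estimate_fast:,.0f} years ago")
```

With these made-up numbers the clock recovers the true date when rates are equal, but overestimates it 5.5-fold when one lineage evolves 10 times faster and the analyst still assumes the slower rate for both.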

When organisms are quite closely related, it may be difficult to identify significant evolutionary differences between individuals. Therefore, scientists often choose to study regions of the genome that accumulate mutations rapidly. For example, if we want to investigate evolutionary relationships between different human populations, rapidly mutating regions of the mitochondrial genome can be studied. Rates of divergence are 1.5 to 5 times greater in these segments than they are in protein coding genes. The mitochondrial genome is also useful for evolutionary studies because we have a complete picture of the DNA sequence and order of genes in human mitochondria. Furthermore, there is very little difference between the mitochondrial genomes of humans and most nonhuman primates. Thus, although comparisons can also be made between humans and other primate species using mitochondrial genes, the examples of langur digestive enzymes and baboon alpha globin demonstrate that molecular clocks do not have the same rate in all species. As a consequence, divergence dates derived using molecular clocks should be accepted with caution (see Graur and Martin, 2004).

Sources of DNA and Biological Sample Collection

One of the most hotly debated issues in anthropology has been the origin of anatomically modern humans. Since the fossil record for this time period is not complete enough to support any one scenario, two contrasting models were proposed. The multi-regional model argues that ancestral populations of Homo erectus gave rise to all modern Homo sapiens in the Old World. According to this model, modern humans would be genetically very similar to one another since there would have been extensive gene flow for a very long time. Alternatively, the single origin model argues that Homo sapiens originated as a single population in Africa and, as a consequence, only African populations would exhibit extensive genetic variation.

While fossil evidence supporting one of these two models may be discovered eventually, the study of DNA in modern human populations can provide new insight into recent evolutionary events in human history. Mitochondrial DNA from Africans, Asians and Europeans lends support to the single origin (Out of Africa) model in two ways. First, modern human genetic variation is generally small. Second, Africans display the greatest degree of genetic variation.

Studies of DNA can also be used to examine more ancient evolutionary events, such as the divergence of apes and humans. Using DNA from modern humans and apes such as chimpanzees, gorillas and orangutans, anthropologists have demonstrated that humans and chimpanzees are more closely related to one another than either is to gorillas or orangutans. The DNA data can also be used to construct a molecular clock that estimates the human/chimpanzee divergence at a little more than 5 million years ago.

These two examples demonstrate how molecular studies provide important insight into human and primate evolutionary history. As you will see, studies of DNA are also useful for identifying individuals and for determining how closely related individuals are within modern human and nonhuman primate populations. However, the success or failure of molecular genetic analyses depends on collecting the most suitable biological sample and storing it in such a way as to minimize damage or degradation. In general, a biological sample is simply a specimen that contains nucleated cells and, therefore, DNA. All cells with a nucleus contain a copy of an individual’s genome, but not all samples are equally useful for molecular genetic studies.

Invasive versus non-invasive sampling

For most studies, an ideal sample would be fresh blood or tissue since they contain relatively large numbers of nucleated cells and yield high concentrations of good quality DNA. Whole blood samples contain red and white blood cells and platelets suspended in a watery fluid called plasma. Plasma, which makes up more than 50 percent of total human blood volume, contains proteins required for clotting and immunity, but these substances do not contain DNA. Neither do red blood cells or platelets (essential for blood clotting). But the white blood cells, well known for their role in immune defense, do contain nuclei with DNA and they are excellent sources of DNA.

Whenever blood samples are collected for genetic studies, the blood must be mixed with anti-coagulants to prevent clotting because when blood samples clot, it is difficult to separate out the white cells. However, the problems associated with clotting can be avoided when tissue samples are available. Nearly all tissue cells (such as liver, muscle or skin cells) contain a nucleus, with a complete copy of an individual’s genome, and therefore they can provide abundant sources of DNA. Unfortunately, blood and tissue samples (derived through invasive techniques) are often difficult to obtain except when researchers are based in close proximity to clinical settings, where blood and tissue samples can be obtained safely and without discomfort to study subjects. When studying nonhuman primates in the field, researchers must sedate the animal, collect the blood or tissue sample and then ensure that the animal will recover in a safe location, away from predators or other dangers. Blood and tissue samples also require proper storage in refrigerators or freezers to prevent degradation of DNA. In field studies of human populations, blood samples can be obtained by pricking a subject’s finger with a needle and collecting the blood sample on sterile filter paper. This approach eliminates the need for immediate refrigeration, but typically yields low concentrations of DNA and often causes minor discomfort to study subjects.

Non-invasively collected samples generally do not contain large numbers of nucleated cells, but they are usually much easier to collect. In the field it is often possible to obtain cells scraped from the inside of a subject’s cheek. Often, cheek cells yield poor quality DNA due to the presence of salivary enzymes that break down cells in the mouth. Particular types of foods, drinks and activities (such as gum chewing or smoking) also have a negative effect on DNA yields. Interestingly, cheek cells can also be collected from nonhuman primates by rinsing the surface of wads of vegetation (called wadges) that have been chewed and spat out. These samples have been particularly useful for studies in chimpanzees.

Hair follicles also contain nucleated cells and these provide another way to collect DNA-containing samples without much discomfort for study subjects. Ideally, hairs are plucked to ensure that cells are numerous and fresh. Hairs that have been shed are a poor source of DNA since only a small number of cells are attached to the hair shaft and these cells are in the process of degradation. There are also problems with contamination since there can be more DNA from the individual collecting the sample than from the individual that shed the hair. Wearing gloves during sample collection may reduce the possibility of contamination, but molecular genetic studies of shed hairs are notoriously difficult. Thus, even though this approach avoids disruption of the study subjects, it requires a great deal of effort to obtain accurate results. Consequently, new techniques for using shed hairs for molecular genetic research are still in development.


For studying many nonhuman primates, it is not possible to obtain food wadges or hairs. Instead, researchers can only collect waste products from animals. Urine and feces may seem useless, but actually they contain significant numbers of nucleated cells from the individual. Also, urine and feces are plentiful and they are ideal for those who do not want to disturb their study subjects or cannot get close enough to collect any other type of sample. Urine samples can even be caught in containers in mid-air from animals overhead! (Give this a try some time if you are feeling adventurous.) And, although it is not nearly as exciting, fecal samples can be collected directly off the ground. Scientists who use feces suggest that the surface of the sample provides the most cells. However, when the animal is a carnivore, there may be contamination from the cells of digested prey animals.

Recently there has been an increase in the number of studies relying on urine and fecal samples, even though many researchers emphasize the difficulties in obtaining DNA and repeatable results from these sources. Carefully controlled DNA-based studies of chimpanzee feces have shown that the low DNA content of fecal samples can lead to incorrect results (see Morin et al., 2001). Additionally, the presence of microbial DNA in wild gorilla feces can lead to parent-offspring mismatches and even incorrect paternity determinations (see Bradley and Vigilant, 2002).

Ancient/archival DNA

Archival samples, such as skins and teeth from museum collections, may provide DNA for molecular genetic studies, but they present problems that are similar to those described for non-invasively collected samples. First, many museum skins have been treated with preservatives that destroy cells and degrade DNA. Second, even when skins have not been treated, much of the DNA has been degraded by high storage temperatures and normal aging over time. Researchers may also find DNA-containing cells within the pulp or dentin of teeth, but this material degrades rapidly and few cells will contain significant amounts of nuclear DNA.

More ancient samples, such as Neandertal or even modern human bone, pose similar problems since very few cells may be found in this material and most of the DNA will be highly degraded. Moreover, the problem of contamination from the researchers themselves is even more pronounced than with non-invasively collected samples like shed hair. Because only minute amounts of ancient DNA can be obtained from these samples, studies of ancient DNA must be conducted in isolated, specially sealed rooms that do not allow the introduction of modern DNA. Few laboratories have the space, or resources, to undertake studies of ancient samples, and those that do frequently bemoan the costs and difficulties associated with these studies.

In addition to the scarcity of DNA, archival and ancient samples are problematic because the sample is destroyed during the DNA extraction process. Consequently, the museums entrusted with the protection of these valuable and unique specimens must impose very strict guidelines and restrictions. To deal with the problems and limitations involved in the use of these samples, new techniques are currently being developed in a number of laboratories throughout the United States and in other countries.

Whatever type of biological sample is collected, permits for collection and transport will be required. For example, because of ethical concerns, permission must be obtained to collect human samples. This may involve a comprehensive risk assessment, a review of the research project’s aims and a plan for obtaining informed consent from all study subjects. In the United States, scientists studying the genetic history of human populations must obtain meaningful informed consent from people who donate DNA samples and there can be no record of medical or personally identifying information about the donors. In the United Kingdom, the recently developed Human Tissue Act also requires that researchers obtain permission for every specific use of a biological sample. When blood, tissue or hair samples are collected from primates, export and import permits must be obtained both from local government agencies and through the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). Strict regulations by CITES aim to prevent smuggling of animal parts and, as a consequence, scientists involved in research on animals, particularly endangered species, must justify their research and the need for biological samples such as blood, hair or feathers. Importation of fecal and urine samples is not so strictly regulated and, as a rule, researchers need only obtain permits from agricultural officials.

DNA extraction

As you have already learned, although DNA can be extracted from a variety of sources, white blood cells are generally the best sources of DNA. But today, using various chemical and physical methods, scientists can obtain DNA from almost any biological sample (see In The Lab 1).

In The Lab 1: DNA Extraction

If whole blood is treated to prevent clotting and then permitted to stand in a container, the red blood cells, which weigh the most, will settle to the bottom and the plasma will remain at the top. The white blood cells and platelets will remain suspended between the plasma and the red blood cells. A centrifuge (see Figure 2), a device that spins the tubes at extremely high rates of speed, may be used to hasten this separation process. White cells can be centrifuged out, then mixed in a buffered, soapy, saline solution that breaks down the fatty cell membrane, and splits the cells open to release their DNA-containing nuclei. The nuclei, in turn, can then be broken open, with more soapy solution to dissolve the nuclear membrane. When a high salt concentration solution is then added to the mixture, the DNA dissolves. Eventually, the DNA is precipitated out of solution by adding cold ethanol and, as a consequence, it condenses, becomes visible, and looks like pieces of whitish thread (see Figure 3). The precipitated DNA is then collected using a sterile glass hook, dried, and placed in sterile water to dissolve into solution. This procedure can yield strands of nuclear and mitochondrial DNA and many researchers use this technique to extract both types at the same time.


Figure 2a-b: A centrifuge separates substances with different densities. The centrifuge in Figure 2a (close-up in 2b), used to spin small volumes of solution (<2 millilitres) contained within small tubes, is known as a microcentrifuge. Other centrifuges can be used to spin large volumes, up to 50 millilitres.

Figure 3: Genomic DNA is precipitated in ethanol. While individual strands of DNA are not visible by eye, large quantities will condense during extraction and will be visible as a large white mass (see arrow left).


Methods and Application of Molecular Hybridization

A technique called DNA hybridization has been used for at least 20 years to estimate genetic distance between humans and nonhuman primate species. DNA-DNA hybridization relies on the double-stranded (i.e., duplex) nature of DNA and the fact that nucleotides on the two complementary strands are held together by hydrogen bonds. Specifically, adenine pairs with thymine using two hydrogen bonds and guanine pairs with cytosine using three hydrogen bonds. When DNA is heated to temperatures greater than 94°C, the hydrogen bonds, the weakest links in DNA, are broken and the two DNA strands separate. This process (called denaturation) yields two intact, but separate, single strands. When the single strands are cooled, complementary nucleotides will rejoin, or anneal, to form double-stranded duplexes. Duplexes formed from a single sample will have perfectly matched complementary strands, called homoduplexes. However, when duplexes are formed from the DNA of two different species, incorrect nucleotide pairing will occur, resulting in the formation of heteroduplexes.
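The pairing rules just described (two hydrogen bonds for each adenine-thymine pair, three for each guanine-cytosine pair) can be illustrated with a few lines of Python. This is a simplified sketch rather than a model of real hybridization: the two short sequences are invented, the function names are our own, and strand orientation is ignored by writing the second strand so that position i pairs with position i of the first.

```python
# Watson-Crick pairing rules from the text: A-T (2 bonds), G-C (3 bonds).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand):
    """Return the base-by-base complement of a strand (orientation ignored)."""
    return "".join(COMPLEMENT[base] for base in strand)


def describe_duplex(strand_a, strand_b):
    """Count hydrogen bonds and mismatches between two aligned strands."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have the same length")
    bonds = 0
    mismatches = 0
    for base_a, base_b in zip(strand_a, strand_b):
        if COMPLEMENT[base_a] == base_b:
            bonds += HYDROGEN_BONDS[base_a]
        else:
            mismatches += 1   # mispaired bases contribute no bonds here
    return bonds, mismatches


# Hypothetical six-base sequences from two species, differing at one position.
species_1 = "GAATTC"
species_2 = "GAGTTC"

homoduplex = describe_duplex(species_1, complement(species_1))
heteroduplex = describe_duplex(species_1, complement(species_2))

print("Homoduplex (bonds, mismatches):", homoduplex)
print("Heteroduplex (bonds, mismatches):", heteroduplex)
```

In this toy case the heteroduplex has one mismatch and fewer hydrogen bonds than the homoduplex, which is the intuition behind using duplex stability as a measure of genetic difference.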

In a classic study, Sibley and Ahlquist (1984) used DNA-DNA hybridization to assess the evolutionary relationships between humans, chimpanzees and gorillas. They reported that hybridization experiments demonstrated that humans and chimpanzees are more closely related to one another than either is to gorillas. Although DNA-DNA hybridization has improved our understanding of genome structure in many species, including primates, the approach has been criticized because the technique does not provide specific detail about nucleotide mismatches and genome size. But, at the same time, recent studies have supported the work of Sibley and Ahlquist (see Li et al., 1987 and Wildman et al., 2003).

In The Lab 2: DNA-DNA Hybridization

The human and non-human primate genomes contain large segments of repetitive DNA and these sequences reassociate most rapidly after denaturation. For DNA-DNA hybridization studies, repetitive DNA is removed through chemical means and the remaining sequences are mixed under conditions that favor duplex formation. When mixtures from different species are created, it is possible to determine the degree of genetic divergence between the species, since heteroduplexes containing the largest number of nucleotide mismatches will disassociate most rapidly. The temperature at which homoduplexes denature minus the temperature at which heteroduplexes denature, the delta-T, yields an overall estimate of genetic difference between two species. Closely related species, with very similar DNA, will have fewer mismatches while more distantly related species, with different DNA, will have more.

Restriction Fragment Length Polymorphisms (RFLPs) and DNA Fingerprinting

The discovery of restriction endonucleases, which can cleave duplex DNA composed of particular nucleotide sequences, revolutionized molecular genetics. One well-known example of a restriction endonuclease is EcoRI (named after the bacterium Escherichia coli, from which it was first isolated). EcoRI cuts double-stranded DNA wherever the nucleotide sequence GAATTC occurs (Table 1). Several hundred restriction endonucleases, each recognizing a different nucleotide sequence, have been isolated from bacteria. Cutting (or restricting) DNA with endonucleases enables scientists to detect differences in nucleotide sequence (polymorphisms) between individuals that result from substitutions, insertions, deletions or rearrangements of DNA. The resulting restriction fragment length polymorphisms (RFLPs) can then be visualized by separating the fragments using gel electrophoresis (see below). After the initial RFLP assay, it is possible to use hybridization to detect a single gene, a specific stretch of nucleotides or even a repetitive sequence. This technique is known as “Southern hybridization” (see In The Lab 3).
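The way a single substitution changes a restriction fragment pattern can be sketched in a few lines of Python. This is only an illustration: the two allele sequences are hypothetical, only one strand is modeled, and the cut position follows the G⇓AATTC notation of Table 1.

```python
ECORI_SITE = "GAATTC"   # recognition sequence given in the text
ECORI_CUT_OFFSET = 1    # EcoRI cuts after the first G (G⇓AATTC)

def digest(sequence, site=ECORI_SITE, cut_offset=ECORI_CUT_OFFSET):
    """Return the fragments produced by cutting one strand at every site."""
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return fragments

# Hypothetical alleles: allele B carries one substitution (A -> G)
# that destroys the second EcoRI site.
allele_a = "TTGAATTCAAGGCCGAATTCTT"
allele_b = "TTGAATTCAAGGCCGAGTTCTT"

lengths_a = [len(f) for f in digest(allele_a)]
lengths_b = [len(f) for f in digest(allele_b)]
print("Allele A fragment lengths:", lengths_a)
print("Allele B fragment lengths:", lengths_b)
```

Because allele B has lost a cut site, it yields one long fragment where allele A yields two shorter ones. That difference in fragment lengths is what appears as an RFLP on a gel.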


In The Lab 3: Southern Hybridization

Following gel electrophoresis (see In The Lab 5), the duplex DNA fragments are chemically denatured and fixed onto a nylon membrane where they are exposed to a single-stranded “probe” under conditions that allow complementary nucleotide sequences to form duplexes. Usually, the conditions allow duplex formation only between the “probe” and its complementary sequences. Detection of hybridization involves the use of radioactivity or fluorescence. If the probe represents a single, unique nucleotide sequence, only one or two fragments will be identified. In contrast, probes representing repetitive sequences may yield very complex fragment patterns that are also known as “DNA fingerprints” because the complex restriction endonuclease/probe patterns differ between most individuals.

Table 1: Restriction endonucleases cut specific double stranded DNA sequences. These two examples show how the cuts can be straight (CltI) or staggered (EcoRI), depending on the DNA sequence.

Restriction endonuclease      Recognition sequence (cut site=⇓)      Double-stranded DNA after restriction endonuclease cutting
CltI (Caryophanon latum)      GG⇓CC                                  GG   CC
                                                                     CC   GG
EcoRI (Escherichia coli)      G⇓AATTC                                G       AATTC
                                                                     CTTAA       G

Figure 4: Most individuals will have different DNA fingerprints. Seven related pigtailed macaques (lanes 1-7, left) have similar, but not identical, DNA fingerprints when “probed” with a repetitive sequence.

The term “DNA fingerprinting” is usually associated with a technique introduced by Alec Jeffreys, of the University of Leicester in the United Kingdom, in which unique restriction endonuclease/probe patterns distinguish most or all individuals (except monozygotic twins). Originally applied to the study of humans, Jeffreys’ probes can also be used to identify unique DNA patterns in species that range from apes and monkeys to birds and fish. The complex patterns characteristic of DNA fingerprints provide a simple means of establishing genetic identity in forensic and criminal investigations, as well as assessing parentage since DNA fingerprint patterns are the product of maternal and paternal contributions.

Unfortunately, the complex patterns of DNA fingerprints also make them difficult to interpret and the assignment of particular patterns to specific genetic loci is not currently possible. Also, DNA-based forensic analyses are not foolproof and many weaknesses have been challenged in court. One famous example comes from the 1994 O.J. Simpson trial. When DNA tests, conducted by the California Department of Justice, were used to compare droplets of blood found at the crime scene and Simpson’s own blood, it was reported that there was so much similarity that only 1 person in 57 billion could have produced an equivalent match. Similar analyses at other crime labs confirmed the results, but Simpson’s lawyers were able to raise doubts about the blood storage and handling and they ultimately persuaded the jury that Simpson was “Not Guilty.” During the last 10 years, DNA testing has become more sophisticated and accurate and, as a consequence, DNA evidence has been used to convict criminals and exonerate incarcerated individuals, even defendants on “Death Row” (see below).

Principles and Applications of the Polymerase Chain Reaction (PCR)

The PCR technique involves three steps that essentially mimic the natural process of DNA replication in vivo (see In The Lab 4). When all of the components of the PCR reaction mixture are together in one tube, the original template is melted and an enzyme called polymerase makes two new strands, doubling the amount of DNA present. This process is repeated 20 to 40 times, each cycle providing two new templates for the next cycle. Thus, the amount of amplification is 2 (the number of templates) raised to the nth power, where n represents the number of cycles that are performed. The polymerase reaction provides an extremely sensitive means of amplifying small quantities of DNA. The development of this technique resulted in an explosion of new techniques in molecular biology (and a Nobel Prize for Kary Mullis in 1993) as more and more applications of the method have been published.

In The Lab 4: PCR

First, double-stranded DNA is denatured into single strands by heating. Second, short stretches of single stranded DNA sequences (about 18-25 nucleotides long), called primers, are annealed to sites that flank the DNA sequence of interest. Third, using heat-resistant DNA polymerase, new complementary single strands of DNA are synthesized between the flanking primers using the original single strands as a template. Following just one round of these three steps, the targeted region of interest has been duplicated. Each repetition of the three steps doubles the number of copies of template and after 30 cycles about a billion copies can be produced.

Cycle number      Number of DNA copies after cycle
1                 2
2                 4
↓                 ↓
30                1,073,741,824

Figure 5: PCR is used to make many copies of a particular DNA sequence of interest. When a PCR experiment begins with one copy of a DNA sequence, there will be two copies after the first round of the PCR cycle, four copies after the second round and more than one billion copies after 30 rounds of PCR cycles. The illustration shows how copy number increases geometrically.
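The copy numbers quoted for PCR follow directly from repeated doubling. The short Python sketch below reproduces them under the idealized assumption that every cycle doubles every template perfectly; the function name is illustrative only.

```python
def copies_after(cycles, starting_copies=1):
    """Ideal number of DNA copies after a given number of PCR cycles,
    assuming every template is duplicated in every cycle."""
    return starting_copies * 2 ** cycles

for n in (1, 2, 30):
    print(f"After cycle {n}: {copies_after(n):,} copies")
```

Real reactions fall short of perfect doubling, especially in later cycles, so these values are best read as an upper bound.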


In The Lab 4: PCR (continued)

Figure 6a-c: PCR is a common technique in most genetic labs. a) PCR tubes are small, because reaction volumes can be minimal; b) to avoid contamination, PCR reactions are set up in isolated areas like this safety hood; c) each of these three thermal cycling machines uses computer-based temperature controls to raise and lower temperature during PCR.

Allele-specific PCR

As a rule, PCR is used to identify particular DNA sequences and allele-specific PCR is possible when the primers are designed to anneal to just one DNA sequence at a given genetic locus. This approach is used to identify alleles that differ by one or more nucleotides within the primer annealing site. For example, individuals with normal hemoglobin have a different DNA sequence from those with the sickle cell mutation and these differences can be identified using allele-specific PCR.

Advantages and Disadvantages of PCR

While a very powerful technique, PCR can also be very tricky. Primer design is extremely important for effective amplification. That is, the primers for the reaction must be very specific for the template to be amplified. Cross-reactivity with non-target DNA sequences results in non-specific amplification of DNA. Also, the primers must not be capable of annealing to themselves or each other, as this results in the very efficient amplification of short nonsense DNAs. The reaction is also limited in the size of the DNA that can be amplified (i.e., the distance between the forward and reverse primers). The most efficient amplification is in the 300-1,000 base pair (bp) range; however, amplification of products up to 4,000 bases (4 kb) has been reported.
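The requirement that primers must not anneal to themselves or to each other can be screened with a simple string check. The sketch below is a simplification: the primer sequences are hypothetical, the four-base threshold is an arbitrary assumption, and real primer-design software relies on thermodynamic models rather than exact matching.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Sequence of the strand that would pair with seq, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def can_anneal_3prime(primer_a, primer_b, min_overlap=4):
    """True if the last min_overlap bases of primer_a can base-pair
    with some stretch of primer_b (a potential primer-dimer)."""
    tail = primer_a[-min_overlap:]
    return reverse_complement(tail) in primer_b

# Hypothetical primers, written 5' to 3'.
clean_primer = "ACCTGACTGGACATC"
self_pairing_primer = "ACGTGGAATTC"   # 3' end pairs with itself
partner_primer = "TTGGATGCCA"         # pairs with the 3' end of clean_primer

print(can_anneal_3prime(clean_primer, clean_primer))                # False
print(can_anneal_3prime(self_pairing_primer, self_pairing_primer))  # True
print(can_anneal_3prime(clean_primer, partner_primer))              # True
```

A primer pair that returns True in either direction would be a candidate for redesign before any reaction is run.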

The most important consideration in PCR, though, is contamination. If the sample being tested has even the smallest amount of contamination from another source of DNA, the reaction could amplify the contaminating DNA and yield a false-positive identification. For example, technicians in a crime lab compare blood samples from suspects to samples taken from a crime scene. If there is any contamination from one sample to the other, the result could be the unfortunate and mistaken conviction of the suspect. Contamination can also occur when a few blood cells stick to the plastic surface of the pipette, and then get ejected into the test sample. For this reason, and many others, modern labs devote tremendous effort to avoiding this problem.

Gel Electrophoresis

Gel electrophoresis is a common laboratory method used to quickly separate and visualize DNA fragments in molecular genetics laboratories. The usual electrophoretic media are agarose or acrylamide gels. These gels are dense matrices through which DNA fragments migrate when exposed to electric current. The technique of electrophoresis is based upon the fact that DNA is negatively charged at neutral pH due to its phosphate backbone. When an electrical potential is applied across the gel, the DNA will slowly move towards the positive pole.

In The Lab 5: Gel Electrophoresis

An agarose gel is a flat slab of jelly-like material that forms a porous lattice, or matrix, through which DNA fragments must migrate in order to move toward the positive pole during electrophoresis. Larger molecules move more slowly than smaller ones, since the smaller molecules meet less resistance in the gel. As a result, a mixture of large and small fragments of DNA can be separated by size.

As discussed previously, agarose gel electrophoresis is used to detect DNA fragment sizes following restriction endonuclease digestion. It is also used to separate digested DNA fragments prior to Southern hybridization using single-stranded DNA probes and as part of the process when generating DNA fingerprints.

Like agarose gel electrophoresis, acrylamide gel electrophoresis separates DNA fragments according to size. Acrylamide gels are somewhat more difficult to work with than agarose gels, primarily because they are usually very thin (0.25-0.5 mm thick), and are often electrophoresed in a vertical orientation. But, because these gels can be used to visualize differences as small as a single nucleotide in length, acrylamide gel electrophoresis is significantly more sensitive for detecting size differences in DNA fragments.


In The Lab 5: Gel Electrophoresis (continued)

Figure 7a-c: Agarose gel electrophoresis is used to visualize DNA. a) Agarose gels are formed using a horizontal mold; b) once the gel has polymerised, DNA samples are loaded into pre-formed wells in the gel; c) after electrophoresis and staining, DNA fragments are visualized over UV using a transilluminator.

Figure 7d: Experimental results from agarose gel electrophoresis are recorded as a photograph for analysis. In this photo, DNA has been amplified using PCR and the amplified fragments show up as bright bands. From left, the order of samples is: negative control with no DNA (-), positive control with DNA known to amplify using the present experimental conditions (+) and two different experimental samples (chimpanzee (C) and gorilla (G)). Sizes of fragments are estimated by comparing results with standardised fragments of known size on the far right of the gel (S). For this standard, the uppermost band is 1100 base pairs (bp) long and the brightest band is 500 bp.


DNA Sequencing

DNA sequencing is the process of determining the exact order of nucleotides that make up a DNA segment. As discussed in the beginning of this module, determining the precise order of the 3 billion nucleotides that make up the DNA of the 24 different human chromosomes has been one of the major aims of the Human Genome Project. However, most scientists have more modest aims that involve studying a single gene, or region of a gene, and this requires the determination of only several hundreds or thousands of nucleotides. For example, the gene that produces hemoglobin in humans is 1,652 nucleotides long and individuals with a single point mutation (T→A at amino acid position #6) will produce sickling hemoglobin. Thus, by determining a person’s DNA sequence for the hemoglobin gene, scientists can tell if the person will produce normal or sickling hemoglobin.

If anthropologists want to estimate relatedness between species using DNA, then they need DNA sequences, since the estimates themselves are based on the number of nucleotide differences between the species being compared. For example, DNA sequences from a non-coding segment of the hemoglobin genes have been used to support the close evolutionary relationship between humans and chimpanzees. When Li et al. (1987) compared 5,300 nucleotides from humans, chimpanzees and gorillas, they found that humans and chimpanzees differed by 77 nucleotides, humans and gorillas differed by 79 nucleotides and chimpanzees and gorillas differed by 83 nucleotides. Therefore, it was concluded, humans and chimpanzees shared a common ancestor more recently than either did with gorillas.
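The reasoning behind this comparison is simple arithmetic, sketched here in Python using only the counts quoted above. The variable names are illustrative.

```python
ALIGNED_LENGTH = 5300  # nucleotides compared, as quoted in the text

# Nucleotide differences reported for each pair of species.
differences = {
    ("human", "chimpanzee"): 77,
    ("human", "gorilla"): 79,
    ("chimpanzee", "gorilla"): 83,
}

percent_difference = {
    pair: 100 * count / ALIGNED_LENGTH for pair, count in differences.items()
}
closest_pair = min(differences, key=differences.get)

for pair, pct in percent_difference.items():
    print(f"{pair[0]} vs {pair[1]}: {pct:.2f}% different")
print("Fewest differences:", " and ".join(closest_pair))
```

The pair with the fewest differences, humans and chimpanzees at roughly 1.45 percent, is inferred to share the most recent common ancestor. The margins between the three pairs are small, which is one reason long sequences are needed.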

In The Lab 6: DNA Sequencing

Several methods can be used to determine the sequence of nucleotides in a purified DNA segment. The Sanger method, developed by Fred Sanger at the University of Cambridge, is the technique of choice today. It involves the controlled interruption of in vitro DNA replication, not unlike the polymerase chain reaction described in an earlier section. It begins with the denaturation of double-stranded DNA and annealing a short single-stranded segment of DNA called a primer. As in PCR, the primer must be complementary to a template DNA sequence in order to anneal and initiate the synthesis of a new single strand of DNA. For the purposes of DNA sequencing, some of the nucleotides will have fluorescent or radioactive particles attached to them so the newly synthesized strand of DNA will also consist of fluorescent or radioactively labeled nucleotides. The newly synthesized sequences are then separated using gel electrophoresis, visualized by laser (for fluorescent particles) or autoradiography (for radioactive particles) and the DNA sequence is read directly. Figure 8 displays an image of a computerized DNA sequence output (also known as an electropherogram).

C G G G C G G T G A C A G A G C T G G G G C G G C C

Figure 8: Electropherograms show DNA sequence results from automated sequencing. This electropherogram depicts 26 nucleotides of a DNA sequence from a human gene called HLA-DRB1 (see BOX 1 for more on HLA).


Repetitive DNA Sequences

You have already learned that a large proportion of the human genome consists of non-coding DNA. In many other species, including non-human primates, the proportion of repeated sequences is equally high. There are two basic classes of repetitive DNA sequences in the human genome: dispersed repeats and clustered repeats. Dispersed repeats are scattered throughout the genome and are characterized as either short or long. Clustered repeats are also widely distributed throughout the genome, but they are generally shorter than dispersed repeats. Clustered repeats fall into three basic categories: satellites, mini-satellites and microsatellites. They are called “satellites” because scientists discovered them by noticing that centrifuged DNA sometimes settled out into two or more layers. The main layer, or band, contained coding sequences, but the others were repeated sequences that researchers named “satellites” (Moxon and Wills, 1999). The further designations (mini- and micro-) refer to the length of the segments.

Dispersed repeats

Short Interspersed Nuclear Elements (SINEs). SINEs are highly abundant in mammalian genomes. In humans, one type of SINE, Alu, is particularly well known. This repetitive sequence is approximately 300 nucleotides long and about 500,000 copies have been identified, with a copy occurring approximately every 5,000-10,000 nucleotides. It is called Alu because scientists initially discovered it using a restriction endonuclease called AluI. AluI cleavage sites flank the repetitive sequence, which is typically a pair of repeats (141 bp each) with sequence similarities to a human RNA gene.

Not all Alu sequences are exactly the same because nucleotide substitutions occur in about 10 percent of the sequence. The differences suggest that Alus have been derived through repeated copying and moving (transposing) of the original RNA gene and of its later copies. Alu repeats are frequently used as natural markers for studying genetic rearrangements that indicate genetic variability and heritable disorders in humans. Similar Alus have been identified in other primate species and these sequences are useful for determining evolutionary relationships between species and also regions of the primate genome. As with DNA-DNA hybridization studies, greater similarity in Alu sequences suggests closer evolutionary relationship between species. Other SINEs, which have been identified in the genomes of primates and other mammalian species, are called MIRs (mammalian-wide interspersed repeats). In primates, an estimated 300,000 copies of MIRs are discernible.

Long Interspersed Nuclear Elements (LINEs). LINEs can range from 14,000 to 61,000 bp in length and more than 50,000 copies have been identified in the human genome using restriction enzymes. LINEs are ubiquitous in the mammalian genome and, thus, are informative for evolutionary studies that involve many different primate species. Unlike SINEs, however, some members of LINE families may be able to copy themselves and, thus, they are sometimes called “jumping genes.”


Clustered repeats

Clustered satellite sequences are usually found in specific regions of chromosomes. For example, some are located near the centromeres and others are found near the ends of chromosome strands. As a rule, these sequences are just long strings of the same short nucleotide sequence repeated over and over. Many molecular geneticists argue that these sequences have structural importance since they do not code for proteins and are found in regions that suggest structural roles. About 10% of human microsatellites are found within genes and a number of human diseases result from abnormally large numbers of triplet repeats within genes. These diseases include Huntington’s disease, myotonic dystrophy and fragile X syndrome. In each case, disease severity and age of onset depend on the length of the triplet repeat. The longer the repeat, the earlier and more severe the disease will be.

Most human chromosomes contain moderately long, tandemly repeated DNA sequences that can be detected using restriction enzymes, Southern hybridization and/or PCR. These variable number tandem repeats (VNTRs), first described in 1985, are also known as minisatellites. As explained earlier, VNTRs underlie the principle of “DNA fingerprinting,” since individuals differ in the number of tandem repeats they possess (generally from one to 30) of a 15-70 bp core sequence. Interestingly, many human minisatellite sequences have been identified in nonhuman primates and the same techniques used to study humans can be applied to the study of nonhuman primates.

Shorter repeated DNA sequences, called microsatellites, have the same basic structure as VNTRs, but the tandem repeat sequences are only 2 to 4 nucleotides long. More than 10,000 microsatellites have been discovered in the human genome and, as a rule, they are rarely more than 300 bp in length.

Most, but not all, microsatellite repeats are found outside the coding regions of the genome and any increase or decrease in repeat number should hypothetically have no major effect on the fitness of an individual. Therefore, microsatellite repeats are highly variable, with as many as 12 to 15 different repeat lengths (alleles) per “locus.” This extraordinary variability, combined with the fact that they can be studied using PCR (therefore requiring little DNA), makes microsatellites very popular genetic markers for disease association studies, forensic investigations and paternity determination.
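Because microsatellite alleles differ in the number of tandem copies of a short motif, an allele can be described simply by counting repeats. The Python sketch below does this for two hypothetical alleles of a CA-repeat locus; the sequences and the function name are invented for illustration.

```python
import re

def longest_tandem_run(sequence, motif):
    """Largest number of back-to-back copies of motif found in sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)

# Hypothetical alleles at one locus: identical flanking sequence,
# different numbers of CA repeats.
allele_1 = "GGT" + "CA" * 12 + "TTG"
allele_2 = "GGT" + "CA" * 15 + "TTG"

print(longest_tandem_run(allele_1, "CA"))  # 12
print(longest_tandem_run(allele_2, "CA"))  # 15
```

In the laboratory the same information is read from fragment length: each extra CA repeat makes the PCR product two nucleotides longer, a difference that acrylamide gels can resolve.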

Identifying individuals with microsatellites

The accurate genetic identification of individuals has, until recently, been very difficult to achieve due to problems related to the limited availability of biological samples in forensic settings. Another problem has been the general lack of adequate variation, between individuals of the same species, needed to identify the genetic uniqueness of individuals. Before molecular genetic techniques were developed, people could be identified according to their particular ABO blood type and a combination of serum proteins. In the 1980s these genetic markers were replaced, to a certain extent, by DNA fingerprinting with minisatellites. These markers exhibited more variability and, therefore, made it more likely that most individuals would have unique genetic profiles.


There were, however, some disadvantages to using minisatellites but most of these problems and limitations were overcome once microsatellites were discovered.

Because microsatellites are so abundant in the human genome, scientists can examine several different, and highly variable, loci to identify a unique combination of microsatellite alleles for any individual. To illustrate the power of these markers, consider the scientist who determines the microsatellite alleles for four different loci, each known to exist on a different chromosome. Since the probability of having a particular combination of alleles at each locus is independent due to the Mendelian Law of Independent Assortment, it is highly unlikely that two individuals will share the exact combination of alleles unless they are related. In fact, the probability of matching 4 loci with allele frequencies of 0.11 to 0.40 will be 1 in 7450 for unrelated individuals and 1 in 1750 for first cousins.
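The multiplication of probabilities across independent loci can be sketched as follows. The allele frequencies here are hypothetical, and the genotype frequencies assume Hardy-Weinberg proportions, which the text does not state. The sketch therefore illustrates the principle and is not meant to reproduce the 1 in 7450 figure.

```python
from math import prod

def genotype_frequency(p, q=None):
    """Expected genotype frequency, assuming Hardy-Weinberg proportions:
    p*p for a homozygote, 2*p*q for a heterozygote."""
    return p * p if q is None else 2 * p * q

def match_probability(genotypes):
    """Chance that an unrelated individual shares every genotype,
    treating the loci as independent."""
    return prod(genotype_frequency(*g) for g in genotypes)

# Hypothetical four-locus profile: allele frequencies at each locus.
# One value means the individual is homozygous at that locus.
profile = [(0.2, 0.1), (0.25,), (0.4, 0.1), (0.1, 0.2)]

probability = match_probability(profile)
print(f"Match probability: about 1 in {round(1 / probability):,}")
```

Adding further independent loci multiplies in more small numbers, which is why a modest panel of markers can distinguish individuals so effectively. Relatives share alleles by descent, so the independence assumption overstates the rarity of a match between them.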

The same type of calculations can be used to assess paternity and determine relatedness. These calculations depend on the degree of heterozygosity for each microsatellite locus and the number of loci that are used for the assessment. As a rule, it is easier to assess paternity when the microsatellite alleles of the mother are known because then it is usually clear which allele the offspring inherited from the father. If it is possible that more than one male may be the father, potential fathers can be excluded if they do not possess the paternal allele. When a number of loci are studied, it is often possible to assess paternity with confidence. This approach is currently being used in paternity studies, not only of humans, but also of nonhuman primates (e.g., chimpanzees and baboons).
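The exclusion logic described here can be written out as a short Python sketch. The genotypes, locus names and function names are all hypothetical, and the sketch ignores mutation and genotyping error, which real paternity analyses must allow for.

```python
def paternal_alleles(child, mother):
    """Alleles the child could have inherited from the father,
    given the mother's genotype at the same locus."""
    first, second = child
    candidates = set()
    if first in mother:
        candidates.add(second)
    if second in mother:
        candidates.add(first)
    return candidates

def is_excluded(child, mother, male):
    """True if the male carries none of the possible paternal alleles."""
    return not (paternal_alleles(child, mother) & set(male))

def excluded_at_any_locus(child, mother, male):
    return any(is_excluded(child[locus], mother[locus], male[locus])
               for locus in child)

# Hypothetical genotypes, alleles named by repeat number.
mother = {"locus1": (12, 15), "locus2": (8, 8)}
child = {"locus1": (12, 18), "locus2": (8, 10)}
male_a = {"locus1": (14, 18), "locus2": (10, 11)}
male_b = {"locus1": (18, 18), "locus2": (9, 11)}

print("Male A excluded:", excluded_at_any_locus(child, mother, male_a))
print("Male B excluded:", excluded_at_any_locus(child, mother, male_b))
```

Male B carries the paternal allele at the first locus but not at the second, which shows why several loci are typed: a single locus often cannot separate the candidates.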

Interestingly, many human microsatellite loci exist in the genomes of other primates. Not surprisingly, a large number of human microsatellites can be used when studying chimpanzees. Indeed, most of the ground-breaking molecular genetic studies of chimpanzees in the wild have identified variation between individuals for paternity assessment using human microsatellites. Rhesus macaques also possess many of the same microsatellites as humans (see Figure 9), making it possible to study diversity and relatedness in wild and captive groups of macaques. In more distantly related species, however, the number of, and variability in, microsatellites shared with humans is limited and it is often necessary to use complex and labor-intensive genome screening techniques to identify new species-specific microsatellites.

Figure 9: Rhesus macaque paternity and relatedness can be determined using human microsatellites. This acrylamide gel shows the pattern of autosomal inheritance of microsatellites in rhesus macaques. The pedigree (at left) shows mothers (●) and fathers (■) at top and offspring below; individuals are numbered 1-7. Note how offspring inherit one allele from each parent according to Mendelian Laws. For example, individual #1 passed one allele to her daughter (#5). Individual #5 is a heterozygote and she passed a different allele to her daughter (#7).


DNA-based Trees and Evolution

Molecular genetics has had a major impact on the way we view the evolutionary relationships of primates. You have learned that phylogenetics is the reconstruction of the evolutionary history of organisms using evidence from the fossil record, embryology, morphological features and even DNA sequences. The basic assumption, when using molecular data, is that the DNA sequences of organisms are a record of the DNA sequences passed down from previous ancestors. As a rule, organisms with more similar DNA sequences share more recent common ancestry than organisms with more dissimilar DNA sequences. The DNA sequences may also be used to calibrate the time since divergence between species, since DNA sequence divergence represents a type of “molecular clock.” To illustrate, Wildman et al. (2004) analyzed mitochondrial DNA (mtDNA) sequences to estimate divergence between Arabian and African hamadryas baboons at approximately 35,000 years ago. Representative DNA sequences from other types of baboons were significantly different from the hamadryas sequences, including Papio and Theropithecus whose divergence has been dated at about 4 million years ago (mya) using fossil evidence.

As mentioned previously, some genes may accumulate mutations more rapidly than others and this may, or may not, be useful for estimating evolutionary relationships. In general, the DNA sequences best suited for reconstructing evolutionary relationships are those that are not dramatically affected by natural selection, in other words “selectively neutral.” Neutral genes are thought to accumulate mutations at a fairly even rate and, therefore, it should be possible to estimate time since divergence. You have learned that some DNA sequences have been used, preferentially, for estimating evolutionary relationships between modern human and nonhuman primate populations (e.g., mitochondrial DNA). Other DNA sequences, such as the beta domain of the hemoglobin gene, are more suitable for comparisons between distantly related organisms.

Whatever types of DNA sequences are compared, the first step is usually to construct an alignment. A sequence alignment is simply a side-by-side comparison of the order of nucleotides for each individual included in a study. Usually these alignments are done in successive rows (see Figure 10a).

The next step is to create a phylogenetic tree based upon calculations of the evolutionary distance between all pairs of sequences in the alignment. Typically, this is also achieved using a computer program. One of the simplest ways to calculate distances is to determine the number of differences per nucleotide of the sequence pair. For example, if we compare 10,000 nucleotides from a non-coding region of the hemoglobin genes in humans and chimpanzees and find 145 differences between the two sequences, the evolutionary distance between human and chimpanzee would be 1.45 differences per 100 nucleotides, or 1.45 percent (see Li et al., 1987). To help put this in perspective, the evolutionary distance between human/chimpanzee and gorilla is 1.54 and the distance between human/chimpanzee and orangutan is 2.96.
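A distance table of this kind can be computed directly from an alignment. The Python sketch below uses a short, hypothetical three-species alignment and counts mismatches per 100 aligned positions. It makes no correction for gaps or for multiple substitutions at the same site, both of which real phylogenetic programs handle.

```python
from itertools import combinations

def percent_distance(seq_a, seq_b):
    """Differences per 100 aligned nucleotides between two sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100 * mismatches / len(seq_a)

# Hypothetical alignment, one row per species (not real sequence data).
alignment = {
    "human":      "ACGTACGTACGTACGTACGT",
    "chimpanzee": "ACGTACGTACGTACGTACGA",
    "gorilla":    "ACGTACGAACGTACGTACTA",
}

distances = {
    (a, b): percent_distance(alignment[a], alignment[b])
    for a, b in combinations(alignment, 2)
}
for (a, b), d in distances.items():
    print(f"{a} vs {b}: {d:.1f}")
```

The resulting pairwise table is the input that distance-based tree-building methods start from.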


As a rule, phylogenetic trees are constructed with a minimum number of steps to get from one sequence to the next. This is called a maximum parsimony tree. When a large number of DNA sequences are used in an analysis, several models may be equally parsimonious. Another important consideration is the identification of a common ancestral sequence for all of the sequences in question. The ancestral sequence is usually represented as the “root” of the phylogenetic tree. In some cases, however, it is not possible to identify the root of a tree. An “unrooted” tree depicts the relationship between groups of sequences, but it does not identify the oldest, or ancestral, sequence.

A number of critical assumptions must be considered when molecular trees are constructed. First, it is assumed that each nucleotide position in an alignment evolves independently of every other nucleotide. Second, it is assumed that the sequences used in the alignment are representative of the organisms. Third, and most importantly, it is assumed that the sequences and the nucleotides compared in the alignment are homologous (i.e., inherited from a common ancestor). If nucleotide sequences are shared between species, but not homologous, the similarity may have arisen because of convergence (i.e., the same nucleotide substitution has occurred at the same position in different evolutionary lineages). Convergence usually occurs because the two species have faced the same selection pressures. Similarity may also exist when homologous genes have been duplicated within one or more species (paralogy) and alignments contain nucleotide sequences from the duplicated genes. For example, the human adult beta globin and chimpanzee beta globin genes are homologous because they have been derived from a common ancestor. However, the two human alpha globin genes are paralogous, because they have been duplicated since the divergence of humans and chimpanzees. Importantly, if nucleotide sequences from different species are not homologous, then the evolutionary inferences drawn from the trees will be inaccurate.

Species trees versus gene trees

Following on from the point about homology and phylogenetic trees is the fact that some trees will depict evolutionary relationships between organisms and other trees will depict evolutionary relationships between genes. The use of highly variable, non-coding regions of the genome can provide a phylogeny that reflects the evolutionary relationship between the representative species that have contributed DNA sequences for the analysis. This is based upon the assumption that the rise of new DNA sequences coincides with speciation events. For some regions of the genome, however, original DNA sequences may persist in two species that have already diverged, while new DNA sequences may arise without any speciation events. In these cases, it is not appropriate to attempt a reconstruction of species relationships using these DNA sequences. Instead, the only meaningful evolutionary relationships that can be described are those that exist between the genes. Gene trees can tell you about the evolutionary history of particular DNA sequences, not the species that possess them (see Figure 11). In some cases, gene trees will disagree with species trees. Nevertheless, if one uses DNA sequences from many different genes and/or non-coding regions of the genome, it is possible that the average branching pattern will better represent the species tree.


How are we related to the Neandertals?

Molecular genetic studies using ancient DNA are extremely difficult, but they have recently been instrumental in helping scientists understand how we are related to the Neandertals. As you have learned, there have been great debates on the evolutionary place of Neandertals relative to Homo sapiens sapiens. Some have suggested that they are our direct ancestors (and are probably a subspecies of Homo sapiens). Others have argued that they are a separate species. Using only the fossil evidence, this issue is very difficult to resolve. However, the use of ancient DNA may offer an answer to this long-standing question.

In 1997, a minute quantity of degraded DNA was extracted from the original Neandertal skeleton discovered in the Neander Valley, Germany. Using PCR and additional molecular genetic techniques, scientists were able to examine the relationship between Neandertals and modern humans in an entirely new way. The mitochondrial genome was chosen for PCR amplification because cells contain many more mitochondria than nuclei, so mitochondrial DNA is far more abundant than nuclear DNA in a sample. The use of mitochondrial DNA was also advantageous because, as explained earlier, the DNA sequence of the entire mitochondrial genome is known. A small segment, just 379 base pairs of the hypervariable control region, was amplified. For scientists studying modern DNA, it would be very easy to PCR such a small fragment, but most of the Neandertal PCR fragments were less than 100 bp long because the DNA was degraded. To produce a complete 379 bp sequence, the scientists used a procedure called cloning and sequencing (see below) to create a series of overlapping fragments. When the complete sequence was compared with DNA sequences of the same region in modern humans, it was discovered that the Neandertal sequence was significantly different from all modern sequences. While the most divergent pair of modern human sequences differed at 24 positions, pairs of modern human-Neandertal sequences differed, on average, at 25.6 positions, with no fewer than 20 differences between the most similar modern-Neandertal pair. Considering the differences, as well as the type and location of the DNA sequence data, scientists concluded that Neandertals were not the sole ancestors of any modern human population (Krings et al., 1997).
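The idea of rebuilding one longer sequence from short overlapping fragments can be sketched in a few lines. This is a toy illustration only: the fragments are cut by hand from the short human sequence shown in Figure 10a, the function names are hypothetical, and the published work relied on many cloned fragments and contamination checks rather than a simple greedy merge.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge_fragments(fragments, min_overlap=3):
    """Join fragments, given in order, by collapsing each suffix/prefix overlap."""
    assembled = fragments[0]
    for frag in fragments[1:]:
        k = overlap_length(assembled, frag, min_overlap)
        if k == 0:
            raise ValueError("fragments do not overlap; cannot join them")
        assembled += frag[k:]
    return assembled


# Toy fragments that tile the 24-base human line of Figure 10a.
fragments = ["CCAAGTATTGAC", "TTGACTTACCCA", "ACCCATCAAC"]
print(merge_fragments(fragments))
```

Requiring a minimum overlap is a design choice: very short matches can occur by chance, so a real assembly would demand longer overlaps and agreement between independent clones.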

Supporting evidence for this conclusion came in 2000, when ancient DNA was extracted from a Neandertal skeleton discovered in the northern Caucasus (Ovchinnikov et al., 2000). The DNA from this second and unrelated Neandertal specimen was PCR amplified to obtain two overlapping fragments of 232 and 256 bp, yielding a total of 345 bp of the same hypervariable region of the mitochondrial genome. The PCR fragments were directly sequenced using the Sanger sequencing method (see above). When compared to a modern human reference sequence, this second Neandertal sequence differed at 22 positions. Comparison with the original Neandertal sequence revealed only 12 differences. Once again, there was evidence to suggest that the Neandertals were not directly ancestral to modern humans. But, like the 1997 study, the 2000 study still cannot exclude with certainty the possibility that anatomically modern humans and Neandertals exchanged some genes. Nuclear DNA could provide a clearer picture of our relationship to Neandertals, but this is thought to be extremely difficult to obtain since ancient DNA from the nucleus would be less abundant, and even more degraded, than mitochondrial DNA. Nevertheless, recent studies have successfully obtained cave bear nuclear DNA from the Vindija Neandertal site (O’Rourke et al., 2000). These successes suggest that nuclear DNA studies of Neandertals may be possible in the future.

So, how do we, modern humans, compare to our closest nonhuman primate relative, the chimpanzee? The scientists studying Neandertal DNA in 1997 had an interesting answer to this question: modern humans and modern chimpanzees differ at about twice as many positions as modern humans and Neandertals (Krings et al., 1997). According to the fossil record, it would have taken approximately 4-5 million years for the modern human-modern chimpanzee differences to arise. Therefore, Krings and colleagues concluded, one could create a “molecular clock” to estimate the divergence of Neandertals and modern humans. Considering the types of changes that might occur in the hypervariable region of the mitochondrial genome, the scientists concluded that modern human and Neandertal mitochondria began to diverge approximately 550,000 to 690,000 years ago. This would be about four times as old as the last common ancestor of all modern humans.
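The logic of a molecular clock can be sketched as a simple proportion: if differences accumulate at a steady rate, a known divergence date can be scaled by the ratio of two genetic distances. The function below is a minimal sketch under that strict linear assumption; its name and the relative distances of 1.0 and 2.0 are illustrative choices based only on the statement that human-chimpanzee differences are about twice the human-Neandertal differences.

```python
def proportional_divergence(d_target, d_calibration, t_calibration_years):
    """Scale a calibration date by the ratio of two distances (strict linear clock)."""
    if d_calibration <= 0:
        raise ValueError("calibration distance must be positive")
    return t_calibration_years * d_target / d_calibration


# Assumption: relative distances only (human-Neandertal = 1.0, human-chimpanzee = 2.0).
for t_calibration in (4_000_000, 5_000_000):
    t = proportional_divergence(1.0, 2.0, t_calibration)
    print(f"calibration {t_calibration:,} years -> {t:,.0f} years")
```

Applied to raw difference counts, this proportion gives a date far older than the 550,000 to 690,000 years quoted above. That estimate took into account the types of changes that occur in the hypervariable region, so the raw proportion is not expected to reproduce it. One reason raw counts can mislead is that a position that has changed more than once is still counted as a single difference.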

Hss  CCAAGTATTGACTTACCCATCAAC
N    -------------C--------G-
Pt   -T--------G-C--TT---T-*-


Figure 10a: A mitochondrial DNA (mtDNA) sequence alignment for three species, human (Hss), Neandertal (N) and chimpanzee (Pt). This small alignment shows a portion of the mtDNA D-Loop. It is conventional to indicate agreement of sequence with a dash (-), to note mismatches with the specific nucleotide (C) and indicate missing nucleotides with an asterisk (*).
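The dash-and-asterisk convention described in the caption can be expanded back into full sequences and used to count pairwise differences. The sketch below is a minimal illustration: the three rows are transcribed from Figure 10a (reading the one stray long dash in the chimpanzee row as a single dash, which is an assumption), and the function names are hypothetical.

```python
REFERENCE = "CCAAGTATTGACTTACCCATCAAC"   # Hss row, used as the reference
ALIGNED_ROWS = {
    "N": "-------------C--------G-",     # Neandertal row
    "Pt": "-T--------G-C--TT---T-*-",    # chimpanzee row (long dash read as "-")
}


def expand(reference, row):
    """Replace each dash in a row with the reference nucleotide at that position."""
    if len(reference) != len(row):
        raise ValueError("row and reference must have the same length")
    return "".join(ref if ch == "-" else ch for ref, ch in zip(reference, row))


def count_differences(seq_a, seq_b):
    """Return (substitutions, gapped_positions) between two expanded sequences."""
    subs = gaps = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            continue
        if "*" in (a, b):
            gaps += 1      # a missing nucleotide in one sequence
        else:
            subs += 1      # two different nucleotides
    return subs, gaps


sequences = {"Hss": REFERENCE}
sequences.update({name: expand(REFERENCE, row) for name, row in ALIGNED_ROWS.items()})

for a, b in (("Hss", "N"), ("Hss", "Pt"), ("N", "Pt")):
    subs, gaps = count_differences(sequences[a], sequences[b])
    print(f"{a} vs {b}: {subs} substitutions, {gaps} gapped position(s)")
```

Counting gaps separately from substitutions is a design choice; studies differ in how they treat missing nucleotides, so the two tallies are kept apart rather than summed.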

Figure 10b: Phylogenetic tree constructed from human, Neandertal and chimpanzee mtDNA sequences.

The study of ancient DNA is fraught with controversy due to serious problems of contamination from modern humans who originally collected the ancient bones used for DNA extraction or laboratory researchers who undertake the molecular genetics experiments. Nevertheless, the studies of ancient Neanderthal DNA have been exemplary. Laboratory controls were used to detect contamination from modern humans at many stages of these studies. Indeed, the 1997 study reports that some contaminating data, identified as modern human, was excluded from the analyses. As techniques become even more sophisticated, and scientists become more aware of the hazards of working with minute quantities of degraded DNA, it is likely that molecular genetics will provide a revolutionary perspective on the relationships between modern humans and their ancestors.


Protein Structure and Function

You have already learned that only about 2% of the human genome consists of exons, or coding DNA sequences. As you also know, the DNA in exons serves as the set of instructions for building proteins. One of the two strands of the DNA double helix is used as a template for the transcription of mRNA in the nucleus. With the help of tRNA and rRNA, the message is ultimately translated into a long string of amino acids to form a protein or part of a protein. The function of any protein depends on its shape and structure.

Protein structure

The sequence of amino acids is formed as a long chain held together by peptide bonds. The primary structure of a protein is the amino acid sequence, which determines the higher levels of structure of the protein and its biological function. The secondary structure describes how stretches of the chain fold locally, and the tertiary structure is the overall three-dimensional shape produced when the whole chain folds on itself. Moreover, some proteins combine with other proteins as subunits of a larger, more complex protein. The quaternary structure is the arrangement of the protein subunits that form the larger functional protein. For example, the protein that carries oxygen through the blood, hemoglobin, is composed of two alpha globins and two beta globins.
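The step from coding DNA to primary structure can be illustrated by reading a sequence three nucleotides at a time. The sketch below uses the standard genetic code written for the DNA coding strand; the function name and the short example sequence are made up for illustration, and real genes require handling of introns, reading frames and ambiguous bases that is omitted here.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna):
    """Translate a coding-strand DNA sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        aa = CODON_TABLE[coding_dna[i:i + 3].upper()]
        if aa == "*":          # stop codon: the chain ends here
            break
        protein.append(aa)
    return "".join(protein)


# A short made-up coding sequence ending in the stop codon TAA.
print(translate("ATGGTGCACCTGACTCCTGAGGAGTAA"))  # prints MVHLTPEE
```

The one-letter output is the primary structure; the higher levels of structure described above depend on how that chain folds and cannot be read directly from the sequence in this way.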

The functional diversity of proteins

Proteins are extremely diverse and complex. They include receptors for recognizing other proteins or chemicals, enzymes for DNA and RNA synthesis and hormones for triggering biological responses. Proteins also form antibodies as part of the immune response and, as you have learned, an antibody combines with an antigen to form a complex that can stimulate an immune reaction. Antibodies perform a number of other tasks as well: some attack microbes and bacterial toxins, others are involved in allergic reactions, and still others are responsible for initiating the destruction of infected cells. Antibodies exhibit a great deal of variability because the genes that code for them vary tremendously.

For many immune response genes, natural selection has favored nucleotide diversity and some of the resulting amino acid changes have had advantageous consequences for individuals (see Box 1). In contrast, some proteins are under strong functional constraints: most amino acid changes have negative, or deleterious, consequences, so natural selection does not generally favor nucleotide substitutions. Often such proteins form important structures for activities such as chromosome formation. For example, histones are DNA-binding proteins that mediate the coiling of DNA during chromosome condensation prior to cell division. If you compare the histone genes of many different primate species, you will discover that there are almost no differences in the nucleotide sequence. In fact, if you compare one of the histone (H4) genes of humans and wheat, only 2 out of 104 amino acids differ. This degree of similarity indicates that the histone genes have been highly conserved throughout the course of evolution because of the importance of histones in chromosome structure and function in all forms of life.

Although technically more demanding, the study of coding DNA sequences offers scientists an opportunity to understand gene function and evolution. The information encoded in exons is translated into a functional protein with a defined three-dimensional shape, and scientists are developing computerized models that predict protein structure and aid in understanding gene function. For example, identifying nucleotide sequences that result in a flawed protein requires knowledge of the correct protein structure, which is essential for determining gene function. Comparisons of coding sequences within a species can help scientists understand the functional importance of nucleotide substitutions.

mRNA Studies

Many scientists are interested in studying the functional diversity of proteins in humans and nonhuman primates. This can be done by extracting mRNA from fresh, nucleated cells and employing almost any one of the molecular techniques previously described for use with DNA. Studies of mRNA are significantly more time- and labor-intensive, since mRNA has an extremely short life span and is found only in minute quantities within nucleated cells. Before any further molecular genetic study can be undertaken, a complementary strand of nucleotides must be synthesized to produce a double-stranded DNA-like molecule. Typically, this procedure is followed by amplification using PCR, as described for DNA. The two steps together are known as reverse transcriptase-polymerase chain reaction (RT-PCR), since the synthesis of the complementary strand requires reverse transcriptase, an enzyme that synthesizes DNA from an RNA template.
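The first step of RT-PCR follows the base-pairing rules: each mRNA nucleotide is paired with its DNA complement, and the new strand runs in the opposite direction. The sketch below shows only that bookkeeping; the function name and the example mRNA are hypothetical, and nothing here models the enzyme, primers or reaction conditions.

```python
# DNA nucleotide that pairs with each mRNA nucleotide.
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """Return the DNA strand complementary to an mRNA, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna.upper()))


print(first_strand_cdna("AUGGUGCAC"))  # prints GTGCACCAT
```

Reversing the sequence before complementing reflects the fact that paired strands are antiparallel, so the complementary strand is conventionally written from its own 5' end.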

BOX 1: MHC genes, immune response and evolution

Humans and nonhuman primates face a huge range of dangerous and rapidly changing pathogens in their natural habitats. Disease-causing organisms usually have short generation times and an ability to adapt quickly to their host. When you consider that these organisms can also cause mortality, they clearly represent powerful agents of evolution. In humans, epidemics such as the bubonic plague, caused by the insect-borne bacterium Yersinia pestis, killed up to 20 million Europeans in the 1300s. Currently, HIV is lowering life expectancy and reversing gains in child survival in east and central Africa, and it is spreading rapidly in South Asia as well.

Most vertebrates cope with these challenges through immune response. Some blood proteins reduce the likelihood of disease, but a much more complex genetic system provides a key barrier to infection by disease-causing organisms. The major histocompatibility complex (MHC) is sometimes considered the center of the immune universe since it consists of many genes directly involved in battling parasitic infections. In humans, the MHC is also known as the human leukocyte antigen (HLA) complex. HLA genes occupy more than 600,000 bases of the entire 3,800,000-base HLA complex on chromosome 6. Using molecular genetic methods, hundreds of different alleles have been identified at some HLA loci. In humans, HLA polymorphism is so great that it is theoretically possible for every single person to possess a genetically different combination of HLA alleles. In recognition of the medical and evolutionary implications, many scientists have been using molecular genetic techniques such as PCR, recombinant DNA cloning and DNA sequencing to study MHC genes in nonhuman primates. Not surprisingly, comparative studies of DNA sequences from chimpanzee and gorilla MHC genes have revealed a remarkable degree of similarity with human DNA sequences. In some cases, MHC alleles may be more similar between two species than within each species. For example, comparisons of DNA sequences from the MHC-DRB1 locus in humans (HLA) and chimpanzees (Patr) show that the human alleles (HLA-DRB1*0302 and HLA-DRB1*0701) have more sequence differences (31 nucleotides) than either has with its chimpanzee counterpart (HLA-DRB1*0302/Patr-DRB1*0305: 13 nucleotides and HLA-DRB1*0701/Patr-DRB1*0702: 2 nucleotides) (see Figure 11). These similarities between humans and chimpanzees indicate that the alleles were inherited by both species from their shared common ancestor (Klein et al., 1993).

Figure 11: Some human and chimp genes are very similar. A gene tree of human (HLA) and Pan troglodytes (Patr) MHC alleles shows that some alleles are more similar between species than within species. (After Klein, Takahata and Ayala, 1993)

MHC diversity could be maintained by natural selection favoring heterozygotes. Molecular genetic studies of HLA nucleotide sequences demonstrate a significantly higher rate of nonsynonymous substitutions in the regions directly involved in immune response. Additionally, there is evidence that the inheritance of particular HLA molecules provides resistance to certain pathogens. In West Africa, for example, certain HLA alleles (HLA-B53 and HLA-DRB*1302) are found in individuals who are resistant to Plasmodium falciparum malaria (Hill et al., 1992). Similar HLA associations have been described for HIV progression and hepatitis B virus resistance.

Since resistance to infectious disease is so important, it should not be surprising that individuals also have behavioral and biological mechanisms for maintaining MHC heterozygosity. Studies of humans and mice have revealed that MHC-based mating preferences for partners with different MHC types preferentially produce MHC heterozygous progeny with higher Darwinian fitness (see Apanius et al., 1997). In mice, MHC genes actually affect an individual’s odor and it has been suggested that mate choice, and even kin recognition, is based upon these odor cues. The ability to discriminate MHC-based odors has also been observed in humans, with some humans capable of recognizing mates and relatives on the basis of olfactory cues. Some scientists have also discovered that women exhibit preferences for odors from males with particular HLA alleles, usually different from their own. In 1995, Claus Wedekind, a zoologist at Bern University in Switzerland, tested women’s responses to sweaty T-shirts. He found that women preferred the scent of T-shirts from men who had the most dissimilar HLA types to their own. Wedekind argued that these results indicate that body odor plays a role in female mate choice.

Currently, studies of how MHC genes influence odor, mate choice and disease resistance are underway in several laboratories. Although complex, the genes of the MHC have the potential to provide us with important insight into evolution, behavior and reproduction in humans and nonhuman primates.

MHC genes are also of great importance in the field of medicine for several reasons. First, they play a major role in mediating tissue transplantation. Successful organ or bone marrow transplantation requires matching of as many HLA alleles as possible, since any difference between donor and recipient can prompt vigorous, and sometimes fatal, immune-mediated rejection of the transplant. To increase the likelihood of HLA matching, relatives are often encouraged to donate bone marrow for transplants. Another reason that MHC genes are important in medicine is that many autoimmune diseases, like diabetes and rheumatoid arthritis, occur more frequently in individuals with certain HLA alleles. For example, insulin-dependent diabetes mellitus occurs more often than expected in individuals with the HLA-DQB1*0302 allele, and about 90% of people with an inflammatory disease of the hips and spine, known as ankylosing spondylitis, have the HLA-B27 allele (see Hill, 2001). Given the relationship between MHC and immune response, it should not be surprising that HLA alleles have a role in predisposition to autoimmune diseases. Nevertheless, it might seem puzzling that some alleles that predispose individuals to autoimmune diseases are common in contemporary populations. Apanius et al. (1997) suggest that these alleles may be maintained because they confer some benefit, such as resistance to infectious diseases, that outweighs the deleterious effects from autoimmunity.

Recombinant DNA Technology and Human Evolution

Almost every day we hear about new breakthroughs in biotechnology and molecular genetics. Cloning is one of those relatively new advances that attracts media attention. We hear about claims that humans are being cloned, but the idea of an exact human replica is preposterous. Human clones, if they were ever successfully created, would be no more replicas of each other than identical twins are. Even identical twins, with the same genetic make-up, still exhibit physical and behavioral differences due to numerous environmental factors (e.g., experiences, education, nutrition). Furthermore, attempts to clone complex organisms such as mice, sheep and even monkeys have had mixed success and it is unlikely that similar experiments would be attempted with humans in the near future. The recent experiments by Korean scientists (see Tamkins, 2004), creating human embryo clones for the production of therapeutic stem cells, apparently do not represent an attempt to produce living children.

Cloning also has an important place in molecular genetic research. In this setting, cloning is also known as recombinant DNA technology (see In The Lab 7).

In The Lab 7: DNA Cloning

Recombinant DNA clones are produced by combining short segments of DNA from one organism (like a human) with DNA from other organisms (such as a bacterium). Typically, the short segment of DNA is generated using PCR and inserted, through physical and chemical means, into a carrier called a plasmid. Plasmids are small circular DNA molecules, not unlike the mitochondrial genome, and they cannot replicate without a host cell. They are a bit like parasites, except they generally do not cause problems for their hosts. Bacterial cells often serve as the host for plasmids in molecular genetics experiments. When the bacterium replicates itself, the foreign DNA and plasmid will also be replicated, or cloned.

In molecular genetics laboratories, cloning with plasmids and bacteria provides researchers with the ability to study short segments of DNA in detail. You can see what bacterial clones look like in Figure 12. Nucleotide sequences can be determined and the function of DNA sequences can be experimentally studied. Recombinant DNA technology has been used to artificially synthesize proteins as well. Human insulin is produced in this way and diabetics benefit from this modern and efficient application of molecular genetics.

Figure 12: Cloning and sequencing is used to identify new genes. This graduate student at the University of Cambridge identifies genetically modified (i.e., recombinant DNA) clones.


Molecular genetic research has also contributed to advances in disease diagnosis and medical treatment. In some cases, the advances relate to our ability to identify individuals predisposed to a disease before symptoms even develop. In other cases, drug therapies have been revolutionized by the advances arising from genome sequence data. Scientists in the field of “structural genomics” use DNA sequence data to understand and create new proteins as discussed above. Many discoveries in this field of research are relevant for drug design and improvement of human health (see Pistoi, 2002).


However, amidst the hope and hype surrounding molecular genetic technology, there are also difficult and, so far, unanswered questions about how this knowledge will be best used. For example, gene therapy is intended to replace damaged genes with healthy ones, often using a disarmed virus to deliver a package of "good" DNA into the patient's cells. When doctors tried to use gene therapy to treat the rare metabolic disease of a 19-year-old male in Philadelphia in 1999, however, the therapy turned out to be fatal. Gene therapy may still be too new and unpredictable to use widely on humans. In February 2001, the scientific journals Nature and Science devoted special sections to studies of the Human Genome and its relevance for human health.

Although progress in the fields of cloning, recombinant DNA technology and structural genomics may be slow, some people are concerned about the ways in which these advances will affect our species and its evolution. The human genome, as we have seen, is subject to mutations and is constantly evolving. Evolution of any kind involves genetic change and, so long as recombinant DNA technology and molecular genetic medical treatment advance knowledge and reduce disability and disease, there is good reason to encourage further responsible research.

Acknowledgements

My thanks to Robert Jurmain and Lynn Kilgore for offering me the opportunity to write this module and for their helpful editorial suggestions. Thanks also to Kristin Abbott, Lynn Kilgore, Robert Jurmain, Simon Middleton, Julie Robson and Jean Wickings for their contributions to the photos in the module and to Emma Wainwright for the cover illustration. Finally, thanks to DKK for helpful suggestions throughout the preparation of this module.

Suggested Discussion Questions

1. What are the key differences between the nuclear and mitochondrial genomes and how can these differences be used to study human variation?

2. How and why is mitochondrial DNA so useful for understanding human evolution?

3. What are the potential problems associated with molecular genetic studies of ancient DNA?

4. What are the potential problems associated with molecular studies of non-invasively collected DNA?

5. How has the development of the polymerase chain reaction (PCR) contributed to the study of human evolution?


6. Does non-coding DNA provide any useful information for identifying individuals? Explain.

7. What are restriction fragment length polymorphisms (RFLPs) and how are they used to identify individuals?

8. What are microsatellites and why are they useful for studying human variation?

9. Why would studies of Y chromosomes and mitochondrial DNA give different results in studies of modern humans?

10. How can studies of the chimpanzee genome contribute to our understanding of human evolution?

Bibliography

Apanius, V., D. Penn, P.R. Slev, L.R. Ruff and W.K. Potts (1997) The nature of selection on the major histocompatibility complex. Critical Reviews in Immunology, 17(2):179-224.

Bradley, B. and L. Vigilant (2002) False alleles derived from microbial DNA pose a potential source of error in microsatellite genotyping of DNA from faeces. Molecular Ecology Notes, 2:602-605.

Graur, D. and W. Martin (2004) Reading the entrails of chickens: molecular timescales of evolution and the illusion of precision. Trends in Genetics, 20(2):80-86.

Hill, A.V. (2001) Immunogenetics and genomics. Lancet, 357(9273):2037-2041.

Hill, A.V., J. Elvin, A.C. Willis, M. Aidoo, C.E. Allsopp, F.M. Gotch, X.M. Gao et al. (1992) Molecular analysis of the association of HLA-B53 and resistance to severe malaria. Nature, 360:434-439.

Klein, J., N. Takahata and F.J. Ayala (1993) MHC polymorphisms and human origins. Scientific American, Dec. 1993:78-83.

Krings, M., A. Stone, R.W. Schmitz, H. Krainitzki, M. Stoneking and S. Paabo (1997) Neandertal DNA sequences and the origin of modern humans. Cell, 90(1):19-30.

Li, W.H., K.H. Wolfe, J. Sourdis and P.M. Sharp (1987) Reconstruction of phylogenetic trees and estimation of divergence times under constant rates of evolution. Cold Spring Harbor Symposium in Quantitative Biology, 52:847-856.

Messier, W. and C.B. Stewart (1997) Episodic adaptive evolution of primate lysozymes. Nature, 385(6612):151-154.


Morin, P.A., K.E. Chambers, C. Boesch and L. Vigilant (2001) Quantitative polymerase chain reaction analysis of DNA from noninvasive samples for accurate microsatellite genotyping of wild chimpanzees (Pan troglodytes verus). Molecular Ecology, 10(7):1835-1844.

Moxon, E.R. and C. Wills (1998) DNA microsatellites: agents of evolution? Scientific American, 280(1):94-99.

O’Rourke, D.H., M.G. Hayes and S.W. Carlyle (2000) Ancient DNA studies in physical anthropology. Annual Reviews of Anthropology, 29:217-242.

Ovchinnikov, I.V., A. Gotherstrom, G.P. Romanova, V.M. Kharitonov, K. Liden and W. Goodwin (2000) Molecular analysis of Neanderthal DNA from the northern Caucasus. Nature, 404(6777):490-493.

Pistoi, S. (2002) Facing your genetic destiny. See www.sciam.com/article.cfm?articleid=00016A09-BE5F-1CDAB4A8809EC5888EEDF

Shaw, J.P., J. Marks, C.C. Shen and C.K. Shen (1989) Anomalous and selective DNA mutations of the Old World monkey alpha-globin genes. Proceedings of the National Academy of Sciences, U S A, 86(4):1312-1316.

Sibley, C.G. and J.E. Ahlquist (1984) The phylogeny of hominoid primates, as indicated by DNA-DNA hybridization. Journal of Molecular Evolution, 20:2-15.

Tamkins, T. (2004) South Koreans create human stem cell line using nuclear transfer. Lancet, 363(9409):623.

Ward, R. and C. Stringer (1997) A molecular handle on the Neanderthals. Nature, 388:225-226.

Wedekind, C., T. Seebeck, F. Bettens and A.J. Paepke (1995) MHC-dependent mate preferences in humans. Proceedings of the Royal Society of London, Biological Sciences, 260(1359):245-249.

Wildman, D.E., M. Uddin, G. Liu, L.I. Grossman and M. Goodman (2003) Implications of natural selection in shaping 99.4% nonsynonymous DNA identity between humans and chimpanzees: enlarging genus Homo. Proceedings of the National Academy of Sciences, U S A, 100(12):7181-7188.

Wildman, D.E., T.J. Bergman, A. Al-Aghbari, K.N. Sterner, T.K. Newman, J.E. Phillips-Conroy, C.J. Jolly and T.R. Disotell (2004) Mitochondrial evidence for the origin of hamadryas baboons. Molecular Phylogenetics and Evolution, 32(1):287-296.


Suggested Readings and Internet Sites

Molecular Anthropology and the Human Genome

Collins, F.S., M. Morgan and A. Patrinos (2003) The Human Genome Project: Lessons from Large-Scale Biology. Science, 300:286-290.

Sources of DNA and Biological Sample Collection

Hofreiter, M., D. Serre, H.N. Poinar, M. Kuch and S. Paabo (2001) Ancient DNA. Nature Reviews, 2:353-359.

DNA Extraction

Cooper, A. and H.N. Poinar (2000) Ancient DNA: Do It Right or Not at All. Science, 289(5482):1139.

Principles and Applications of the Polymerase Chain Reaction (PCR)

Mullis, K.B. (1990) The Unusual Origin of the Polymerase Chain Reaction. Scientific American, April, pp. 36-39.

Repetitive DNA Sequences

Goodwin, W., A. Linacre and P. Vanezis (1999) The use of mitochondrial DNA and short tandem repeat typing in the identification of air crash victims. Electrophoresis, 20:1701-1711.

DNA-based Trees and Evolution

Build a Molecular Clock: The Origin of HIV
www.smccd.net/accounts/case/CPS/400.html

Generation of Phylogenetic Tree based upon DNA sequence analysis
www.bioweb.uwlax.edu/GenWeb/Evol_Pop/Phylogenetics/Exercise/exercise.htm

Protein Structure and Function

Genetic Science Learning Center, University of Utah
www.gslc.genetics.utah.edu/units/basics/

MHC Genes, Immune Response and Evolution

Knapp, L.A. (2002) Evolution and Immunology. Evolutionary Anthropology, 11(S1):140-144.

Knapp, L.A. (in press) The ABCs of MHC. Evolutionary Anthropology.

Recombinant DNA Technology and Human Evolution

Facing Your Genetic Destiny
www.sciam.com/article.cfm?articleid=00016A09-BE5F-1CDAB4A8809EC5888EEDF


Glossary

alleles Alternate forms of a gene. Alleles occur at the same locus on homologous chromosomes and thus govern the same trait. However, because they are different, their action may result in different expressions of that trait. The term is sometimes used synonymously with gene.

amino acids Small molecules that are the components of proteins.

anneal To join together. In molecular genetics, two single-strands of DNA can anneal to form one double-stranded molecule.

antigen Large molecule found on the surface of cells. Several different loci govern various antigens on red and white blood cells. (Foreign antigens provoke an immune response.)

antibody Proteins that are produced by some types of immune cells and that serve as major components of the immune system. Antibodies recognize and attach to foreign antigens on bacteria, viruses, and other pathogens. Then other immune cells destroy the invading organism.

base-pairs (bp) Pairs of nucleotides held together by hydrogen bonds. The base adenine (A) pairs with thymine (T) and guanine (G) pairs with cytosine (C).

chromosomes Discrete structures composed of DNA and protein found only in the nuclei of cells. Chromosomes are only visible under magnification during certain phases of cell division.

clone An organism that is genetically identical to another organism. The term may also be used to refer to genetically identical DNA segments, molecules, and cells.

complementary Referring to the fact that DNA bases form base pairs in a precise manner. For example, adenine can bond only to thymine. These two bases are said to be complementary because one requires the other to form a complete DNA base pair.

cytoplasm The portion of the cell contained within the cell membrane, excluding the nucleus. The cytoplasm consists of a semifluid material and contains numerous structures involved with cell function.

data (sing., datum) Facts from which conclusions can be drawn; scientific information.


deletions/deletion mutations A change in DNA sequence due to the loss of one or more nucleotides.

denaturation The physical separation of a molecule, usually through heat or chemical means. When the hydrogen bonds of double-stranded DNA are broken and two single-strands are formed, the DNA is denatured.

deoxyribonucleic acid (DNA) The double-stranded molecule that contains the genetic code. DNA is a main component of chromosomes.

derived (modified) Referring to characters that are modified from the ancestral condition and thus are diagnostic of particular evolutionary lineages.

domain Region of a protein with distinct structure and characteristic function that is determined by the protein’s tertiary structure. The tertiary structure is the way in which the strings of amino acids of the protein fold with respect to each other.

double-stranded The usual and most stable structure for DNA. Two single-strands of DNA are usually paired together to form one double-stranded molecule. A double-strand forms when nucleotides are paired through hydrogen bonding, with G pairing with C and A pairing with T. Double-stranded DNA is also known as a duplex.
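The pairing rule in this entry (G with C, A with T) can be illustrated with a short Python sketch. The function names and the example sequence are illustrative assumptions, not part of the module; the sketch only handles the four DNA bases.

```python
# Watson-Crick pairing: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base that pairs with each position of the strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand, read in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


example = "GAATTC"
print(complement(example))          # CTTAAG
print(reverse_complement(example))  # GAATTC
```

The example sequence reads the same on both strands, which is typical of the sites recognized by restriction endonucleases.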

duplex Double-stranded DNA molecules.

enzymes Specialized proteins that initiate and direct chemical reactions in the body.

evolution A change in the genetic structure of a population. The term is also frequently used to refer to the appearance of a new species. The modern genetic definition is a change in the frequency of alleles from one generation to the next.

exon Regions of a gene that consist of nucleotides that will be translated into proteins during protein synthesis.

extra-genic DNA DNA sequences that do not code for proteins (i.e., they are not genes).

gel electrophoresis A technique for separating DNA sequences that differ by length or nucleotide sequence.

gene A sequence of DNA bases that specifies the order of amino acids in an entire protein, a portion of a protein, or any functional product. A gene may be made up of hundreds or thousands of DNA bases organized into coding and noncoding segments.


genetics The study of gene structure and action and the patterns of inheritance of traits from parent to offspring. Genetic mechanisms are the underlying foundation for evolutionary change.

genome The entire genetic makeup of an individual or species. In humans, it is estimated that each individual possesses approximately 3 billion DNA nucleotides.

hemoglobin A protein molecule that occurs in red blood cells and binds to oxygen molecules.

heterozygous/heterozygote Having different alleles at the same locus on members of a chromosome pair. Can be contrasted with homozygous/homozygote: having the same allele at the same locus on both members of a chromosome pair.

HLA The human major histocompatibility complex (MHC). There are hundreds of HLA genes, many of which are involved in immune response and disease resistance. HLA genes also provide an immunological marker for genetic self-identity.

homology/homologous Similarity between organisms based on descent from a common ancestor.

Human Genome Project An international effort aimed at sequencing and mapping the entire human genome.

hybridization The creation of double-stranded DNA sequences by allowing single-stranded DNA sequences to anneal.

insertions/insertion mutations A change in DNA sequence due to the addition of one or more nucleotides.

intron Regions of a gene that consist of nucleotides that will not be translated into proteins during protein synthesis. Introns are removed from the protein-coding sequence of a gene during editing of the mRNA.

in vitro Chemical or biological reactions that take place in a test tube.

locus (pl., loci) The position on a chromosome where a given gene occurs. The term is sometimes used interchangeably with gene, but this usage is technically incorrect.

long interspersed nucleotide element (LINE) A type of repetitive DNA sequence that is about 14,000 to 61,000 bp in length. More than 50,000 copies of LINEs have been identified in the human genome. LINEs are found in most mammalian genomes and generally do not code for proteins.


meiosis Cell division in specialized cells in ovaries and testes. Meiosis involves two divisions and results in four daughter cells, each containing only half the original number of chromosomes. These cells can develop into gametes.

messenger RNA (mRNA) A form of RNA that is assembled on a sequence of DNA bases. It carries the DNA code to the ribosome during protein synthesis.

Major Histocompatibility Complex (MHC) A genetic complex of vertebrate genes that provide an immunological marker for genetic self-identity. (Also known as the HLA complex in humans.)

mis-sense mutation A change in DNA sequence that results in an incorrect amino acid or a stop codon.

mitochondria (sing., mitochondrion) Structures contained within the cytoplasm of eukaryotic cells that convert energy, derived from nutrients, into a form that is used by the cell.

mitochondrial DNA (mtDNA) DNA found in the mitochondria; mtDNA is inherited only from the mother.

molecules Structures made up of two or more atoms. Molecules can combine with other molecules to form more complex structures.

mutation A change in DNA. Mutation refers to changes in DNA nucleotides (specifically called point mutations) and also to changes in chromosome number and/or structure.

natural selection The mechanism of evolutionary change first articulated by Charles Darwin; refers to genetic change or changes in the frequencies of certain traits in populations due to differential reproductive success between individuals.

neutral mutation A change in DNA sequence that does not change the amino acid sequence of a gene.

non-coding DNA DNA sequences that do not code for proteins.

nucleated cells Somatic cells that contain a nucleus and, therefore, a copy of the organism’s nuclear genome. Most cells in the body are nucleated. Some cells, like red blood cells, do not contain a nucleus.

nucleotides Basic units of the DNA molecule, composed of a sugar, a phosphate, and one of four DNA bases.

nucleus A structure (organelle) found in all eukaryotic cells. The nucleus contains chromosomes (nuclear DNA).


paralogy/paralogous Homologous due to a recent or past duplication in the same species.

pathogens Substances or microorganisms, such as bacteria, fungi, or viruses, that cause disease.

peptide bonds The chemical bonds that hold individual amino acids together to produce a protein.

phylogenetic tree A chart showing evolutionary relationships as determined by phylogenetic systematics. It contains a time component and implies ancestor-descendant relationships.

point mutation A chemical change in a single base of a DNA sequence.

polymerase An enzyme that is directly involved in the synthesis of new strands of DNA or RNA.

polymerase chain reaction (PCR) A method of producing thousands of copies of a DNA segment using the enzyme DNA polymerase.

polymorphisms Loci with more than one allele. Polymorphisms can be expressed in the phenotype as the result of gene action (as in ABO), or they can exist solely at the DNA level within noncoding regions.

primer A short sequence of single-stranded DNA that is responsible for initiating the synthesis of new strands of DNA or RNA.

probe A single-stranded sequence of DNA that is used to identify complementary sequences of single-stranded DNA. Probes hybridize to their complementary sequence.

protein Three-dimensional molecules that serve a wide variety of functions through an ability to bind to other molecules.
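The polymerase chain reaction (PCR) entry above speaks of producing thousands of copies. A rough sketch of why that takes only a handful of cycles follows, assuming an idealized reaction in which every copy is duplicated in every cycle. The function name and the `efficiency` parameter are illustrative assumptions; real reactions fall short of perfect doubling.

```python
def copies_after(cycles: int, starting_copies: int = 1, efficiency: float = 1.0) -> float:
    """Estimate copy number after a number of PCR cycles.

    efficiency = 1.0 means every copy is duplicated each cycle (ideal case);
    lower values model a reaction in which only some copies are duplicated.
    """
    return starting_copies * (1 + efficiency) ** cycles


for cycles in (10, 20, 30):
    print(cycles, copies_after(cycles))
```

Under the ideal assumption, ten cycles already give 1,024 copies of each starting molecule.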

pseudogene Gene that has acquired a nonsense mutation and is no longer transcribed.

recombinant DNA When genes, or parts of genes, from one species are transferred to somatic cells or gametes of another species.

replicate To duplicate. The DNA molecule is able to make copies of itself.

restriction endonuclease An enzyme that can be used to cut (or restrict) particular DNA sequences. For example, the restriction endonuclease EcoRI will cut the DNA sequence GAATTC. Restriction endonuclease and restriction enzyme are used interchangeably.
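The EcoRI example in the restriction endonuclease entry can be sketched in Python as a search for GAATTC followed by a cut at each site. The glossary gives only the recognition sequence; the cut position used here (after the first G, G^AATTC) is the standard one for EcoRI and is stated as an assumption of this sketch. The function names and the test sequence are hypothetical, and only one strand is modeled.

```python
ECORI_SITE = "GAATTC"  # recognition sequence given in the glossary entry
CUT_OFFSET = 1         # assumption of this sketch: EcoRI cuts after the first G


def find_sites(dna: str, site: str = ECORI_SITE) -> list[int]:
    """Return the start index of every occurrence of the site."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def digest(dna: str, site: str = ECORI_SITE, cut_offset: int = CUT_OFFSET) -> list[str]:
    """Cut one strand at every site and return the fragments in order."""
    fragments = []
    previous_cut = 0
    for position in find_sites(dna, site):
        cut = position + cut_offset
        fragments.append(dna[previous_cut:cut])
        previous_cut = cut
    fragments.append(dna[previous_cut:])
    return fragments


sequence = "ATCGGAATTCTTAGGCGAATTCAA"
print(find_sites(sequence))  # [4, 16]
print(digest(sequence))      # ['ATCGG', 'AATTCTTAGGCG', 'AATTCAA']
```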


restriction fragment length polymorphism (RFLP) Genetic polymorphism that is revealed by the different sizes of fragments generated with a particular restriction endonuclease (such as EcoRI).
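To make the idea of an RFLP concrete, the sketch below compares the fragment sizes produced from two hypothetical alleles, one of which has lost a GAATTC site through a single base change. The sequences, the function name, and the cut position (after the first G of the site) are assumptions made for illustration.

```python
def fragment_lengths(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths, largest first, after cutting at every site."""
    lengths = []
    previous_cut = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        lengths.append(cut - previous_cut)
        previous_cut = cut
        position = dna.find(site, position + 1)
    lengths.append(len(dna) - previous_cut)
    return sorted(lengths, reverse=True)


allele_a = "ATCGGAATTCTTAGGCGAATTCAA"  # two GAATTC sites
allele_b = "ATCGGAATTCTTAGGCGAGTTCAA"  # second site lost (A changed to G)

print(fragment_lengths(allele_a))  # [12, 7, 5]
print(fragment_lengths(allele_b))  # [19, 5]
```

The two alleles have the same total length but give different fragment patterns, which is the difference a gel would reveal.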

reverse-transcriptase An enzyme that is used to synthesize new strands of DNA from an mRNA sequence.

ribonucleic acid (RNA) A single-stranded molecule, similar in structure to DNA. Three forms of RNA are essential to protein synthesis. They are messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA (rRNA).

sequence A string of nucleotides. For genes, the precise order of the nucleotides will determine the amino acids that make up a protein.

short-interspersed nuclear element (SINE) A type of repetitive DNA sequence that is about 300 bp in length. More than 500,000 copies of the Alu SINE have been identified in the human genome. SINEs are found in most mammalian genomes, but the Alu SINE is unique to primates.

sickle-cell anemia A severe inherited hemoglobin disorder that results from inheriting two copies of a mutant allele. This allele results from a single base substitution in the DNA.

single-stranded An unstable, and usually temporary, structure for DNA. Single-strands of DNA usually pair together to form one double-stranded molecule.

tandem Literally, side-by-side. Tandem repeats are DNA sequences that are repeated side-by-side.

transcription The first step in protein synthesis. Transcription is the transfer of genetic information from the DNA template to RNA. It is followed by translation of the RNA into amino acids and then proteins.
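The two steps named in this entry can be sketched in Python. The template strand is written 3' to 5' so that the resulting mRNA reads 5' to 3'. The function names and the example template are illustrative assumptions, and the codon table is deliberately partial: it lists only the three codons that the example needs.

```python
# Pairing used in transcription: RNA carries uracil (U) where DNA has thymine (T).
RNA_PAIRS = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Partial codon table, limited to the codons in the example below.
CODONS = {"AUG": "Met", "CCG": "Pro", "UAA": "STOP"}


def transcribe(template: str) -> str:
    """Build the mRNA that is complementary to a DNA template strand."""
    return "".join(RNA_PAIRS[base] for base in template.upper())


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    amino_acids = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODONS[mrna[i:i + 3]]
        if residue == "STOP":
            break
        amino_acids.append(residue)
    return amino_acids


template = "TACGGCATT"        # hypothetical template strand, written 3' to 5'
mrna = transcribe(template)
print(mrna)                   # AUGCCGUAA
print(translate(mrna))        # ['Met', 'Pro']
```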
