Human Genome 2009

Jennifer Lyon, MS, MLIS Eskind Biomedical Library Vanderbilt University Medical Center [email protected]


Post on 28-Aug-2014


Page 1: Human Genome 2009

Jennifer Lyon, MS, MLIS

Eskind Biomedical Library

Vanderbilt University Medical Center

[email protected]

Page 2: Human Genome 2009

Begun formally in 1990, the U.S. Human Genome Project was a 13-year effort coordinated by the U.S. Department of Energy and the National Institutes of Health. Landmark papers detailing the sequence and analysis of the human genome were published in the February 2001 and April 2003 issues of Nature and Science. Work has continued since then to check, complete, describe, and understand the sequence.

The full sequence is freely available online to anyone with internet access.

Page 3: Human Genome 2009

What is the Human Genome?

A genome is all the DNA contained in an organism or a cell, which includes the chromosomes plus the DNA in mitochondria (and, in plant cells, the DNA in chloroplasts).

The Human Genome contains about 3.1 billion base-pairs of DNA, divided into 22 autosomal (non-sex) chromosomes and 2 sex-determining chromosomes (X and Y), as well as mitochondrial DNA.

See Video: http://www.genome.gov/25520211

Page 4: Human Genome 2009

Genome Assembly

The initial assembly of the base-pair sequence is done at the National Center for Biotechnology Information (NCBI) and is then openly shared.

Three major sites (NCBI, ENSEMBL, and UCSC) annotate (describe) the sequence and present it for public use.

The process used to assemble the contigs and annotate the sequence is complex and continuously being refined. The full NCBI process is described at http://www.ncbi.nlm.nih.gov/genome/guide/build.shtml

Page 5: Human Genome 2009

Summary of Genome Assembly

Data Preparation

Input data includes both finished and draft genomic sequence data from GenBank

Data is screened for contamination by bacterial and viral sequences

Sequence is compared to other genomes, such as the mouse

Repeats are masked

Contig Construction

Clone layout stage

Sequence building stage

Melds of overlapping sequences are formed and ordered

Contigs are placed on a chromosome using sequence overlaps with mapped STS markers and paired BAC-end sequences.
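The "melds of overlapping sequences" step can be pictured with a toy example. The sketch below is not the NCBI build process. It is a minimal greedy overlap merge, assuming error-free fragments on one strand. The function names, the `min_overlap` parameter, and the reads are invented for illustration.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def meld(fragments: list[str], min_overlap: int = 3) -> list[str]:
    """Greedily merge the pair with the largest overlap until no overlaps remain."""
    contigs = list(fragments)
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap_length(a, b, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        # Join the two fragments, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads; real input would be clone sequences from GenBank.
reads = ["ATGCGTAC", "GTACCTGA", "CTGATTGC"]
contigs = meld(reads)
```

Fragments that share no overlap of at least `min_overlap` bases stay as separate contigs. Real pipelines also have to handle sequencing errors, repeats, and both strands, which this sketch ignores.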

Page 6: Human Genome 2009

Genome Annotation

Each of the three major sites has its own process of annotating the sequence – identifying the location of biologically significant elements of the sequence, including:

Genes and mRNA transcripts, plus associated protein function information

Variation sites (Single Nucleotide Polymorphisms, etc.)

Markers of various types

Non-coding RNAs

Repeats

Clones/Contigs (smaller sections created during the sequencing process)

Page 7: Human Genome 2009

Maps and Tracks

Looking at 3.1 billion ATGCs doesn’t mean much to the human eye. Mapping of the biologically important features is extremely important.

Each map of a specific type of biological feature is laid across the sequence like a road map. Maps can also be called ‘tracks’.

Maps vary in their unit of measurement, scale and resolution.

Multiple maps can be simultaneously viewed.
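One way to picture tracks is as lists of named intervals on a shared base-pair coordinate system. Viewing several maps at once then means asking which features from the chosen tracks overlap the visible region. The sketch below is illustrative only. The `Feature` class, the track names, and all coordinates are made up, and real browsers use indexed data structures rather than a linear scan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    track: str  # map/track name, e.g. "genes" or "snps"
    name: str
    start: int  # assumed 1-based, inclusive base-pair coordinate
    end: int    # assumed 1-based, inclusive


def features_in_region(features, start, end, tracks=None):
    """Return features overlapping [start, end], optionally limited to some tracks."""
    hits = [
        f for f in features
        if f.start <= end and f.end >= start
        and (tracks is None or f.track in tracks)
    ]
    return sorted(hits, key=lambda f: (f.start, f.track))


# Invented coordinates on a hypothetical chromosome, for illustration only.
FEATURES = [
    Feature("genes", "geneA", 1_000, 5_000),
    Feature("snps", "snp1", 1_200, 1_200),
    Feature("repeats", "repeat1", 4_800, 5_300),
    Feature("genes", "geneB", 9_000, 12_000),
]

# Show only the gene and SNP tracks for the window 1,100 to 5,000.
visible = features_in_region(FEATURES, 1_100, 5_000, tracks={"genes", "snps"})
```

Leaving `tracks` as `None` returns features from every track, which corresponds to switching all maps on for that window.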

Page 8: Human Genome 2009

Why All These Maps?

Evolution of mapping methods over time

Older maps were created before the Human Genome Project (HGP)

Some maps were created for the HGP to help in the process of reassembling the sequence

Sequence-based maps could only be created after the HGP was completed

Maps produced by different groups or using different methods often show

different types of map objects

different subsets of map objects

Page 9: Human Genome 2009

Types of Maps

cytogenetic maps (using chromosome band numbers)

genetic linkage maps (also called "genetic maps")

physical maps, including:

radiation hybrid maps

clone-based maps (e.g., YAC map)

sequence maps (based on the completed sequence)

Page 10: Human Genome 2009

Cytogenetic Maps

These are the oldest type of maps and use the light and dark bands that result from staining chromosomes with a dye. Dark bands have a higher density of DNA and therefore absorb more stain. These can be viewed under a microscope.

Usually, a probe is hybridized (attached) to the chromosomes and labeled with a fluorescent or radioactive tag. The location is then identified microscopically based on the unique banding pattern of each chromosome.

Page 11: Human Genome 2009

The pattern of bands on each chromosome is unique.

The detail of the banding pattern has increased as microscope power has increased over time.

Each human chromosome has a short arm ("p" for "petit") and a long arm ("q" for "queue"), separated by a centromere. The ends of the chromosome are called telomeres and are labeled ptel and qtel. For example, the notation 7qtel refers to the end of the long arm of chromosome 7.

Page 12: Human Genome 2009

The cytogenetic bands are labeled p1, p2, p3 and q1, q2, q3, etc., counting from the centromere out toward the telomeres. At higher resolutions, sub-bands can be seen within the bands. The sub-bands are also numbered from the centromere out toward the telomere.
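A location written in this notation, such as 11q13.1 or 7qtel, can be split into its parts mechanically. The sketch below is a minimal parser, not a full implementation of cytogenetic nomenclature. Reading "13.1" as region 1, band 3, sub-band 1 follows the usual convention and is an assumption beyond what the slide states. The function name and the returned keys are invented for illustration.

```python
import re

# chromosome (1-22, X, Y), arm (p or q), then either "tel" or region/band/sub-band digits
BAND_PATTERN = re.compile(
    r"^(?P<chrom>[1-9]|1[0-9]|2[0-2]|X|Y)"
    r"(?P<arm>[pq])"
    r"(?:(?P<tel>tel)|(?P<region>\d)(?P<band>\d)?(?:\.(?P<sub>\d+))?)$"
)


def parse_location(text):
    """Split a cytogenetic location like '11q13.1' or '7qtel' into labeled parts."""
    match = BAND_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized cytogenetic location: {text!r}")
    parts = match.groupdict()
    return {
        "chromosome": parts["chrom"],
        "arm": "short" if parts["arm"] == "p" else "long",
        "telomere": parts["tel"] is not None,
        "region": parts["region"],    # first digit after the arm letter
        "band": parts["band"],        # second digit, if present
        "sub_band": parts["sub"],     # digits after the decimal point, if present
    }


location = parse_location("11q13.1")
```

Parts that are absent, such as the band numbers in 7qtel, come back as `None`. Strings that do not fit the pattern raise `ValueError` rather than being guessed at.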

Page 13: Human Genome 2009

Pedigree: a simplified diagram of a family's genealogy that shows family members' relationships to each other and how a particular trait or disease has been inherited.

Page 14: Human Genome 2009

Scale for Genetic Maps

Scale: centiMorgans (cM)

A centiMorgan is a unit of genetic distance that represents a 1% probability of recombination during meiosis.

If two genes are 1 cM apart, there is a 1% chance they will be separated by recombination during meiosis. If two genes are 20 cM apart, there is a 20% chance they will be separated.

One cM is equivalent, on average, to a physical distance of approximately 1 megabase in the human genome. This is just an average because genetic recombination rates vary along different parts of the chromosomes.
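The rules of thumb above can be written as a short calculation. This is a rough sketch only. The function names are invented, the 1 Mb per cM factor is the genome-wide average quoted above and not a local rate, and the 50% cap is an addition beyond the slide, reflecting that unlinked loci assort independently.

```python
def recombination_chance(distance_cm: float) -> float:
    """Approximate chance two loci are separated in one meiosis (1 cM = 1%)."""
    # Linear rule from the slide; capped at 0.5, the value for unlinked loci.
    return min(distance_cm / 100.0, 0.5)


def approx_physical_distance_bp(distance_cm: float, bp_per_cm: int = 1_000_000) -> int:
    """Rough physical distance, assuming the average of about 1 megabase per cM."""
    return round(distance_cm * bp_per_cm)


chance_20cm = recombination_chance(20)            # 0.2, i.e. 20%
distance_20cm = approx_physical_distance_bp(20)   # about 20 million base-pairs
```

Because recombination rates vary along the chromosomes, the physical distance is only an order-of-magnitude estimate. The linear 1%-per-cM rule also becomes less accurate as genetic distance grows.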

Page 15: Human Genome 2009

Physical Maps

These have the highest resolution

The sequence-based maps are the best because they identify locations by exact base-pair coordinates

An older type of physical map is a ‘radiation hybrid map’ that is based on breaking chromosomes with radiation and measuring where markers lie relative to the random breakpoints. RH maps are static.

YAC, BAC or other clone maps identify the position of large cloned chunks of human genomic DNA relative to the complete chromosomes. These were mostly used during the reassembly of the complete sequence.

Page 16: Human Genome 2009

Sequence-Based Maps

These are the result of the Human Genome Project – the ability to identify where biological features are in exact base-pair locations

More and more sequence-based maps are being developed

These are the definitive maps and are replacing the older map types

Scale is always in base-pairs (bp) though you may see kbp (kilobase-pairs = thousands of bps) and Mbp (megabase-pairs = millions of bps)
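As a small worked example of these units, the sketch below converts a length in base-pairs to the most readable unit and computes the length of a region from its coordinates. The function names are invented, and treating coordinates as 1-based and inclusive is an assumption. Some browsers and file formats count from zero, so check the convention of the tool in use.

```python
def region_length(start: int, end: int) -> int:
    """Length in bp of a region, assuming 1-based inclusive coordinates."""
    return end - start + 1


def format_bp(length_bp: int) -> str:
    """Express a length in bp, kbp (thousands of bp), or Mbp (millions of bp)."""
    if length_bp >= 1_000_000:
        return f"{length_bp / 1_000_000:.2f} Mbp"
    if length_bp >= 1_000:
        return f"{length_bp / 1_000:.2f} kbp"
    return f"{length_bp} bp"


# Hypothetical region from position 10,001 to position 11,500.
label = format_bp(region_length(10_001, 11_500))
```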

Page 17: Human Genome 2009

Accessing the Human Genome

NCBI – Map Viewer http://www.ncbi.nlm.nih.gov/projects/mapview/map_search.cgi?taxid=9606

ENSEMBL http://www.ensembl.org/index.html

UCSC Genome Browser http://genome.ucsc.edu/

Page 18: Human Genome 2009

NCBI Map Viewer

Let’s Go Live: http://www.ncbi.nlm.nih.gov/

Page 19: Human Genome 2009

ENSEMBL

Let’s Go Live: http://www.ensembl.org

Page 20: Human Genome 2009

UCSC Genome Browser

Let’s Go Live: http://genome.ucsc.edu/

Page 21: Human Genome 2009

Practice Questions (1)

Find one of the following genes in each of the three genome browsers

Endothelin-1

Cystic Fibrosis Transmembrane Conductance Regulator

ADRB2 (adrenergic, beta-2-, receptor, surface)

Answer these questions about the gene

What is known about the function of the protein this gene produces?

How many exons are in this gene?

Find a human EST cluster. Is it conserved in other organisms?

Are there any known repeat sequences within this gene?

Locate some SNPs (single-nucleotide polymorphisms) in the gene. If there is a coding SNP, identify the nucleotide change for it.

Identify a sequence tagged site (STS) within this gene. What size PCR product is made by the set of primers?

What is the size of this gene in bps?

Page 22: Human Genome 2009

Practice Questions (2)

Name four genes found at chromosomal location 11q13.1

Locate the gene ACTN3. Go to the sequence view level. Locate the mRNA start. Locate the first exon. What are the first five amino acids? How long is the first exon?

Find the human gene for Huntington’s Disease (HD). Display both the human and mouse genes simultaneously. In which chromosomal region for each organism are the homologous HD genes found? How similar/different are the mouse and human genes? Are any of the other genes nearby also possibly homologous between the human and the mouse?

Page 23: Human Genome 2009

Practice Questions (3)

Display non-synonymous SNPs in the BRCA1 gene using the UCSC browser. Color them blue. Link to external data on one of these SNPs.

Can you use ENSEMBL or the NCBI Map Viewer to see only non-synonymous SNPs in the BRCA1 gene, as you just did with UCSC?