The Human Genome Project at UC Santa Cruz Phoenix Eagleshadow November 9, 2004


Page 1: The Human Genome Project  at UC Santa Cruz

The Human Genome Project

at UC Santa Cruz

Phoenix Eagleshadow

November 9, 2004

Page 2: The Human Genome Project  at UC Santa Cruz

The Human Genome Project Began in 1990

• The Mission of the HGP: The quest to understand the human genome and the role it plays in both health and disease.

“The true payoff from the HGP will be the ability to better diagnose, treat, and prevent disease.” --- Francis Collins, Director of the HGP and the National Human Genome Research Institute (NHGRI)

Page 3: The Human Genome Project  at UC Santa Cruz

The genome is our Genetic Blueprint

• Nearly every human cell contains 23 pairs of chromosomes – 1 - 22 and XY or XX

• XY = Male

• XX = Female

• Length of chr 1-22, X, Y together is ~3.2 billion bases (about 2 meters diploid)
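The "about 2 meters" figure can be checked with rough arithmetic. This sketch assumes a spacing of about 0.34 nanometers per base pair, a commonly cited value for DNA that the slide does not state, and treats the diploid genome as two copies of the ~3.2 billion bases.

```python
# Rough check of the "about 2 meters diploid" figure from the slide.
HAPLOID_BASES = 3.2e9      # from the slide: chr 1-22, X, Y together
NM_PER_BASE_PAIR = 0.34    # assumption: typical spacing between base pairs


def dna_length_meters(bases: float, copies: int = 1) -> float:
    """Stretched-out length, in meters, of `copies` sets of `bases` base pairs."""
    return bases * copies * NM_PER_BASE_PAIR * 1e-9  # nanometers to meters


haploid_m = dna_length_meters(HAPLOID_BASES)
diploid_m = dna_length_meters(HAPLOID_BASES, copies=2)
print(f"one copy: {haploid_m:.2f} m, two copies (diploid): {diploid_m:.2f} m")
```

Under these assumptions one copy comes to a little over 1 meter, so two copies land close to the 2 meters quoted.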

Page 4: The Human Genome Project  at UC Santa Cruz

The Genome is Who We Are on the inside!

• Chromosomes consist of DNA

– molecular strings of A, C, G, & T

– base pairs, A-T, C-G

• Genes

– DNA sequences that encode proteins

– less than 3% of human genome

Information coded in DNA
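The pairing rule (A with T, C with G) means that one strand determines the other. A minimal sketch of that idea follows; the function names are illustrative and not taken from any particular library.

```python
# Base-pairing rule from the slide: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base that pairs with each base of `strand`, in the same order."""
    return "".join(PAIR[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the opposite strand read in its own direction (complement, reversed)."""
    return complement(strand)[::-1]


print(complement("GATTACA"))          # CTAATGT
print(reverse_complement("GATTACA"))  # TGTAATC
```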

Page 5: The Human Genome Project  at UC Santa Cruz

CACACTTGCATGTGAGAGCTTCTAATATCTAAATTAATGTTGAATCATTATTCAGAAACAGAGAGCTAACTGTTATCCCATCCTGACTTTATTCTTTATG AGAAAAATACAGTGATTCCAAGTTACCAAGTTAGTGCTGCTTGCTTTATAAATGAAGTAATATTTTAAAAGTTGTGCATAAGTTAAAATTCAGAAATAAAACTTCATCCTAAAACTCTGTGTGTTGCTTTAAATAATCAGAGCATCTGC TACTTAATTTTTTGTGTGTGGGTGCACAATAGATGTTTAATGAGATCCTGTCATCTGTCTGCTTTTTTATTGTAAAACAGGAGGGGTTTTAATACTGGAGGAACAACTGATGTACCTCTGAAAAGAGA AGAGATTAGTTATTAATTGAATTGAGGGTTGTCTTGTCTTAGTAGCTTTTATTCTCTAGGTACTATTTGATTATGATTGTGAAAATAGAATTTATCCCTCATTAAATGTAAAATCAACAGGAGAATAGCAAAAACTTATGAGATAGATGAACGTTGTGTGAGTGGCATGGTTTAATTTGTTTGGAAGAAGCACTTGCCCCAGAAGATACACAATGAAATTCATGTTATTGAGTAGAGTAGTAATACAGTGTGTTCCCTTGTGAAGTTCATAACCAAGAATTTTAGTAGTGGATAGGTAGGCTGAATAACTGACTTCCTATC ATTTTCAGGTTCTGCGTTTGATTTTTTTTACATATTAATTTCTTTGATCCACATTAAGCTCAGTTATGTATTTCCATTTTATAAATGAAAAAAAATAGGCACTTGCAAATGTCAGATCACTTGCCTGTGGTCATTCGGGTAGAGATTTGTGGAGCTAAGTTGGTCTTAATCAAATGTCAAGCTTTTTTTTTTCTTATAAAATATAGGTTTTAATATGAGTTTTAAAATAAAATTAATTAGAAAAAGGCAAATTACTCAATATATATAAGGTATTGCATTTGTAATAGGTAGGTATTTCATTTTCTAGTTATGGTGGGATATTATTCAGACTATAATTCCCAATGAAAAAACTTTAAAAAATGCTAGTGATTGCACACTTAAAACACCTTTTAAAAAGCATTGAGAGCTTATAAAATTTTAATGAGTGATAAAACCAAATTTGAAGAGAAAAGAAGAACCCAGAGAGGTAAGGATATAACCTTACCAGTTGCAATTTGCCGATCTCTACAAATATTAATATTTATTTTGACAGTTTCAGGGTGAATGAGAAAGAAACCAAAACCCAAGACTAGCATATGTTGTCTTCTTAAGGAGCCCTCCCCTAAAAGATTGAGATGACCAAATCTTATACTCTCAGCATAAGGTGAACCAGACAGACCTAAAGCAGTGGTAGCTTGGATCCACTACTTGGGTTTGTGTGTGGCGTGACTCAGGTAATCTCAAGAATTGAACATTTTTTTAAGGTGGTCCTACTCATACACTGCCCAGGTATTAGGGAGAAGCAAATCTGAATGCTTTATAAAAATACCCTAAAGCTAAATCTTACAATATTCTCAAGAACACAGTGAA ACAAGGCAAAATAAGTTAAAATCAACAAAAACAACATGAAACATAATTAGACACACAAAGACTTCAAACATTGGAAAATACCAGAGAAAGATAATAAATATTTTACTCTTTAAAAATTTAGTTAAAAGCTTAAACTAATTGTAGAGAAAA 
AACTATGTTAGTATTATATTGTAGATGAAATAAGCAAAACATTTAAAATACAAATGTGATTACTTAAATTAAATATAATAGATAATTTACCACCAGATTAGATACCATTGAAGGAATAATTAATATACTGAAATACAGGTCAGTAGAATTTTTTTCAATTCAGCATGGAGATGTAAAAAATGAAAATTAATGCAAAAAATAAGGGCACAAAAAGAAATGAGTAATTTTGATCAGAAATGTATTAAAATTAATAAACTGGAAATTTGACATTTAAAAAAAGCATTGTCATCCAAGTAGATGTGTCTATTAAATAGTTGTTCTCATATCCAGTAATGTAATTATTATTCCCTCTCATGCAGTTCAGATTCTGGGGTAATCTTTAGACATCAGTTTTGTCTTTTATATTATTTATTCTGTTTACTACATTTTATTTTGCTAATGATATTTTTAATTTCTGACATTCTGGAGTATTGCTTGTAAAAGGTATTTTTAAAAATACTTTATGGTTATTTTTGTGATTCCTATTCCTCTATGGACACCAAGGCTATTGACATTTTCTTTGGTTTCTTCTGTTACTTCTATTTTCTTAGTGTTTATATCATTTCATAGATAGGATATTCTTTATTTTTTATTTTTATTTAAATATTTGGTGATTCTTGGTTTTCTCAGCCATCTATTGTCAAGTGTTCTTATTAAGCATTATTATTAAATAAAGATTATTTCCTCTAATCACATGAGAATCTTTATTTCCCCCAAGTAATTGAAAATTGCAATGCCATGCTGCCATGTGGTACAGCATGGGTTTGGGCTTGCTTTCTTCTTTTTTTTTTAACTTTTATTTTAGGTTTGGGAGTACCTGTGAAAGTTTGTTATATAGGTAAACTCGTGTCACCAGGGTTTGTTGTACAGATCATTTTGTCACCTAGGTACCAAGTACTCAACAATTATTTTTCCTGCTCCTCTGTCTCCTGTCACCCTCCACTCTCAAGTAGACTCCGGTGTCTGCTGTTCCATTCTTTGTGTCCATGTGTTCTCATAATTTAGTTCCCCACTTGTAAGTGAGAACATGCAGTATTTTCTAGTATTTGGTTTTTTGTTCCTGTGTTAATTTGCCCAGTATAATAGCCTCCAGCTCCATCCATGTTACTGCAAAGAACATGATCTCATTCTTTTTTATAGCTCCATGGTGTCTATATACCACATTTTCTTTATCTAAACTCTTATTGATGAGCATTGAGGTGGATTCTATGTCTTTGCTATTGTGCATATTGCTGCAAGAACATTTGTGTGCATGTGTCTTTATGGTAGAATGATATATTTTCTTCTGGGTATATATGCAGTAATGCGATTGCTGGTTGGAATGGTAGTTCTGCTTTTATCTCTTTGAGGAATTGCCATGCTGCTTTCCACAATAGTTGAACTAACTTACACTCCCACTAACAGTGTGTAAGTGTTTCCTTTTCTCCACAACCTGCCAGCATCTGTTATTTTTTGACATTTTAATAGTAGCCATTTTAACTGGTATGAAATTATATTTCATTGTGGTTTTAATTTGCATTTCTCTAATGATCAGTGATATTGAGTTTGTTTTTTTTCACATGCTTGTTGGCTGCATGTATGTCTTCTTTTAAAAAGTGTCTGTTCATGTACTTTGCCCACATTTTAATGGGGTTGTTTTTCTCTTGTAAATTTGTTTAAATTCCTTATAGGTGCTGGATTTTAGACATTTGTCAGACGCATAGTTTGCAAATAGTTTCTCCCATTCTGTAGGTTGTCTGTTTATTTTGTTAATAGTTTCTTTTGCTATGCAGAAGCTCTTAATAAGTTTAATGAGATCCTGATATGTTAGGCTTTGTGTCCCCACCCAAATCTCATCTTGAATTATATCTCCATAATCACCACATGGAGAGACCAGGTGGAGGTAATTGAATCTGGGGGTGGTTTCACCCATGCTGTTCTTGTGATAGTGAATGAGTTCTCACGAGATCTAATGGTTTTAT
GAGGGGCTCTTCCCAGCTTTGCCTGGTACTTCTCCTTCCTGCCGCTTTGTGAAAAAGGTGCATTGCGTCCCTTTCACCTTCTTCTATAATTGTAAGTTTCCTGAGGCCTTCCCAGCCATGCTGAACTTCAAGTCAATTAAACCTTTTTCTTTATAAATTACTCAGTCTCTGGTGGTTCTTTATAGCAGTGTGAAAATGGACTAATGAAGTTCCCATTTATGAATTTTTGCTTTTGTTGCAATTGCTTTTGACATCTTAGTCATGAAATCCTTGCCTGTTCTAAGTACAGGACGGTATTGCCTAGGTTGTCTTCCAGGGTTTTTCTAATTTTGTGTTTTGCATTTAAGTGTTTAATCCATCTTGAGTTGATTTTTGTATATTGTGTAAGGAAGGGGTCCAGTTTCAATCTTTTGCATATGGCTAGTTAGTTATCCCAGTACCATTTATTGAAAAGACAGTCTTTTCCCCATCGCTCGTTTTTGTCAGTTTTATTGATGATCAGATAATCATAGCTGTGTGGCTTTATTTCTGGGTTCTTTATTCTGTTCTATTGGTTTATGTCCCTGTTTTTGTGCCAGTACCATGCTGTTTTGGTTAACATAGCCCTGTAGTATAGTTTGAGGTCAGATAGCCTGATGCTTCCAGCTTTGTTCTTTTTCTTAAGATTGCCTTGGCTATTTGGCCTCTTTTTTGGTTCCACATGAATTTTAAAACAGTTGTTTCTAGTTTTTGAAGAATGTCATTGGTAGTTTGATAGAAATAGCATTTAATCTGTAAATTGATTTGTGCAGTATGGCCTTTTAATGATATTGATTCTTCCTATCCATGAGCATGATATGTTTTCCATTTTGTTTGTATCCTCTCTGATTTCTTTGTGCAGTGTTTTGTAATTCTCAT TGTAGAGATTTTTCACCTCCCTGGTTAGTTGTATTTTACCCTAGATATTT TATTCTTTTTGTGAAAATTGTGAATGGGATTGCCTTCCTGATTTGACTGC CAGCTTGGTTACTGTTGGTTTATAGAAATGCTAGTGATTTTTGTACATTG ATTTTCTTTCTAAAACTTTGCTGAAGTTTTTTTTATTAGCAGAAGGAGCTTTGGGGCTGAGACTATGGGGTTTTCTAGATATAGAATCATGTCAGCTTCAAATAGGGATAATTTTACTTCCTCTCTTCCTATTTGGATGCCCTTTATTTCTTTCTCTTGCCTGATTACTCTGGCTGGGATTTCCTATGTTGAATAGGAGT CATGAGAGAGGGCATCAAATCTACACATATCAAATACTAACCTTGAATGTCTAGATATTT TATTCTTTTTGTGAAAATTGTGAATGGGAT

5000 bases per page

Page 6: The Human Genome Project  at UC Santa Cruz

How much data make up the human genome?

• 3 pallets with 40 boxes per pallet x 5000 pages per box x 5000 bases per page = 3,000,000,000 bases!

• To get accurate sequence requires 6-fold coverage.

• Now: Shred 18 pallets and reassemble.
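The pallet arithmetic can be written out directly. This sketch uses only the numbers on the slide; the variable names are illustrative.

```python
# Numbers taken from the slide.
PALLETS = 3
BOXES_PER_PALLET = 40
PAGES_PER_BOX = 5000
BASES_PER_PAGE = 5000
COVERAGE = 6  # each base must be read about 6 times for an accurate sequence

genome_bases = PALLETS * BOXES_PER_PALLET * PAGES_PER_BOX * BASES_PER_PAGE
pallets_to_shred = PALLETS * COVERAGE
bases_to_read = genome_bases * COVERAGE

print(f"genome: {genome_bases:,} bases")
print(f"at {COVERAGE}-fold coverage: {pallets_to_shred} pallets, {bases_to_read:,} bases read")
```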

Page 7: The Human Genome Project  at UC Santa Cruz

The Beginning of the Project

• Most of the first 10 years of the project were spent improving the technology to sequence and analyze DNA.

• Scientists all around the world worked to make detailed maps of our chromosomes and sequence model organisms, like worm, fruit fly, and mouse.

Page 8: The Human Genome Project  at UC Santa Cruz

UC Santa Cruz gets Involved

Computational biology (or Bioinformatics) is a research field that uses computers to help solve biological problems

Because of the work Professor David Haussler was doing in the field of computational biology, UC Santa Cruz was invited to participate in the HGP in late 1999.

Page 9: The Human Genome Project  at UC Santa Cruz

The Tech Awards honored the UCSC Genome Bioinformatics Group in 2003!

Page 10: The Human Genome Project  at UC Santa Cruz

The Challenges were Overwhelming

• First there was the Assembly

The DNA sequence is so long that no technology can read it all at once, so it was broken into pieces. There were millions of clones (small sequence fragments).

The assembly process included finding where the pieces overlapped in order to put the draft together.

3,200,000 piece puzzle anyone?
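The idea of finding where pieces overlap can be illustrated with a toy example. This is a minimal greedy sketch, not the method UCSC actually used (the real assembly also relied on clone maps and had to cope with errors and repeats). The three fragments are cut by hand from a 20-base stretch of sequence, and the function names are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b` (0 if shorter than min_len)."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments with the largest overlap until none overlap."""
    pieces = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return pieces  # nothing left to merge
        merged = pieces[best_i] + pieces[best_j][best_size:]
        pieces = [p for k, p in enumerate(pieces) if k not in (best_i, best_j)]
        pieces.append(merged)


# Three overlapping pieces of ACCTTGGCCTGAATCTAGGC, given out of order.
fragments = ["GAATCTAGGC", "ACCTTGGCCT", "GGCCTGAATC"]
print(greedy_assemble(fragments))  # ['ACCTTGGCCTGAATCTAGGC']
```

With millions of fragments, comparing every pair like this would be far too slow, which is part of why the real problem was so hard.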

Page 11: The Human Genome Project  at UC Santa Cruz

The “Working Draft” of the human genome

Freeze of sequence data generated by NCBI

Clone layouts generated by Washington University

Assembly generated by UCSC

[Diagram: sequence (ACCTTGGCCTGAATCTAGGCTTTGCATCCCTAGTCCTGATCG) and clone maps combine into the working draft assembly]

Page 12: The Human Genome Project  at UC Santa Cruz

UCSC put the human genome sequence on the web July 7, 2000

UCSC put the human genome sequence on CD in October 2000, with varying results

Cyber geeks searched for hidden messages, and “GATTACA”
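Searching a sequence for a word like "GATTACA" is simple string matching. The sketch below uses a made-up toy sequence, not real genome data. The estimate at the end assumes every base is equally likely and independent, which real DNA is not, so treat it as a rough guide only.

```python
def find_motif(sequence: str, motif: str) -> list[int]:
    """Return the 0-based start position of every match, including overlapping ones."""
    sequence, motif = sequence.upper(), motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


toy_sequence = "CCGATTACATTGATTACAGG"  # made-up example sequence
print(find_motif(toy_sequence, "GATTACA"))  # [2, 11]

# Under the uniform-random assumption, a given 7-letter word starts at any
# position with probability 1 / 4**7, so it would not be a rare find.
expected_hits = 3_000_000_000 / 4**7
print(f"expected matches in 3 billion random bases: about {expected_hits:,.0f}")
```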

Page 13: The Human Genome Project  at UC Santa Cruz

The Completion of the Human Genome Sequence

• June 2000 White House announcement that the majority of the human genome (80%) had been sequenced (working draft).

• Working draft made available on the web July 2000 at genome.ucsc.edu.

• Publication of 90 percent of the sequence in the February 2001 issue of the journal Nature.

• Completion of 99.99% of the genome as finished sequence in July 2003.

Page 14: The Human Genome Project  at UC Santa Cruz

The Project is not Done…

• Next there is the Annotation: The sequence is like a topographical map; the annotation would add the cities, towns, schools, libraries and coffee shops!

So, where are the genes?

How do genes work?

And, how do scientists use this information for scientific understanding and to benefit us?

Page 15: The Human Genome Project  at UC Santa Cruz

What do genes do anyway?

• We only have ~27,000 genes, so that means that each gene has to do a lot.

• Genes make proteins that make up nearly all we are (muscles, hair, eyes).

• Almost everything that happens in our bodies happens because of proteins (walking, digestion, fighting disease).

Eye Color and Hair Color are determined by genes

Page 16: The Human Genome Project  at UC Santa Cruz

Of Mice and Men: It’s all in the genes

Humans and Mice have about the same number of genes. But we are so different from each other. How is this possible?

One human gene can make many different proteins while a mouse gene can only make a few!

Did you say cheese?

Mmm, Cheese!

Page 17: The Human Genome Project  at UC Santa Cruz

Genes are important

• By selecting different pieces of a gene, your body can make many kinds of proteins. (This process is called alternative splicing.)

• If a gene is “expressed” that means it is turned on and it will make proteins.
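How "selecting different pieces" multiplies the possibilities can be shown with a toy model. This is a deliberate simplification: it assumes the first and last pieces (exons) are always kept and any of the middle ones may be skipped, with order preserved. Real splicing is more constrained, and the exon names are placeholders.

```python
from itertools import combinations


def splice_variants(exons: list[str]) -> list[str]:
    """List every transcript that keeps the first and last exon and any ordered subset of the middle ones."""
    first, *middle, last = exons
    variants = []
    for keep in range(len(middle) + 1):
        for chosen in combinations(middle, keep):  # combinations preserve the original order
            variants.append("-".join([first, *chosen, last]))
    return variants


gene = ["E1", "E2", "E3", "E4"]  # placeholder exon names
for variant in splice_variants(gene):
    print(variant)
# E1-E4
# E1-E2-E4
# E1-E3-E4
# E1-E2-E3-E4
```

In this model each optional exon doubles the count, so a gene with 10 optional exons could in principle yield 1,024 different transcripts.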

Page 18: The Human Genome Project  at UC Santa Cruz

What we’ve learned from our genome so far…

• There is a relatively small number of human genes, fewer than 30,000, but they have a complex architecture that we are only beginning to understand and appreciate.

– We know where 85% of genes are in the sequence.

– We don’t know where the other 15% are because we haven’t seen them “on” (they may only be expressed during fetal development).

– We only know what about 20% of our genes do so far.

• So it is relatively easy to locate genes in the genome, but it is hard to figure out what they do.

Page 19: The Human Genome Project  at UC Santa Cruz

How do scientists find genes?

• The genome is so large that useful information is hard to find.

• Researchers at UCSC decided to make a computational microscope to help scientists search the genome.

• Just as you would use Google to find something on the internet, researchers can use the “UCSC Genome Browser” to find information in the human genome. Explore it at http://genome.ucsc.edu

Page 20: The Human Genome Project  at UC Santa Cruz

The UCSC Genome Browser

Page 21: The Human Genome Project  at UC Santa Cruz

The browser takes you from early maps of the genome . . .

Page 22: The Human Genome Project  at UC Santa Cruz

. . . to a multi-resolution view . . .

Page 23: The Human Genome Project  at UC Santa Cruz

. . . at the gene cluster level . . .

Page 24: The Human Genome Project  at UC Santa Cruz

. . . the single gene level . . .

Page 25: The Human Genome Project  at UC Santa Cruz

. . . the single exon level . . .

Page 26: The Human Genome Project  at UC Santa Cruz

. . . and at the single base level

caggcggactcagtggatctggccagctgtgacttgacaag

caggcggactcagtggatctagccagctgtgacttgacaag
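The two sequences on this slide differ at a single base, which is hard to spot by eye. A short sketch of a position-by-position comparison follows; the function name is illustrative.

```python
def differences(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in a, base in b) for every mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length to compare base by base")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


first = "caggcggactcagtggatctggccagctgtgacttgacaag"
second = "caggcggactcagtggatctagccagctgtgacttgacaag"
print(differences(first, second))  # [(21, 'g', 'a')]
```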

Page 27: The Human Genome Project  at UC Santa Cruz

The Continuing Project

• Finding the complete set of genes and annotating the entire sequence. Annotation is like detailing; scientists annotate sequence by listing what has been learned experimentally and computationally about its function.

• Proteomics is studying the structure and function of groups of proteins. Proteins are really important, but we don’t really understand how they work.

• Comparative Genomics is the process of comparing different genomes in order to better understand what they do and how they work. For example, humans, chimpanzees, and mice are all mammals but are all very different.

Page 28: The Human Genome Project  at UC Santa Cruz

Who works on this stuff anyway?

• Biologists and Chemists understand the physical sciences; they take biology and chemistry classes.

• Computer Scientists program the computers (the same people who make video games!); they take math and computer classes.

• Computer Engineers try to build better, faster, smarter computers; they take math, physics and computer classes.

• Social Scientists try to understand how this new information and technology will impact our lives; they take sociology and philosophy classes.

Page 29: The Human Genome Project  at UC Santa Cruz

UCSC Summer Workshop on Human Genome Research

• Held annually in July

• It’s a free event for students and teachers

• Workshops by faculty and researchers on a wide array of topics

• Tours of our laboratories and kilocluster

• Free breakfast and lunch

• Travel funds are available

• RSVP: 831-459-1702 or [email protected]

Page 30: The Human Genome Project  at UC Santa Cruz

How can I work on this project, or something like it?

• Read about it, online at www.genome.gov, or in Nature, Science, or other scientific magazines.

• Take biology, chemistry, math, physics and English classes in high school.

• OR take classes in biology, bioinformatics, or genetics at your local community college or University Extension.

• Go to college and get a degree in science, engineering, math, or social sciences.

Page 31: The Human Genome Project  at UC Santa Cruz

Bioinformatics Opportunities

• BS (BA)

– Entry-Level: Company, National Laboratory

– Teaching: Private Schools

• MS (MA)

– Research Staff: Company/University, National Laboratory, Research Foundation

– Teaching: Community College, Public Schools

• PhD

– Director/Professor: University, Company, National Laboratory, Research Foundation

• Majors: Bioinformatics, Biochemistry, Biology, Computer Science, Computer Engineering, Mathematics, Ocean Sciences, Physics (Education, Sociology, Philosophy, Psychology, Community Studies)

A research degree in any of these majors will take you far!

Page 32: The Human Genome Project  at UC Santa Cruz

Thank you for letting us come talk to you today and

share what we do!

Bye!

Come to UCSC. Slugs are cool!