grades distribution - ksufac.ksu.edu.sa/sites/default/files/intro_bioinformatics.pdf · grades...
Bioinformatics
Prepared by: Nojood Altwaijry
Office: Building 5, 3rd floor, 197
Grades distribution
Midterm 1: 15 marks
Midterm 2: 15 marks
Lab: 30 marks
Final: 40 marks
References
Bioinformatics and Functional Genomics
Exam dates
Midterm 1: Sunday 5th February (30 Jumada I)
Midterm 2: Sunday 6th March (29 Jumada II)
From 12 to 1 pm
General Objectives
1) Understand the meaning of
bioinformatics.
2) Become a confident user of the available tools and related websites.
What is Bioinformatics?
It is a computational approach to solving biological problems,
i.e. the use of computers for the acquisition, management, and analysis of biological information.
What is Bioinformatics? (cont'd)
It is a multi-disciplinary field including:
computer systems management, networking, database design, computer programming, and molecular biology.
It draws upon the strengths of computer
sciences, mathematics, and information
technology to determine and analyze
genetic information.
Bioinformatics and Other Fields
Diagram: Bioinformatics in relation to Statistics, Mathematics, Molecular Biology, Machine Learning, Data Mining, AI, Software Engineering, Database, and Networks.
In silico areas of bioinformatics
Computational biology
Comparative homology modeling
Phylogenetic analysis
Protein structure prediction
Protein folding prediction
Microarray analysis
Docking approaches and new drug discovery
Applications of Bioinformatics
Molecular, personalized, and preventative medicine.
Gene therapy.
Drug development.
Microbial genome applications.
Climate change studies.
Short History
Timeline: 1933, 1953, 1955, 1965, 1970, 1982, 1997, 2001, 2005, 2011
1933: Electrophoresis
A new technique, electrophoresis, is introduced by Tiselius for separating proteins in solution.
1953: Double helix model for DNA proposed
Watson and Crick propose the double helix model for DNA based on X-ray data obtained by Franklin and Wilkins (Nature, 171: 737-738, 1953).
1955: Insulin
The sequence of the first protein to be analyzed, bovine insulin (51 residues), is announced by Sanger.
1965: Atlas of Protein Sequence and Structure
Margaret Oakley Dayhoff (the "mother and father of bioinformatics") publishes the Atlas of Protein Sequence and Structure, a book collecting all known protein sequences. She also pioneered the use of computers to determine the amino acid sequences of protein molecules.
For context: in the mid-1970s, it would take a laboratory at least two months to sequence 150 nucleotides.
“There is a tremendous amount of information
regarding the evolutionary history and
biochemical function implicit in each sequence
and the number of known sequences is growing
explosively. We feel it is important to collect this
significant information, correlate it into a unified
whole and interpret it”
M.O.Dayhoff to C.Berkley, February 27, 1967
Margaret Dayhoff
T. K. Attwood, A. Gisel, N.-E. Eriksson, and E. Bongcam-Rudloff, “Concepts, Historical Milestones and the
Central Place of Bioinformatics in Modern Biology: A European Perspective,” in Bioinformatics - Trends
and Methodologies, M. A. Mahdavi, Ed. InTech, 2011.
Short History
1970: Needleman-Wunsch
The Needleman-Wunsch algorithm for sequence comparison is published.
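To make the idea of sequence comparison concrete, here is a minimal Python sketch of Needleman-Wunsch global alignment. The scoring values (match +1, mismatch -1, gap -1) and the function name are illustrative assumptions, not taken from the slides; real tools use substitution matrices and more elaborate gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b).

    The scoring parameters are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table row by row.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = needleman_wunsch("ACGT", "AGT")
print(score)
print(top)
print(bottom)
```

The table has (len(a)+1) × (len(b)+1) cells, so time and memory grow with the product of the two sequence lengths.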
1982: GenBank opens
1997: Yeast genome
The genome for E. coli (4.7 Mbp, i.e. 4,700,000 bp) is published.
2001: Human genome (3,000,000,000 bp)
2005: NextGen Sequencing
Next-generation sequencing (NGS), also known as high-throughput
sequencing, is the catch-all term used to describe a number of
different modern sequencing technologies including:
• Illumina (Solexa) sequencing
• Roche 454 sequencing
• Ion Torrent: Proton / PGM sequencing
• SOLiD sequencing
These recent technologies allow us to sequence DNA and RNA much
more quickly and cheaply than the previously used Sanger sequencing,
and as such have revolutionised the study of genomics and molecular
biology.
2011: Third Gen Sequencing
What is sequencing?
Sequencing: determining the order of bases in a given biological sequence.
Unfortunately, we do not have technologies that can take a DNA molecule and determine its sequence from one end to the other. So, in practice, the DNA is chopped into fragments, and the sequenced fragments are put back together.
2001: The human genome (3,000 Mbp) is published, using 1st generation sequencing.
This method results in a read length of ∼800 bases on average, which may be extended to above 1000 bases.
The small amount of DNA that could be processed per unit time (throughput), together with the high cost, meant that sequencing the first human genome took roughly 10 years and three billion dollars.
http://hmg.oxfordjournals.org/content/19/R2/R227.full
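As a rough back-of-the-envelope check on these figures, the sketch below estimates how many ∼800-base reads are needed to cover a 3,000 Mbp genome. The genome size and read length come from the slide; the `coverage` parameter (how many times each base is read on average) and the value 8 used in the second call are assumptions for illustration only.

```python
import math

GENOME_BP = 3_000_000_000  # human genome size from the slide (3,000 Mbp)
READ_LEN = 800             # average first-generation read length from the slide


def reads_needed(genome_bp, read_len, coverage=1.0):
    """Number of reads so that total read bases equal coverage * genome size."""
    return math.ceil(genome_bp * coverage / read_len)


# A single pass over the genome, with no overlap between reads.
print(reads_needed(GENOME_BP, READ_LEN))
# Hypothetical 8-fold coverage, so that neighbouring reads overlap enough to be joined.
print(reads_needed(GENOME_BP, READ_LEN, coverage=8))
```

Even a single pass requires millions of reads, which is why low throughput translated into years of work.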
Speed of DNA Sequencing Technologies
Figure: A graph showing how the speed of DNA sequencing technologies has increased since the early techniques in the 1980s.
Image credit: Genome Research Limited
Source: http://www.yourgenome.org/stories/third-generation-sequencing
Human Genome Project
Construct a high-resolution genetic map of the human genome
Produce physical maps of all chromosomes
Determine genome sequence of human and other model organisms
Develop capabilities (technologies) for collecting, storing, distributing and analyzing data
International Human Genome Sequencing Consortium:
Launched in 1990
20 governmental groups
Public project
Last chromosome sequence published in 2006
Background: involved a small number of individuals; the finished sequence is a mosaic that does not represent any one individual
Craig Venter & Celera Genomics:
Founded in 1998
Sequenced in 3 years
Technology: automation, computers
Had access to the public project's data
Used the shotgun strategy
Background: mostly Craig Venter's DNA
Published in Science (Celera) and Nature (consortium).
International Human Genome
Sequencing Consortium
Approach was conservative and methodical.
First produced a clone-based physical map of the genome.
Broke the genome into chunks of DNA whose position on the chromosome was known from maps, and cloned them into bacteria using bacterial artificial chromosomes (BACs).
Digested the BAC-inserted clonal chunks of DNA into small fragments.
Sequenced the small fragments.
Stitched the fragment sequences together to assemble the sequence of each BAC clone.
Assembled the genome sequence from the BAC clone sequences, using the clone-based physical map.
Sequenced only the euchromatic regions of the genome.
Euchromatin Region
It is a lightly packed form of chromatin (DNA, RNA, and protein) that is enriched in genes, and is often (but not always) under active transcription.
Euchromatin comprises the most active portion of the genome within the cell nucleus. 92% of the human genome is euchromatic.
Celera
Approach using "shotgun sequencing" (no organized map).
Shreds genome randomly into small fragments with no idea of where they are physically located.
Clones and sequences fragments.
Uses computer to stitch together genome by matching overlapping ends of sequenced fragments.
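The "stitching by matching overlapping ends" step can be illustrated with a toy greedy assembler in Python. This is only a sketch under strong assumptions (error-free reads, all from the same strand, no repeats); the function names, the `min_len` threshold, and the example reads are made up for illustration, and this is not Celera's actual assembler.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest such overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave the pieces as separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Made-up fragments of a short made-up sequence, given in shuffled order.
fragments = ["TACGTTAGC", "ATGGCGTA", "GCGTACGT"]
print(greedy_assemble(fragments))
```

Note that the fragments carry no positional information; the order is recovered purely from the overlaps, which is the key difference from the map-based approach.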
Human genome timeline
Genome sequencing driven by technology.
Year 1985: 500 base pairs per day by hand.
Year 1985-86: PCR and automated DNA sequencing.
Year 2000: 1000 bases per second.
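To put these rates in perspective, the following Python sketch computes how long a single pass over a 3,000,000,000 bp genome would take at each rate. It assumes continuous operation and exactly one read per base, which is a simplification: real projects read each base several times.

```python
GENOME_BP = 3_000_000_000   # human genome size used elsewhere in these slides
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_sequence(genome_bp, bases_per_day):
    """Days needed for one pass over the genome at a constant rate."""
    return genome_bp / bases_per_day


manual_days = days_to_sequence(GENOME_BP, 500)                        # 1985, by hand
automated_days = days_to_sequence(GENOME_BP, 1000 * SECONDS_PER_DAY)  # 2000

print(f"By hand (500 bp/day): {manual_days / 365:,.0f} years")
print(f"Automated (1000 bases/s): {automated_days:.1f} days")
```

Under these assumptions the work drops from thousands of years to about a month.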
Coding vs. Noncoding DNA
< 2% of genome actually encodes protein
> 98% of genome is non-coding
More "junk" DNA and fewer genes than expected!
Coding sequences (protein-coding genes)
Figure: Human genes categorized by function of the transcribed proteins, showing the number of encoding genes (source: Wiki).
Access to Information
All public project data on the Internet.
NCBI Website: www.ncbi.nlm.nih.gov.
Human genome database.
Sequence and mapping tools.