Introduction to Bioinformatics
English Courses for Graduate Students
Dr. rer. nat. Jing Gong
Cancer Research Center
Medicine School of Shandong University
2011.9.21
Chapter 2
Databases
Why Do “WE” Need Databases?
What’s that?
gcattacttgatctaatca
ataggatctaatctttactagaacgccttgatctaatca
ttgcaa
This is the entire HIV-1 genome, containing a total of 9,752 nucleotides with only 9 genes.
Human genome: 3 Gbp = 3,000,000,000 bp
At 5,000 bp/page: 600,000 pages
At 600 pages/book: 1,000 books
At 3 cm/book: 1,000 x 3 cm = 30 m of bookshelf
Over 1,000 species: 1,000 x 30 m bookshelves = 200 x 5 layers/bookshelf
[Figure labels: Medical school library, 450,000 volumes; 26.6 m; = 2 x]
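The bookshelf estimate can be checked with a few lines of Python. This is only a sketch of the arithmetic on the slide; the function name `printed_genome` and its parameter names are made up for illustration, and the default values are the figures quoted above.

```python
import math

def printed_genome(genome_bp, bp_per_page=5_000, pages_per_book=600, book_cm=3):
    """Return (pages, books, shelf length in metres) needed to print a genome."""
    pages = math.ceil(genome_bp / bp_per_page)    # pages needed for all bases
    books = math.ceil(pages / pages_per_book)     # books needed for all pages
    shelf_m = books * book_cm / 100               # centimetres converted to metres
    return pages, books, shelf_m

# Human genome: 3 Gbp
pages, books, shelf_m = printed_genome(3_000_000_000)
print(pages, books, shelf_m)
```

Multiplying the shelf length by the number of species gives the total shelf space for a collection of similarly sized genomes.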
All sequenced genomes:
[Figure labels: 10 cm x 14.6 cm; x 1000; 26.6 m; 1 TB]
1 TB = 1,000 GB = 1,000,000 MB = 1,000,000,000 KB = 1,000,000,000,000 B
Collect, access, manage, update
Biological Databases - A biological database is a collection of data that is organized so that its contents can easily be accessed, managed, and updated.
History of Biological Databases
insulin = MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
Frederick Sanger (1918-), Nobel Prize 1958
The first biological database was created shortly after the insulin protein sequence became available in 1956.
Biological databases make it possible to answer today’s biological questions by enabling us to analyze sequences that may have been determined as many as 30 years ago, when the whole technology emerged.
Around the mid-1960s, the first nucleic acid sequence, that of yeast tRNA with 77 bases, was determined by Holley. During this period, the three-dimensional structures of proteins were studied, and the well-known Protein Data Bank was developed as the first protein structure database, with only 10 entries in 1972. It has since grown into a large database with over 75,000 entries.
Robert W. Holley (1922-1993), Nobel Prize 1968
John Cowdery Kendrew (1917-1997), Nobel Prize 1962
Max Ferdinand Perutz (1914-2002), Nobel Prize 1962
In the beginning, protein sequence databases were maintained at individual laboratories. The development of a consolidated, formal database, the Swiss-Prot protein sequence database, was initiated in 1986. It now holds about 530,000 protein sequences from more than 12,000 organisms.
The Los Alamos National Laboratory (USA) established the Los Alamos Sequence Database in 1979, which culminated in 1982 with the creation of the public GenBank. Later, the nucleotide sequence database of the European Molecular Biology Laboratory (EMBL) and the DNA Data Bank of Japan (DDBJ) were created. Today these three databases are the largest and most fundamental biological databases, and together they form the International Nucleotide Sequence Database Collaboration (INSDC).
Classification of Biological Databases
Nucleotide Database
  Primary Nucleotide Database (INSDC)
  Secondary Nucleotide Database
Protein Database
  Protein Sequence DB
    Primary Protein Database (UniProt)
    Secondary Protein Database
  Protein Structure DB
Specific Database
[Figure label: >2000]
Nucleotide Databases
Primary Nucleotide Database
GenBank is produced and maintained by the National Center for Biotechnology Information (NCBI). The NCBI is a part of the National Institutes of Health (NIH) in the United States.
GenBank receives sequences produced in laboratories throughout the world. In the roughly 30 years since its establishment, its data have been accessed and cited by millions of researchers around the world. GenBank continues to grow at an exponential rate, doubling every 18 months. The release produced in 2008 contained over 99 billion nucleotide bases in more than 98 million sequences.
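"Doubling every 18 months" corresponds to the growth formula size(t) = size(0) x 2^(t/18), with t in months. The sketch below only illustrates that formula; the function names are hypothetical, and the result is an extrapolation from the figures quoted on the slide, not actual GenBank statistics.

```python
import math

def projected_size(initial, months, doubling_months=18.0):
    """Size after `months` if the quantity doubles every `doubling_months`."""
    return initial * 2 ** (months / doubling_months)

def months_to_reach(initial, target, doubling_months=18.0):
    """Months needed to grow from `initial` to `target` at the same rate."""
    return doubling_months * math.log2(target / initial)

# Starting from the 99 billion bases quoted for 2008 (extrapolation only):
bases_2008 = 99e9
print(projected_size(bases_2008, 36))                 # two doubling periods later
print(months_to_reach(bases_2008, 10 * bases_2008))   # time for a tenfold increase
```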
The European Molecular Biology Laboratory (EMBL) is supported by 20 European member states and one associate member state, Australia. It consists of five facilities: the main Laboratory in Heidelberg (Germany), and outstations in Hinxton (UK; the European Bioinformatics Institute, EBI), Hamburg (Germany), Grenoble (France), and Monterotondo (Italy).
The EMBL Nucleotide Sequence Database constitutes Europe's primary nucleotide sequence resource. The main sources of DNA and RNA sequences are direct submissions from individual researchers, genome sequencing projects, and patent applications. The current release contains 212 million sequence entries comprising 326 billion nucleotides.
The DNA Data Bank of Japan (DDBJ) is a biological database that collects DNA sequences. It is located at the National Institute of Genetics (NIG) in Shizuoka Prefecture, Japan.
DDBJ began data bank activities in 1986 at NIG and remains the only nucleotide sequence data bank in Asia. Although DDBJ mainly receives its data from Japanese researchers, it accepts data from contributors in any other country. The current release contains 138 million sequence entries and 128 billion bases.
The International Nucleotide Sequence Database Collaboration (INSDC) is a joint effort to collect and disseminate databases containing DNA and RNA sequences.
New and updated data on nucleotide sequences contributed by research teams to each of the three databases are synchronized on a daily basis through continuous interaction between the staff at each of the collaborating organizations.
Nucleotide Databases
Reading into Genes and Genomes
All living organisms can be sorted into one of two groups depending on the fundamental structure of their cells. These two groups are the prokaryotes (organisms lacking a true nucleus) and the eukaryotes (organisms having a true nucleus).
Nucleotide sequences are universal, but the structure of the genes they encode is markedly different between prokaryotes and eukaryotes.
[Figure: archaea, prokaryotic cell, eukaryotic cell]
Besides prokaryotes and eukaryotes, there is a third class of living organisms, the archaea. They are bacteria-like organisms living in extreme conditions. In a bioinformatics context, prokaryotes and archaea are very much the same.
Prokaryotes (archaea) have the following properties in common:
• They are microscopic organisms.
• Their genome is a single, circular DNA molecule.
• Their gene density is approximately one gene per 1,000 base pairs.
• Their genome contains few useless parts (70% is coding for proteins).
• Their genes do not overlap.
• Their genes are transcribed to mRNA right after a control region, called a promoter.
• These mRNAs are collinear with the genome sequence.
• Protein sequences are derived by translating the longest open reading frame (ORF) spanning the gene-transcript sequence.
Reading Frame - a reading frame is a way of breaking a DNA sequence into three-letter codons, which can be translated into amino acids. Each strand can be read in 3 frames, so a double-stranded DNA sequence has 3 + 3 = 6 reading frames.
Start codon: ATG, Met (M)
Stop codons: TAA, TAG, TGA
ORF (Open Reading Frame) - a DNA sequence that contains a start codon but does not contain a stop codon in a given reading frame.
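The six reading frames and the ORF idea can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not a gene finder: the function names are hypothetical, the input is assumed to contain only A, C, G, and T, and each reported stretch runs from an ATG to the first in-frame stop codon, with the stop codon included.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def codons(seq, frame):
    """Split seq into complete three-letter codons starting at offset 0, 1, or 2."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]

def find_orfs(seq, min_codons=2):
    """Return (strand, frame, orf) for every ATG...stop stretch in all six frames."""
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            current = None                      # codons of the ORF being extended
            for codon in codons(s, frame):
                if current is None:
                    if codon == START_CODON:    # open a new ORF at the start codon
                        current = [codon]
                elif codon in STOP_CODONS:      # close the ORF at the first stop
                    current.append(codon)
                    if len(current) >= min_codons:
                        results.append((strand, frame, "".join(current)))
                    current = None
                else:
                    current.append(codon)
    return results

print(find_orfs("CCATGAAATAGGG"))
```

A start codon with no in-frame stop before the end of the sequence is not reported here; a real tool would need a policy for such partial ORFs.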
According to these common properties, database entries describing a coding prokaryotic sequence should include three important features:
• The coordinates of some promoter elements
• The coordinates of the RBS
• The coordinates of the ORF boundaries
Not all genes encode proteins. For some of them, the function is directly carried out by the transcribed RNA molecule, including tRNA, rRNA and a few others.
Eukaryotes have the following properties in common:
• Their genome consists of multiple linear pieces of DNA called chromosomes.
• Their genome size is much bigger than in prokaryotes.
• Their gene density is much lower than that of prokaryotes.
• Their genome is not efficient, containing many useless parts.
• Genes on opposite DNA strands might overlap, although that’s a relatively rare occurrence.
• Their genes are transcribed right after a control region called a promoter, but sequence elements located far away can have a strong influence on this process.
• Gene sequences are not collinear with the final messenger RNA (mRNA) and protein sequences. Only small bits (the exons) are retained in the mature mRNA that encodes the final product.
A few points of comparison between prokaryotes and eukaryotes:

| | Prokaryotes | Eukaryotes |
|---|---|---|
| Genome size | 0.5-91 million bp | 10–670,000 million bp |
| Gene density | One gene / 1,000 bp | One gene / 100,000 bp (human) |
| Coding region content | 70% | 5% |
| Is gene collinear? | yes | no |
| Has mRNA introns? | no | yes |
Nucleotide Databases
Making sense of GenBank entry of a prokaryotic gene: E. coli dUTPase
[Figure: GenBank entry X01714 at http://www.ncbi.nlm.nih.gov/, with regions marked 1, 2, and 3]
LOCUS gives us the locus name, the size of the nucleotide sequence in base pairs, the nature of the molecule, its topology and the last updated date.
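As a rough illustration of how these fields sit on one line, the sketch below splits a LOCUS line on whitespace. The example line is hypothetical (it is not the actual X01714 record), the three-letter division code is a field of the flat-file format that the slide does not mention, and real parsers rely on the official format specification rather than a plain split.

```python
from typing import NamedTuple

class Locus(NamedTuple):
    name: str        # locus name
    length_bp: int   # size of the nucleotide sequence in base pairs
    molecule: str    # nature of the molecule, e.g. DNA
    topology: str    # linear or circular
    division: str    # division code (not discussed on the slide)
    date: str        # date of last update

# Hypothetical line for illustration only, not the real X01714 entry.
EXAMPLE_LINE = "LOCUS       EXAMPLE01               1200 bp    DNA     linear   BCT 01-JAN-2000"

def parse_locus(line):
    """Parse a LOCUS line laid out like EXAMPLE_LINE (simplified whitespace split)."""
    fields = line.split()
    if len(fields) != 8 or fields[0] != "LOCUS" or fields[3] != "bp":
        raise ValueError("unexpected LOCUS line layout: " + line)
    return Locus(fields[1], int(fields[2]), fields[4], fields[5], fields[6], fields[7])

print(parse_locus(EXAMPLE_LINE))
```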
DEFINITION provides a short definition of the gene that corresponds to the entry sequence. Here, it’s the E. coli dut gene. This gene encodes the enzyme dUTPase. The full name of dUTPase is deoxyuridine 5’-triphosphate nucleotidohydrolase (deoxyuridine pyrophosphatase).
ACCESSION lists the accession number - a unique identifier within and across various databases. Here, the accession number is X01714.
VERSION fills you in on synonymous or past ID numbers.
KEYWORDS introduces a list of terms that broadly characterize the entry. You can use these terms as keywords for certain database searches.
SOURCE reveals the common name of the relevant organism to which the sequence belongs.
ORGANISM gives a more complete identification of the organism, complete with its technical taxonomic classification.
REFERENCE introduces a section where the credits for the sequence determination are given (different parts of the sequences can be credited to different authors). The REFERENCE section contains multiple parts: AUTHORS, TITLE, JOURNAL, and PUBMED.
COMMENT contains free-formatted text, such as acknowledgments or information that doesn’t fit in the previous sections.
FEATURES describes the gene regions and the associated biological properties that have been identified in the nucleotide sequence. The entire section is organized by feature keys listed under the FEATURES keyword, such as source, promoter, etc.
source indicates the origin of specific regions of the sequence. This is useful when you want to distinguish cloning vectors from host sequences. In X01714, the whole sequence comes from E. coli genomic DNA.
promoter shows the coordinates of a promoter element. In X01714, a -35 region is indicated from position 286 to 291 in the nucleotide sequence.
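Feature coordinates such as 286 to 291 are 1-based and inclusive, whereas Python slices are 0-based and exclude the end index. The sketch below shows the conversion for simple "start..end" locations only; the function names and the short test sequence are hypothetical, and more complex location forms are not handled.

```python
def parse_location(location):
    """Parse a simple 'start..end' location string into two integers."""
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError("invalid location: " + location)
    return start, end

def feature_length(start, end):
    """Number of bases covered by 1-based, inclusive coordinates."""
    return end - start + 1

def extract_feature(sequence, start, end):
    """Return the bases from position start to position end (1-based, inclusive)."""
    return sequence[start - 1:end]

start, end = parse_location("286..291")   # the -35 region quoted above
print(feature_length(start, end))         # number of bases in that region
print(extract_feature("ACGTACGT", 2, 4))  # toy sequence, for illustration only
```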
misc_feature (miscellaneous feature) indicates the putative location of the transcription start (mRNA synthesis). For X01714, this is from positions 322 to 324.
RBS (Ribosome Binding Site) indicates the location of the last upstream element. For X01714, this is at position 330 to 333.
CDS (coding sequence) introduces a complex section that describes the gene’s open reading frame (ORF).
The first line indicates the coordinates of the ORF from its initial ATG to the last nucleotide of the first stop codon TAA.
Each of the following lines gives the name of a protein product, indicates the reading frame to use(here, 343 is the first base of the first codon), the genetic code to apply, and a number of IDs for the protein sequence.
X01714http://www.ncbi.nlm.nih.gov/
![Page 45: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/45.jpg)
Nucleotide Databases
Making sense of a GenBank entry of a prokaryotic gene: E. coli dUTPase
/translation introduces the conceptual amino-acid sequence of the coding segment. This sequence is a computer translation that uses the coordinates, reading frame, and genetic code indicated in the preceding lines.
X01714 http://www.ncbi.nlm.nih.gov/
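The computer translation described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the procedure GenBank itself runs: the function name `translate_cds` and the toy sequence are hypothetical, and the codon assignments are those of the standard genetic code (NCBI translation table 1; the bacterial table 11 uses the same codon-to-amino-acid assignments and differs only in the permitted start codons).

```python
from itertools import product

# Standard genetic code (NCBI translation table 1).
# Bases are ordered T, C, A, G with the third base varying fastest,
# which is the layout NCBI uses for its translation tables.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(sequence, start, end, codon_start=1):
    """Translate a CDS given 1-based inclusive coordinates.

    start/end play the role of the CDS coordinates in the FEATURES table,
    codon_start the role of the /codon_start qualifier (1, 2, or 3).
    """
    cds = sequence.upper()[start - 1:end]
    cds = cds[codon_start - 1:]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X = unknown codon
        if amino_acid == "*":  # stop codon ends the translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy record (not X01714): the ORF occupies positions 5..16.
toy_record = "ggcc" + "atggctaaataa" + "ttt"
print(translate_cds(toy_record, 5, 16))
```

Comparing such a translation with the /translation qualifier is a quick way to check that the coordinates and reading frame were read correctly.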
![Page 46: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/46.jpg)
Nucleotide Databases
The misc_feature lines point out putative stem-loop structures and repeats. These are potential regulatory elements of the dUTPase gene.
X01714 http://www.ncbi.nlm.nih.gov/
![Page 47: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/47.jpg)
Nucleotide Databases
This entry exhibits an extra putative gene, indicated by an additional RBS (ribosome binding site) element and a second CDS section. GenBank entries containing more than one gene are frequent.
X01714 http://www.ncbi.nlm.nih.gov/
![Page 48: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/48.jpg)
Nucleotide Databases
The last section is the nucleotide sequence section. It starts with the ORIGIN keyword and ends with the end-of-entry line, marked by two slashes (//). Each sequence line starts with the position number of its first nucleotide and contains 60 nucleotides (the last line may be shorter).
X01714 http://www.ncbi.nlm.nih.gov/
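The layout of the sequence section makes it easy to read programmatically. The following is a minimal sketch, assuming the entry is available as plain text; the function name `read_origin_section` is hypothetical and the toy entry is a short made-up excerpt, not the full X01714 record.

```python
def read_origin_section(entry_text):
    """Return the nucleotides found between ORIGIN and // as one string."""
    sequence_parts = []
    in_origin = False
    for line in entry_text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
            continue
        if line.startswith("//"):  # end-of-entry line
            break
        if in_origin:
            fields = line.split()
            # fields[0] is the position number of the first nucleotide
            # on the line; the remaining fields are blocks of bases.
            sequence_parts.extend(fields[1:])
    return "".join(sequence_parts)


# Toy excerpt in GenBank layout (hypothetical, for illustration only).
toy_entry = """ORIGIN
        1 cagagaaaat caaaaagcag gccacgcagg accccgatat cgtcgcaggc gttgccgcac
       61 ttgccgccga aacaaataat
//
"""
sequence = read_origin_section(toy_entry)
print(len(sequence), sequence[:10])
```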
![Page 49: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/49.jpg)
Nucleotide Databases
Making sense of a GenBank entry of a prokaryotic gene: E. coli dUTPase
1. Way
2. Way
X01714 http://www.ncbi.nlm.nih.gov/
![Page 50: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/50.jpg)
Nucleotide Databases
Making sense of a GenBank entry of a eukaryotic mRNA: human dUTPase
U90223 http://www.ncbi.nlm.nih.gov/
![Page 51: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/51.jpg)
Nucleotide Databases
Making sense of a GenBank entry of a eukaryotic mRNA: human dUTPase
U90223 http://www.ncbi.nlm.nih.gov/
![Page 52: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/52.jpg)
Nucleotide Databases
Making sense of a GenBank entry of a eukaryotic mRNA: human dUTPase
A common problem in sequence databases: annotations may be incomplete.
A word to the wise: never expect GenBank (or any sequence database) annotations to be up to date.
U90223 http://www.ncbi.nlm.nih.gov/
![Page 53: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/53.jpg)
Nucleotide Databases
Making sense of a GenBank entry of a eukaryotic mRNA: human dUTPase
In the FEATURES section, the CDS keyword indicates a coding region (63-821) that corresponds to the mitochondrial form of human dUTPase, followed by the conceptual amino-acid translation of the ORF.
U90223 http://www.ncbi.nlm.nih.gov/
![Page 54: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/54.jpg)
Nucleotide Databases
Making sense of a GenBank entry of a eukaryotic mRNA: human dUTPase
The sig_peptide keyword indicates the location of a mitochondrial targeting sequence, and the mat_peptide keyword provides the exact boundaries of the mature peptide.
U90223 http://www.ncbi.nlm.nih.gov/
![Page 55: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/55.jpg)
Nucleotide Databases
Making sense of a eukaryotic genomic GenBank entry
AF018430 http://www.ncbi.nlm.nih.gov/
![Page 56: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/56.jpg)
Nucleotide Databases
Making sense of a eukaryotic genomic GenBank entry
This specifies that the entry encompasses exon 3 of the gene.
AF018430 http://www.ncbi.nlm.nih.gov/
![Page 57: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/57.jpg)
Nucleotide Databases
Making sense of a eukaryotic genomic GenBank entry
SEGMENT indicates that the current GenBank entry is the second segment of a super entry made of four. You need all four entries to reconstruct the complete mRNA sequence used as a template for producing the protein.
AF018430 http://www.ncbi.nlm.nih.gov/
![Page 58: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/58.jpg)
Nucleotide Databases
Making sense of a eukaryotic genomic GenBank entry
The /map qualifier in the source section indicates that the sequence belongs to chromosome 15 and, more precisely, that it was mapped to the long arm (q) of this chromosome, within the q21.1 cytogenetic band.
AF018430 http://www.ncbi.nlm.nih.gov/
![Page 59: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/59.jpg)
Nucleotide Databases
Making sense of a eukaryotic genomic GenBank entry
The gene keyword introduces complex-looking formulas. Their purpose is to describe precisely how to reconstruct the various mRNAs spread over several separate entries.
The < at the beginning of a formula indicates that the gene might actually start before the indicated position; the > at the end indicates that the gene might actually continue beyond the indicated position.
AF018430 http://www.ncbi.nlm.nih.gov/
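The meaning of the < and > markers can be made concrete with a small parser. This is a sketch under stated assumptions: it handles only the simplest location form (`start..end` with optional partial markers), the function name `parse_simple_location` is hypothetical, and the coordinates in the example are made up rather than taken from AF018430. The join(...) formulas that span several entries are deliberately not handled.

```python
import re

# Matches simple locations such as "63..821" or "<1..>282".
LOCATION_PATTERN = re.compile(r"^(<?)(\d+)\.\.(>?)(\d+)$")


def parse_simple_location(location):
    """Split a simple feature location into coordinates and partial flags."""
    match = LOCATION_PATTERN.match(location)
    if match is None:
        raise ValueError(f"unsupported location: {location!r}")
    start_marker, start, end_marker, end = match.groups()
    return {
        "start": int(start),
        "end": int(end),
        # True when the feature may begin before the first listed base
        "start_partial": start_marker == "<",
        # True when the feature may continue past the last listed base
        "end_partial": end_marker == ">",
    }


print(parse_simple_location("<1..>282"))
```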
![Page 60: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/60.jpg)
Nucleotide Databases
Making sense of a eukaryotic genomic GenBank entry
AF018430 http://www.ncbi.nlm.nih.gov/
![Page 61: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/61.jpg)
Nucleotide Databases
Making sense of a eukaryotic genomic GenBank entry
AF018430 http://www.ncbi.nlm.nih.gov/
![Page 62: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/62.jpg)
Nucleotide Databases
Making sense of a eukaryotic genomic GenBank entry
The exon keyword indicates the position of the sole exon present in this sequence.
AF018430 http://www.ncbi.nlm.nih.gov/
![Page 63: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/63.jpg)
Nucleotide Databases
Using a Gene-Centric Database
DUT [gene] human [organism] http://www.ncbi.nlm.nih.gov/
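The same kind of query can also be sent to NCBI programmatically through the E-utilities ESearch endpoint. The sketch below only builds the request URL and performs no network access; the function name `build_gene_search_url` is hypothetical, and writing the query as `DUT[gene] AND human[organism]` is an assumption about how the search typed on the slide maps onto Entrez field tags.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gene_search_url(symbol, organism):
    """Build an ESearch URL that looks up a gene symbol in the Gene database."""
    term = f"{symbol}[gene] AND {organism}[organism]"
    # urlencode percent-encodes the brackets and turns spaces into "+".
    return ESEARCH_URL + "?" + urlencode({"db": "gene", "term": term})


url = build_gene_search_url("DUT", "human")
print(url)
```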
![Page 64: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/64.jpg)
Nucleotide Databases
Using a Gene-Centric Database
The top of the entry provides a general description of what this gene is all about and what function its products are known to perform, as well as a large variety of links to other databases or NCBI files.
DUT [gene] human [organism] http://www.ncbi.nlm.nih.gov/
![Page 65: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/65.jpg)
Nucleotide Databases
Using a Gene-Centric Database
A schematic view of the human DUT gene structure
DUT [gene] human [organism] http://www.ncbi.nlm.nih.gov/
![Page 66: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/66.jpg)
Nucleotide Databases
Using a Gene-Centric Database
Other sections provide information on potential interactions with other gene products, protein functions, a list of all corresponding sequence entries in GenBank, and a large variety of links to other databases or NCBI files.
DUT [gene] human [organism] http://www.ncbi.nlm.nih.gov/
![Page 67: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/67.jpg)
Nucleotide Databases
Working with complete viral genomes: HIV-1
HIV-1 http://www.ncbi.nlm.nih.gov/
![Page 68: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/68.jpg)
Nucleotide Databases
Working with complete viral genomes: HIV-1
HIV-1 http://www.ncbi.nlm.nih.gov/
![Page 69: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/69.jpg)
Nucleotide Databases
Working with complete viral genomes: HIV-1
HIV-1 http://www.ncbi.nlm.nih.gov/
![Page 70: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/70.jpg)
Nucleotide Databases
Working with complete viral genomes: HIV-1
HIV-1 http://www.ncbi.nlm.nih.gov/
![Page 71: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/71.jpg)
Nucleotide Databases
Working with complete viral genomes: HIV-1
HIV-1 http://www.ncbi.nlm.nih.gov/
![Page 72: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/72.jpg)
Nucleotide Databases
Working with complete viral genomes: HIV-1
Global summary of the HIV-1 genome
This clickable picture indicates the identity and respective positions of all the genes.
HIV-1 http://www.ncbi.nlm.nih.gov/
![Page 73: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/73.jpg)
Nucleotide Databases
Working with complete viral genomes: HIV-1
A live map allows you to zoom in or out on any genome region, down to the nucleotide-sequence level.
In viruses, the same nucleotide sequence commonly encodes two different amino-acid sequences (overlapping genes read in different reading frames).
HIV-1 http://www.ncbi.nlm.nih.gov/
![Page 74: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/74.jpg)
Nucleotide Databases
Working with complete viral genomes: HIV-1
HIV-1 http://www.ncbi.nlm.nih.gov/
![Page 75: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/75.jpg)
Nucleotide Databases
Working with complete viral genomes: HIV-1
The Protein List page: you can retrieve the DNA and protein sequences in different formats, such as GenBank format or FASTA format.
HIV-1 http://www.ncbi.nlm.nih.gov/
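For comparison with the GenBank layout, a FASTA record is just a header line starting with ">" followed by the sequence. The sketch below shows one way to write such a record; the function name `format_fasta`, the identifier, and the toy sequence are hypothetical, and the 60-character line width is a common convention rather than a requirement of the format.

```python
def format_fasta(identifier, description, sequence, width=60):
    """Return a FASTA record: a '>' header line, then wrapped sequence lines."""
    header = f">{identifier} {description}".rstrip()
    residues = "".join(sequence.split())  # drop spaces and line breaks
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    return "\n".join([header] + lines) + "\n"


# Hypothetical 72-nucleotide toy sequence, wrapped at 60 characters.
record = format_fasta("toy1", "hypothetical example record", "atggctaaataa" * 6)
print(record, end="")
```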
![Page 76: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/76.jpg)
Nucleotide Databases
Working with complete bacterial genomes at TIGR
The Institute for Genomic Research (TIGR) is home to a team of scientists who pioneered the field of bacterial genomics.
TIGR was founded in 1992 by Craig Venter and is now part of the J. Craig Venter Institute.
In 1995, TIGR scientists produced the first two complete bacterial genomes. Since then, they have contributed to more than 700 complete bacterial genomes, with more on the way.
TIGR offers a site that complements the NCBI resource because it keeps track of all ongoing bacterial genome sequencing projects, not only the completed ones.
TIGR home page: http://www.tigr.org/ (now http://www.jcvi.org/)
![Page 77: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/77.jpg)
Nucleotide Databases
Working with complete bacterial genomes at TIGR
http://www.tigr.org/
![Page 78: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/78.jpg)
Nucleotide Databases
Working with complete bacterial genomes at TIGR
http://www.tigr.org/
![Page 79: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/79.jpg)
Nucleotide Databases
Working with complete bacterial genomes at TIGR
http://cmr.tigr.org/
![Page 80: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/80.jpg)
Nucleotide Databases
Working with complete bacterial genomes at TIGR
http://cmr.tigr.org/
![Page 81: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/81.jpg)
Nucleotide Databases
Working with complete bacterial genomes at TIGR
http://cmr.tigr.org/
![Page 82: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/82.jpg)
Nucleotide Databases
Microbes from the environment at DoE
The U.S. Department of Energy (DoE) is also a major player in microbial genomics. Its Joint Genome Institute specializes in the study of organisms that either
(a) are important for preserving our environment, or
(b) offer new perspectives for solving the coming worldwide energy crisis (such as cheap ways of producing hydrogen).
DoE home page: http://img.jgi.doe.gov/
![Page 83: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/83.jpg)
Nucleotide Databases
Microbes from the environment at DoE
http://img.jgi.doe.gov/
![Page 84: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/84.jpg)
Nucleotide Databases
Microbes from the environment at DoE
http://img.jgi.doe.gov/
![Page 85: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/85.jpg)
Nucleotide Databases
Microbes from the environment at DoE
http://img.jgi.doe.gov/
![Page 86: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/86.jpg)
Nucleotide Databases
http://img.jgi.doe.gov/
![Page 87: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/87.jpg)
Nucleotide Databases
Microbes from the environment at DoE
http://img.jgi.doe.gov/
![Page 88: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/88.jpg)
Nucleotide Databases
Microbes from the environment at DoE
http://img.jgi.doe.gov/
![Page 89: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/89.jpg)
Nucleotide Databases
Microbes from the environment at DoE
A live display of the microbe’s gene content in the range 1 to 500,000.
http://img.jgi.doe.gov/
![Page 90: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/90.jpg)
Nucleotide Databases
Exploring the Human Genome
Human genome: about 3 billion nucleotides spread over 23 chromosomes (one haploid set).
If you want to make sense of human data, you must have a clear idea of the current state of the data:
• The complete nucleotide sequence of the human genome is now at hand.
• This sequence was obtained in raw format; the next challenge is the annotation of the raw data, creating a detailed and accurate FEATURES table of the human genome.
• Throughout the world, new information is generated daily on human gene properties and functions, using a wide array of techniques.
![Page 91: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/91.jpg)
Nucleotide Databases
Exploring the Human Genome: the Internet home page of Ensembl
Ensembl is a joint project between the European Bioinformatics Institute (EBI) and the Sanger Institute.
Together they have developed an integrated database and software system to produce and maintain automatic annotations for animal genomes, with special attention to our closest relatives: the vertebrates.
Ensembl home page: http://www.ensembl.org/
![Page 92: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/92.jpg)
Nucleotide Databases
Exploring the Human Genome: the Internet home page of Ensembl
http://www.ensembl.org/
![Page 93: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/93.jpg)
Nucleotide Databases
Exploring the Human Genome: the Internet home page of Ensembl
http://www.ensembl.org/
![Page 94: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/94.jpg)
Nucleotide Databases
Exploring the Human Genome: the Internet home page of Ensembl
A schematic image of the various human chromosomes
http://www.ensembl.org/
![Page 95: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/95.jpg)
Nucleotide Databases
Exploring the Human Genome: the Internet home page of Ensembl
The Chromosome 15 data subset
http://www.ensembl.org/
![Page 96: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/96.jpg)
Nucleotide Databases
Exploring the Human Genome: the Internet home page of Ensembl
http://www.ensembl.org/
![Page 97: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/97.jpg)
Nucleotide Databases
Exploring the Human Genome: the Internet home page of Ensembl
GenBank entry U90223 for the human dUTPase gene
http://www.ensembl.org/
![Page 98: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/98.jpg)
Nucleotide Databases
Exploring the Human Genome: the Internet home page of Ensembl
http://www.ensembl.org/
![Page 99: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/99.jpg)
Nucleotide Databases
Exploring the Human Genome: the Internet home page of Ensembl
Human DUT ID card: everything you ever wanted to know about this gene can be found here.
http://www.ensembl.org/
![Page 100: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/100.jpg)
100
Establishing the G+C content of your sequence
The GC content of a DNA molecule is the percentage of its nitrogenous bases that are either guanine or cytosine. GC content is an interesting property of DNA sequences because it is correlated with features such as repeats and gene deserts. DNA with high GC content is more stable than DNA with low GC content. In PCR experiments, the GC content of a primer is used to predict its annealing temperature to the template DNA: a higher GC content indicates a higher melting temperature.
ORIGIN
cagagaaaat caaaaagcag gccacgcagg
accccgatat cgtcgcaggc gttgccgcac
ttgccgccga aacaaataat gtggaagaat
acgcccggca aaaacgtatc cgtaaaaacc
ttgatctgat ctgcgcgaac gatgtttccc
//
Tools for Nucleotide Sequences
http://www.genomatix.de
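The definition above can be tried out directly. The following is a minimal sketch, not the Genomatix tool: the function names `gc_content` and `wallace_tm` are hypothetical, and the melting-temperature estimate uses the common rule of thumb Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is only a rough guide for short primers.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the A/C/G/T bases of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # ignores blanks, digits, N
    if not bases:
        raise ValueError("no A/C/G/T bases found")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def wallace_tm(primer: str) -> int:
    """Rule-of-thumb melting temperature (deg C) for a short primer."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


# The ORIGIN fragment from the slide; blanks are skipped by gc_content.
fragment = (
    "cagagaaaat caaaaagcag gccacgcagg accccgatat cgtcgcaggc gttgccgcac "
    "ttgccgccga aacaaataat gtggaagaat acgcccggca aaaacgtatc cgtaaaaacc "
    "ttgatctgat ctgcgcgaac gatgtttccc"
)
print(f"GC content of the fragment: {gc_content(fragment):.1f}%")
print("Estimated Tm of primer ATGCGC:", wallace_tm("ATGCGC"), "deg C")
```

Because G and C are weighted twice as heavily as A and T in the rule of thumb, a primer with more G and C gets a higher estimated melting temperature, which matches the statement on the slide.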
![Page 101: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/101.jpg)
101
Establishing the G+C content of your sequence : Genomatix
Tools for Nucleotide Sequences
http://www.genomatix.de
![Page 102: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/102.jpg)
102
Establishing the G+C content of your sequence : Genomatix
Tools for Nucleotide Sequences
http://www.genomatix.de
![Page 103: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/103.jpg)
103
Establishing the G+C content of your sequence : Genomatix
http://1.51.212.243/X01714.fasta
Tools for Nucleotide Sequences
http://www.genomatix.de
![Page 104: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/104.jpg)
104
Establishing the G+C content of your sequence : Genomatix
Tools for Nucleotide Sequences
http://www.genomatix.de
![Page 105: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/105.jpg)
105
Establishing the G+C content of your sequence : Genomatix
Tools for Nucleotide Sequences
http://www.genomatix.de
![Page 106: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/106.jpg)
106
4 different nucleotides
16 different dinucleotides
64 different trinucleotides (3-tuples)
256 different 4-tuples
1024 different 5-tuples
4096 different 6-tuples (hexamers)
Identifying hexamers (6-tuples) with unexpectedly high frequencies in a set of sequences (such as promoter regions) is often the starting point for discovering regulatory sequence motifs.
The EMBOSS server (EBI) offers an online version of the program wordcount, which computes the frequency of words of any size in your DNA sequence. http://emboss.bioinformatics.nl
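The word counts above follow from 4^k possible words of length k. As a rough illustration of what a word-frequency count does (a sketch in the spirit of wordcount, not a reimplementation of the EMBOSS program; the function name `count_words` is hypothetical), overlapping words can be tallied like this:

```python
from collections import Counter


def count_words(seq: str, k: int) -> Counter:
    """Count every overlapping word of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Number of distinct words of each size over the alphabet A, C, G, T.
for k in range(1, 7):
    print(f"{k}-tuples: {4 ** k}")

# Most frequent dinucleotides in a short example sequence.
example = "gcattacttgatctaatcaataggatctaatctttactagaacgccttgatctaatca"
for word, n in count_words(example, 2).most_common(3):
    print(word, n)
```

For motif discovery the interesting step comes afterwards: comparing the observed counts with what would be expected by chance, which this sketch does not attempt.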
Counting long words in DNA sequences : Wordcount
Tools for Nucleotide Sequences
http://emboss.bioinformatics.nl
![Page 107: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/107.jpg)
107
Counting long words in DNA sequences : Wordcount
Tools for Nucleotide Sequences
http://emboss.bioinformatics.nl
![Page 108: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/108.jpg)
108
6
Tools for Nucleotide Sequences
http://1.51.212.243/X01714.fasta
http://emboss.bioinformatics.nl
![Page 109: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/109.jpg)
109
Tools for Nucleotide Sequences
http://emboss.bioinformatics.nl
![Page 110: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/110.jpg)
110
Protein-coding genes have vastly different structures in microbes and multicellular organisms.
In microbes, each protein is encoded by a single uninterrupted DNA segment, from start to end, called an open reading frame (ORF).
In animal and plant genes, proteins are encoded in several pieces called exons, separated by non-coding DNA segments called introns.
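To make the idea of an ORF concrete, here is a minimal sketch of a naive ORF scan. It is an illustration only and not the algorithm of any particular tool; the names `find_orfs` and `min_codons` are hypothetical, and it assumes the standard stop codons TAA, TAG and TGA with ATG as the only start codon.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30):
    """Scan all six reading frames for ATG...stop ORFs.

    Returns (strand, frame, start, end) tuples. Coordinates are 0-based
    on the scanned strand; end is exclusive and includes the stop codon.
    min_codons counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ATG in this frame
    return orfs


print(find_orfs("CCATGAAATTTTAGGG", min_codons=3))
```

A scan like this works for microbial genomes and for cDNA, where the coding region is uninterrupted. It would miss most genes in animal or plant genomic DNA because introns break the reading frame.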
Finding Protein-Coding Regions
prokaryotes (archaea) eukaryotes
Tools for Nucleotide Sequences
![Page 111: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/111.jpg)
111
Finding Protein-Coding Regions : ORF Finder at NCBI
http://www.ncbi.nlm.nih.gov
Tools for Nucleotide Sequences
![Page 112: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/112.jpg)
112
Finding Protein-Coding Regions : ORF Finder at NCBI
Tools for Nucleotide Sequences
http://www.ncbi.nlm.nih.gov
![Page 113: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/113.jpg)
113
Finding Protein-Coding Regions : ORF Finder at NCBI
Tools for Nucleotide Sequences
http://www.ncbi.nlm.nih.gov
![Page 114: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/114.jpg)
114
Finding Protein-Coding Regions : ORF Finder at NCBI
AE008569
1 5000
1
2
3
AE008569 : Rickettsia conorii genome (bacterium)
Tools for Nucleotide Sequences
http://www.ncbi.nlm.nih.gov
![Page 115: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/115.jpg)
115
Finding Protein-Coding Regions : ORF Finder at NCBI
Tools for Nucleotide Sequences
http://www.ncbi.nlm.nih.gov
![Page 116: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/116.jpg)
116
Finding Protein-Coding Regions : ORF Finder at NCBI
1
2
Tools for Nucleotide Sequences
http://www.ncbi.nlm.nih.gov
![Page 117: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/117.jpg)
117
Finding Protein-Coding Regions : ORF Finder at NCBI
Tools for Nucleotide Sequences
or
http://www.ncbi.nlm.nih.gov
![Page 118: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/118.jpg)
118
Finding Protein-Coding Regions : ORF Finder at NCBI
This program is also good for finding protein-coding regions in higher organisms, if your sequence is a cDNA.
cDNAs don’t include introns, and they have a simple, microbe-like ORF structure.
Tools for Nucleotide Sequences
http://www.ncbi.nlm.nih.gov
![Page 119: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/119.jpg)
119
Finding Protein-Coding Regions : GeneMark
http://exon.gatech.edu
The simplest ORF-finding programs can probably identify 85% of the protein-coding regions you may be interested in correctly. However, in some cases, you may need to:
• Find very short proteins
• Resolve uncertain cases where overlapping ORFs are predicted in different reading frames, for instance on the direct and reverse strands
• Pinpoint the exact start codon (the most distal ATG isn’t always the correct one)
GeneMark searches for coding regions using a criterion that’s a bit more sophisticated than “it has to be an uninterrupted reading frame longer than a certain length.” The program also takes into account the statistical properties of your sequence and assigns a probabilistic quality index to each candidate ORF.
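To give a feel for what "statistical properties" can mean here, the sketch below scores a candidate ORF by how closely its codon usage resembles a set of known coding sequences. This is a toy illustration and an assumption on our part: it is not GeneMark's actual model, which is more elaborate, and the names `train_codon_model` and `coding_score` are hypothetical.

```python
import math
from collections import Counter
from itertools import product

CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def train_codon_model(coding_seqs):
    """Estimate codon probabilities from known in-frame coding sequences."""
    counts = Counter({c: 1 for c in CODONS})  # add-one smoothing
    for seq in coding_seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in counts:  # skip codons with ambiguous bases
                counts[codon] += 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def coding_score(orf, model):
    """Average log2 ratio of codon probability to a uniform 1/64 background."""
    orf = orf.upper()
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    codons = [c for c in codons if c in model]
    if not codons:
        return 0.0
    return sum(math.log2(model[c] * 64) for c in codons) / len(codons)


# Tiny made-up training set, for illustration only.
model = train_codon_model(["GCTGCTGCTGCT"])
print(coding_score("GCTGCT", model))  # positive: resembles the training set
print(coding_score("TTATTA", model))  # negative: unlike the training set
```

A positive score means the candidate uses codons that are common in the training sequences. With a realistic training set, such a score can help rank overlapping candidates or short ORFs that a pure length threshold cannot separate.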
Tools for Nucleotide Sequences
![Page 120: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/120.jpg)
120
Finding Protein-Coding Regions : GeneMark
Tools for Nucleotide Sequences
http://exon.gatech.edu
![Page 121: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/121.jpg)
121
Finding Protein-Coding Regions : GeneMark
http://1.51.212.243/AE008569.seq
Tools for Nucleotide Sequences
http://exon.gatech.edu
![Page 122: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/122.jpg)
122
Tools for Nucleotide Sequences
http://exon.gatech.edu
![Page 123: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/123.jpg)
123
Finding Protein-Coding Regions : MZEF
If you’re looking at a human genomic sequence, your first question should be: Do I have a protein-coding exon somewhere in there?
In eukaryotic DNA sequences, exons are separated by non-coding introns. According to what molecular biologists have worked out, a protein-coding exon is an ORF flanked by two specific signals known as splice sites. Several programs exist that can recognize these exons.
MZEF was developed by Dr. Michael Zhang at Cold Spring Harbor Laboratory on beautiful Long Island (USA).
http://rulai.cshl.edu
Tools for Nucleotide Sequences
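The "ORF flanked by splice sites" definition can be sketched as a naive search. This is not how MZEF works; it is a toy under stated assumptions: introns are taken to end with the dinucleotide AG (acceptor site) and to begin with GT (donor site), and a candidate internal exon is any stretch between an AG and a later GT that is free of stop codons in at least one frame. The function names are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}


def open_frames(segment: str):
    """Return the frames (0, 1, 2) of segment that contain no stop codon."""
    frames = []
    for f in range(3):
        codons = (segment[i:i + 3] for i in range(f, len(segment) - 2, 3))
        if not any(c in STOPS for c in codons):
            frames.append(f)
    return frames


def candidate_internal_exons(seq: str, min_len: int = 30):
    """List (start, end, open_frames) for AG|exon|GT candidates.

    start is the first base after an AG, end is the position of the G of a
    following GT (exclusive end of the exon), both 0-based.
    """
    seq = seq.upper()
    acceptors = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    hits = []
    for start in acceptors:
        for end in donors:
            if end - start >= min_len:
                frames = open_frames(seq[start:end])
                if frames:
                    hits.append((start, end, frames))
    return hits


print(candidate_internal_exons("CCAGATGGCTGCTGCTGTAA", min_len=12))
```

On real genomic DNA this rule produces a very large number of false candidates, because AG and GT occur by chance everywhere. That is why dedicated exon finders add statistical models of the splice-site context and of coding composition.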
![Page 124: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/124.jpg)
124
Finding Protein-Coding Regions : MZEF
Tools for Nucleotide Sequences
http://rulai.cshl.edu
![Page 125: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/125.jpg)
125
Finding Protein-Coding Regions : MZEF
Tools for Nucleotide Sequences
http://rulai.cshl.edu
![Page 126: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/126.jpg)
126
Finding Protein-Coding Regions : MZEF
Tools for Nucleotide Sequences
http://rulai.cshl.edu
![Page 127: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/127.jpg)
127
Finding Protein-Coding Regions : MZEF
http://1.51.212.243/AF018429.fasta
Tools for Nucleotide Sequences
http://rulai.cshl.edu
![Page 128: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/128.jpg)
128
Finding Protein-Coding Regions : MZEF
Tools for Nucleotide Sequences
http://rulai.cshl.edu
![Page 129: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/129.jpg)
129
Beijing Gene Finder (BGF) - http://tlife.fudan.edu.cn/bgf (eukaryotes)
GeneFinder - http://cgap.nci.nih.gov/Genes/GeneFinder
GENEID - http://genome.crg.es/software/geneid
Genlang - http://arete.ibb.waw.pl/PL/html/gene_lang.html
GENSCAN - http://genes.mit.edu/GENSCAN.html (eukaryotes)
Glimmer - http://www.cbcb.umd.edu/software/glimmer (prokaryotes, archaea)
GlimmerM - http://www.cbcb.umd.edu/software/glimmerm (eukaryotes)
GrailEXP - http://compbio.ornl.gov/grailexp
……
Finding Protein-Coding Regions
Tools for Nucleotide Sequences
![Page 130: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/130.jpg)
130
Tools for Nucleotide Sequences
Information Page
![Page 131: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/131.jpg)
131
Classification of Biological Databases
Protein Database
Primary Protein Database
Protein Sequence DB
Protein Structure DB
INSDC
UniProt
Secondary Protein Database
Nucleotide Database
Primary Nucleotide Database
Secondary Nucleotide Database
Specific Database
>2000
![Page 132: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/132.jpg)
132
Protein Databases
Primary Protein Sequence Databases
UniProt Knowledgebase (UniProtKB) is a central protein database of ExPASy, maintained by the Swiss Institute of Bioinformatics (SIB) and the European Bioinformatics Institute (EBI). It consists of two sections:
UniProtKB/Swiss-Prot - a reviewed, manually annotated, non-redundant protein sequence database. It combines information extracted from the scientific literature with biocurator-evaluated computational analysis.
UniProtKB/TrEMBL - contains high-quality computationally analyzed records, which are enriched with automatic annotation. It was introduced in response to the increased data flow resulting from genome projects, as the time- and labour-consuming manual annotation process of UniProtKB/Swiss-Prot could not be broadened to include all available protein sequences.
![Page 133: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/133.jpg)
133
Protein Databases
Primary Protein Sequence Databases
The Protein Information Resource (PIR) is an integrated public bioinformatics resource to support genomic and proteomic research and scientific studies.
In 2002, PIR, along with its international partners EBI and SIB, was awarded a grant from NIH to create UniProt, a single worldwide database of protein sequence and function, by unifying the PIR-PSD, Swiss-Prot, and TrEMBL databases.
![Page 134: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/134.jpg)
134
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
Proteins are much simpler objects than genes.
• Proteins correspond to relatively small sequences (350 amino acids long, on average).
• Unlike genes, proteins have clear beginnings and clear ends.
• Proteins are defined on a single strand.
• Whatever modifications occur between the ORF sequence and the mature protein, the amino acids they contain remain in the same order.
We use the human epidermal growth factor receptor (EGFR) as an example.
UniProtKB/Swiss-Prot home page (ExPASy) : http://expasy.org/
![Page 135: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/135.jpg)
135
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
http://expasy.org/
1
2 3
P00533
human epidermal growth factor receptor (EGFR)
![Page 136: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/136.jpg)
136
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
P00533
http://expasy.org/
![Page 137: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/137.jpg)
137
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
P00533
General Information: Entry Name, Accession Number, Secondary Accession Number, Last Modification Date.
http://expasy.org/
![Page 138: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/138.jpg)
138
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
P00533
The E.C. number (2.7.10.1) encodes the biochemical reaction that this protein performs. E.C. stands for Enzyme Commission, the body that defined this enzyme nomenclature. It can give you a precise understanding of this protein’s enzymatic function.
http://expasy.org/
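An E.C. number is hierarchical: the four fields go from the broad reaction class down to a serial number for the specific enzyme. The sketch below splits such a number into its levels. The function name `parse_ec` is hypothetical, and the table covers only the seven top-level class names.

```python
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def parse_ec(ec: str) -> dict:
    """Split an E.C. number such as '2.7.10.1' into its four levels."""
    parts = ec.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec!r}")
    top = int(parts[0])
    return {
        "class": top,
        "class_name": EC_CLASSES[top],
        "subclass": parts[1],      # kept as text, may be '-' if unassigned
        "sub_subclass": parts[2],
        "serial": parts[3],
    }


print(parse_ec("2.7.10.1"))
```

For the EGFR example, the leading 2 places the protein among the transferases, which is consistent with its role as a kinase that transfers phosphate groups.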
![Page 139: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/139.jpg)
139
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
P00533
http://expasy.org/
![Page 140: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/140.jpg)
140
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
P00533
http://expasy.org/
![Page 141: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/141.jpg)
141
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
P00533
This section provides a simple list of terms relevant to your current protein. Clicking any one of these keywords brings up a list of all Swiss-Prot entries that contain the same term. With the increasing size of the database, this type of query seems to be less useful than it once was.
http://expasy.org/
![Page 142: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/142.jpg)
142
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
http://expasy.org/
P00533
Other proteins it interacts with
![Page 143: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/143.jpg)
143
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
http://expasy.org/
P00533
Description of the existence of related protein sequence(s) produced by alternative splicing of the same gene, alternative promoter usage, ribosomal frame-shifting or by the use of alternative initiation codons.
![Page 144: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/144.jpg)
144
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
http://expasy.org/
P00533
![Page 145: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/145.jpg)
145
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
http://expasy.org/
P00533
or
![Page 146: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/146.jpg)
146
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
http://expasy.org/
P00533
or
![Page 147: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/147.jpg)
147
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
http://expasy.org/
P00533
![Page 148: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/148.jpg)
148
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
http://expasy.org/
P00533
Only entries whose 3D structure was experimentally determined and submitted to the PDB have this secondary structure annotation. (No computational results here!)
![Page 149: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/149.jpg)
149
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
http://expasy.org/
P00533
![Page 150: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/150.jpg)
150
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
http://expasy.org/
P00533
The Cross-References section contains links to entries in other databases that contain some information about this protein.
![Page 151: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/151.jpg)
151
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
http://expasy.org/
P00533
Information Page
![Page 152: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/152.jpg)
152
Protein Databases
Reading a UniProtKB/Swiss-Prot Entry
http://expasy.org/
P00533
Each line begins with a two-character line code, which indicates the type of data contained in the line.
![Page 153: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/153.jpg)
| Line code | Content | Occurrence in an entry |
|---|---|---|
| ID | Identification | Once; starts the entry |
| AC | Accession number(s) | Once or more |
| DT | Date | Three times |
| DE | Description | Once or more |
| GN | Gene name(s) | Optional |
| OS | Organism species | Once or more |
| OG | Organelle | Optional |
| OC | Organism classification | Once or more |
| OX | Taxonomy cross-reference | Once |
| OH | Organism host | Optional |
| RN | Reference number | Once or more |
| RP | Reference position | Once or more |
| RC | Reference comment(s) | Optional |
| RX | Reference cross-reference(s) | Optional |
| RG | Reference group | Once or more (Optional if RA line) |
| RA | Reference authors | Once or more (Optional if RG line) |
| RT | Reference title | Optional |
| RL | Reference location | Once or more |
| CC | Comments or notes | Optional |
| DR | Database cross-references | Optional |
| PE | Protein existence | Once |
| KW | Keywords | Optional |
| FT | Feature table data | Once or more |
| SQ | Sequence header | Once |
| (blanks) | Sequence data | Once or more |
| // | Termination line | Once; ends the entry |
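As a minimal sketch of how the two-character line codes can be used, the Python snippet below groups the lines of one flat-file entry by their code. The entry text is a made-up, abbreviated example (TEST_HUMAN and X00000 are hypothetical identifiers, not real records), and `group_by_line_code` is an illustrative helper, not part of any UniProt tool.

```python
from collections import defaultdict


def group_by_line_code(entry_text):
    """Group the lines of one flat-file entry by their two-character line code."""
    fields = defaultdict(list)
    for line in entry_text.splitlines():
        if not line.strip():
            continue  # ignore empty lines
        if line.startswith("//"):
            break  # termination line: the entry ends here
        code = line[:2]
        if code.strip() == "":
            code = "blanks"  # sequence data lines start with spaces
        fields[code].append(line[5:].strip())
    return dict(fields)


# Hypothetical, abbreviated entry used only for illustration.
SAMPLE = """\
ID   TEST_HUMAN              Reviewed;          12 AA.
AC   X00000;
DE   RecName: Full=Hypothetical test protein;
OS   Homo sapiens (Human).
SQ   SEQUENCE   12 AA;
     MKTAYIAKQR QI
//
"""

fields = group_by_line_code(SAMPLE)
sequence = "".join(fields["blanks"]).replace(" ", "")
print(fields["AC"], sequence)
```

Reading the data portion from the sixth character onward assumes the usual layout in which the line code is followed by three spaces.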
![Page 154: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/154.jpg)
Protein Databases
Primary Protein Structure Databases
The Protein Data Bank (PDB) is a repository for 3D structural data of large biological molecules, mainly proteins but also nucleic acids. The data are typically obtained by X-ray crystallography or NMR spectroscopy and are submitted by biologists and biochemists from around the world. The structures in the PDB are freely accessible on the Internet via the websites of its member organizations (PDBe, PDBj, and RCSB). The PDB is overseen by an organization called the Worldwide Protein Data Bank (wwPDB).
Secondary Protein Structure Databases
The Structural Classification of Proteins (SCOP) and CATH Protein Structure Classification (CATH) databases categorize the structures stored in the PDB according to their structural type and their assumed evolutionary relationships.
![Page 155: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/155.jpg)
Protein Databases
Reading a PDB Entry: E. coli dUTPase protein
http://www.rcsb.org
3H6X: E. coli dUTPase protein
![Page 156: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/156.jpg)
![Page 157: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/157.jpg)
Notepad
![Page 158: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/158.jpg)
![Page 159: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/159.jpg)
Title Section
This section contains records used to describe the experiment and the biological macromolecules present in the entry. Keywords in this section include:
HEADER, OBSLTE, TITLE, SPLIT, CAVEAT, COMPND, SOURCE, KEYWDS, EXPDTA, AUTHOR, REVDAT, SPRSDE, JRNL, REMARK.
![Page 160: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/160.jpg)
HEADER identifies a PDB entry through the idCode field. This record also provides a classification for the entry. Finally, it contains the date when the coordinates were deposited to the PDB archive.
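The three HEADER fields can be read by column position. The sketch below assumes the fixed-column layout of the PDB file format (classification in columns 11-50, deposition date in columns 51-59, idCode in columns 63-66). The sample record is hypothetical (1ABC and its date are made up), and `parse_header` is an illustrative helper.

```python
def parse_header(line):
    """Split a PDB HEADER record into its three fixed-column fields."""
    if not line.startswith("HEADER"):
        raise ValueError("not a HEADER record")
    return {
        "classification": line[10:50].strip(),   # columns 11-50
        "deposition_date": line[50:59].strip(),  # columns 51-59
        "id_code": line[62:66].strip(),          # columns 63-66
    }


# Hypothetical record, assembled column by column so the widths are exact.
SAMPLE_HEADER = "HEADER    {:<40}{:<9}   {:<4}".format(
    "HYDROLASE", "01-JAN-00", "1ABC"
)

header = parse_header(SAMPLE_HEADER)
print(header)
```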
![Page 161: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/161.jpg)
TITLE contains a title for the entry. It is also the title of the cited publication.
![Page 162: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/162.jpg)
COMPND describes the macromolecular contents of an entry.
![Page 163: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/163.jpg)
SOURCE specifies the biological and chemical source of each biological molecule in the entry.
![Page 164: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/164.jpg)
KEYWDS contains a set of terms relevant to the entry, which can be used for keyword search across databases.
![Page 165: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/165.jpg)
EXPDTA identifies the experimental technique used:
X-RAY DIFFRACTION, FIBER DIFFRACTION, NEUTRON DIFFRACTION, ELECTRON CRYSTALLOGRAPHY, ELECTRON MICROSCOPY, SOLID-STATE NMR, SOLUTION NMR, SOLUTION SCATTERING
![Page 166: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/166.jpg)
AUTHOR contains the names of the people responsible for the contents of the entry.
![Page 167: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/167.jpg)
REVDAT contains a history of the modifications made to an entry since its release.
![Page 168: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/168.jpg)
JRNL contains the primary literature citation that describes the experiment which resulted in the deposited coordinate set.
![Page 169: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/169.jpg)
REMARK presents experimental details, annotations, comments, and information not included in other records.
![Page 170: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/170.jpg)
Primary Structure Section
This section contains the sequence of residues in each chain of the macromolecule(s).
Keywords in this section include: DBREF, SEQADV, SEQRES, MODRES.
![Page 171: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/171.jpg)
Heterogen Section
This section contains the complete description of non-standard residues in the entry.
![Page 172: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/172.jpg)
Secondary Structure Section
This section describes helices, sheets, and turns found in protein and polypeptide structures.
![Page 173: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/173.jpg)
Connectivity Annotation Section
This section specifies the existence and location of disulfide bonds and other linkages.
![Page 174: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/174.jpg)
Crystallographic and Coordinate Transformation Section
This section describes the geometry of the crystallographic experiment and the coordinate system transformations.
![Page 175: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/175.jpg)
Coordinate Section (the most important part!)
This section contains the collection of atomic coordinates.
[Figure: coordinate records with the atom index, residue index, and X, Y, Z columns labeled]
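To make the atom index, residue index, and X, Y, Z columns concrete, here is a minimal sketch that reads one ATOM record by column position. It assumes the fixed-column layout of the PDB file format (serial in columns 7-11, atom name 13-16, residue name 18-20, chain 22, residue number 23-26, X/Y/Z in columns 31-54). The sample record uses made-up coordinates, and `parse_atom_line` is an illustrative helper.

```python
def parse_atom_line(line):
    """Extract the main fields of a PDB ATOM/HETATM record by column position."""
    return {
        "serial": int(line[6:11]),          # atom index
        "atom_name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain_id": line[21],
        "res_seq": int(line[22:26]),        # residue index
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
    }


# Hypothetical record, assembled field by field so the column widths are exact.
SAMPLE_ATOM = (
    "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   "
    "{:>8.3f}{:>8.3f}{:>8.3f}{:>6.2f}{:>6.2f}          {:>2}"
).format("ATOM", 1, " N", "", "MET", "A", 1, "", 27.340, 24.430, 2.614, 1.00, 9.67, "N")

atom = parse_atom_line(SAMPLE_ATOM)
print(atom)
```

Slicing by column rather than splitting on whitespace matters because neighboring fields in a PDB file may touch each other with no space in between.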
![Page 176: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/176.jpg)
Connectivity Section
This section provides information on atomic connectivity.
![Page 177: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/177.jpg)
Bookkeeping Section
This section provides some final information about the file itself.
A *.pdb file always ends with an END record.
![Page 178: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/178.jpg)
3H6X.pdb
Molecular viewers: Maestro, VMD, PyMOL
Chapter 4 Structure
![Page 179: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/179.jpg)
Protein Databases
Finding out more about your protein: RESID
http://www.ebi.ac.uk/RESID
RESID is a database of post-translational modifications, maintained by John Garavelli at the European Bioinformatics Institute (EBI).
![Page 180: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/180.jpg)
Myristoylation
![Page 181: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/181.jpg)
![Page 182: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/182.jpg)
![Page 183: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/183.jpg)
Protein Databases
Finding out more about your protein: KEGG
KEGG, the Kyoto Encyclopedia of Genes and Genomes, was initiated by the Japanese human genome program in 1995. KEGG can be regarded as a "computer representation" of the biological system. Its collection includes genomes, enzymatic pathways, and biological chemicals. The PATHWAY database records networks of molecular interactions in cells, along with variants of them that are specific to particular organisms.
Home page of KEGG: http://www.genome.jp/kegg
![Page 184: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/184.jpg)
![Page 185: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/185.jpg)
![Page 186: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/186.jpg)
![Page 187: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/187.jpg)
![Page 188: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/188.jpg)
Compound
Enzyme
Pathway
![Page 189: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/189.jpg)
Compound
Relevant pathways
Enzyme
![Page 190: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/190.jpg)
![Page 191: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/191.jpg)
![Page 192: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/192.jpg)
![Page 193: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/193.jpg)
![Page 194: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/194.jpg)
Toll-like receptors (TLRs) recognize pathogen-associated molecular patterns (PAMPs), which are present on invading organisms but not on the host. They are the first line of defense in innate immunity.
![Page 195: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/195.jpg)
Removing vector sequences
![Page 196: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/196.jpg)
![Page 197: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/197.jpg)
![Page 198: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/198.jpg)
[Figure: TLR4, MD2, and the lipid of LPS (Park et al. 2009)]
![Page 199: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/199.jpg)
![Page 200: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/200.jpg)
Autoimmunity caused by over-activity of TLRs: Systemic Lupus Erythematosus (SLE)
Agonist Antagonist
![Page 201: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/201.jpg)
Protein Databases
Finding out more about your protein: ProtParam
http://expasy.org
ProtParam, a program available online on the ExPASy server, is a convenient way to estimate simple physico-chemical properties of a protein, including the molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient, estimated half-life, instability index, aliphatic index, and grand average of hydropathicity.
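Three of the listed properties are simple enough to sketch by hand. The Python snippet below is not ProtParam's code. It is a minimal illustration that assumes the Kyte-Doolittle hydropathy scale for the grand average of hydropathicity and the usual aliphatic index formula, X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)), with X in mole percent. The peptide AAVL and the function names are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Count how often each amino acid occurs in the sequence."""
    return Counter(seq.upper())


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index computed from the mole percent of Ala, Val, Ile and Leu."""
    counts = composition(seq)
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * counts[aa] / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


peptide = "AAVL"  # hypothetical toy peptide
print(composition(peptide), gravy(peptide), aliphatic_index(peptide))
```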
![Page 202: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/202.jpg)
![Page 203: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/203.jpg)
P05130
![Page 204: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/204.jpg)
P05130
![Page 205: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/205.jpg)
P05130
![Page 206: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/206.jpg)
Protein Databases
Finding out more about your protein: WebLogo
Sequence logos are a graphical representation of an amino acid or nucleic acid multiple sequence alignment, developed by Tom Schneider and Mike Stephens. Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of a stack indicates the sequence conservation at that position, while the height of each symbol within the stack indicates the relative frequency of that amino acid or nucleotide at that position. In general, a sequence logo provides a richer and more precise description of, for example, a binding site than a consensus sequence does.
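The stack and symbol heights can be sketched in a few lines. The snippet below assumes the standard information-content definition, in which the stack height in bits is log2(alphabet size) minus the Shannon entropy of the column and each symbol is scaled by its frequency. It leaves out the small-sample correction, and the four aligned sequences and the name `logo_heights` are hypothetical.

```python
import math
from collections import Counter


def logo_heights(aligned_seqs, alphabet_size=4):
    """Return, for each alignment column, a dict of symbol -> height in bits."""
    columns = []
    for i in range(len(aligned_seqs[0])):
        counts = Counter(seq[i] for seq in aligned_seqs)
        total = sum(counts.values())
        freqs = {sym: c / total for sym, c in counts.items()}
        entropy = -sum(f * math.log2(f) for f in freqs.values())
        info = math.log2(alphabet_size) - entropy  # overall stack height
        columns.append({sym: f * info for sym, f in freqs.items()})
    return columns


# Hypothetical toy alignment of four DNA sequences.
alignment = ["TATA", "TATA", "TACA", "TAGA"]
heights = logo_heights(alignment)
for position, stack in enumerate(heights, start=1):
    print(position, stack)
```

A fully conserved DNA column reaches the maximum of 2 bits, while a column with all four bases at equal frequency has height 0. For protein alignments `alphabet_size` would be 20.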
![Page 207: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/207.jpg)
WebLogo is a web-based application designed to make the generation of sequence logos easy and painless. WebLogo has been featured in over 150 scientific publications.
http://weblogo.berkeley.edu
![Page 208: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/208.jpg)
Protein Databases
Finding out more about your protein: WebLogo
promoter.seqs: http://1.51.212.243/promoter.seqs
http://weblogo.berkeley.edu
![Page 209: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/209.jpg)
![Page 210: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/210.jpg)
![Page 211: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/211.jpg)
Protein Databases
Finding out more about your protein: WebLogo
In the promoter region of many genes, we find a characteristic fragment called the TATA box (also called the Goldberg-Hogness box). The TATA box has the core DNA sequence 5'-TATAAA-3' or a variant. It is bound by the TATA-binding protein (TBP), which helps recruit RNA polymerase II to the promoter.
http://correlogo.abcc.ncifcrf.gov
http://weblogo.berkeley.edu
promoter.seqs
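As a small illustration of the TATA box core described on this slide, the sketch below scans a DNA string for exact occurrences of 5'-TATAAA-3' on the given strand. It ignores variants of the core and the reverse strand; the function name `find_tata_boxes` and the example sequence are hypothetical and are not taken from promoter.seqs.

```python
import re

TATA_CORE = "TATAAA"  # core sequence 5'-TATAAA-3'


def find_tata_boxes(sequence):
    """Return the 0-based start positions of the TATAAA core in `sequence`."""
    seq = sequence.upper()
    # A lookahead is used so that overlapping occurrences are also reported.
    return [m.start() for m in re.finditer(f"(?={TATA_CORE})", seq)]


# Hypothetical promoter fragment in mixed case.
promoter = "gcgcgTATAAAaggcgtataaagc"
print(find_tata_boxes(promoter))  # [5, 16]
```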
![Page 212: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/212.jpg)
Protein Databases
Finding out more about your protein: MEME
Sequence Motif - a nucleotide or amino-acid sequence pattern that is widespread and has, or is conjectured to have, a biological significance.
An example is the N-glycosylation site motif:
Asn, followed by anything but Pro, followed by either Ser or Thr, followed by anything but Pro
This pattern can be written as N{P}[ST]{P} (regular expression), where N = Asn, P = Pro, S = Ser, T = Thr; {X} means any amino acid except X; and [XY] means either X or Y. The notation [XY] does not give any indication of the probability of X or Y occurring in the pattern. Observed probabilities can be graphically represented using sequence logos.
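The N-glycosylation pattern above maps almost directly onto Python's `re` syntax: {X} becomes the negated class [^X], and [XY] is already a character class. The sketch below handles only this simplified notation (single letters, {X} and [XY]), not the full pattern syntax used by motif databases; the helper names and the example protein sequence are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate the simplified motif notation into a Python regex string.

    {X}  -> [^X]  (any residue except X)
    [XY] -> [XY]  (either X or Y), unchanged
    """
    return pattern.replace("{", "[^").replace("}", "]")


def find_motif(pattern, protein):
    """Return (0-based start, matched residues) for every occurrence."""
    # The lookahead with a capture group reports overlapping matches too.
    regex = re.compile(f"(?=({pattern_to_regex(pattern)}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(protein.upper())]


N_GLYCOSYLATION = "N{P}[ST]{P}"
protein = "MKNGSAANPSTQNATW"  # hypothetical sequence

print(pattern_to_regex(N_GLYCOSYLATION))     # N[^P][ST][^P]
print(find_motif(N_GLYCOSYLATION, protein))  # [(2, 'NGSA'), (12, 'NATW')]
```

The Asn at position 7 is not reported because it is followed by Pro, which the {P} element excludes.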
![Page 213: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/213.jpg)
Protein Databases
Finding out more about your protein: MEME
The MEME Suite - Motif-based sequence analysis tools.
The MEME Suite allows you to:
• discover motifs in groups of related DNA or protein sequences,
• search sequence databases using motifs,
• compare a motif to all motifs in a database of motifs.
Home page: http://meme.sdsc.edu/meme/intro.html
![Page 214: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/214.jpg)
![Page 215: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/215.jpg)
Protein Databases
Finding out more about your protein: MEME
meme.seqs: http://1.51.212.243/meme.seqs
http://meme.sdsc.edu/meme/intro.html
![Page 216: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/216.jpg)
![Page 217: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/217.jpg)
![Page 218: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/218.jpg)
![Page 219: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/219.jpg)
![Page 220: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/220.jpg)
Protein Databases
Finding out more about your protein
The Nuclear Protein Database (NPD) - a searchable database of information on proteins that are localized to the nucleus of vertebrate cells. http://npd.hgu.mrc.ac.uk/user
SignalP 3.0 server - predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms. http://www.cbs.dtu.dk/services/SignalP
Phospho.ELM - a database of S/T/Y phosphorylation sites. http://phospho.elm.eu.org
SYSTERS - a protein family database built by large-scale protein clustering based on sequence similarity.
For more tools and databases, please see the Information Page.
![Page 221: Introduction to Bioinformatics - Shandong University · 7 Introduction to Bioinformatics English Courses for Graduate Students Biological Databases - A biological database is a collection](https://reader036.vdocuments.net/reader036/viewer/2022062505/5ede04b1ad6a402d666947d9/html5/thumbnails/221.jpg)