![Page 1: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/1.jpg)
Application of Bioinformatics in Genetic Research
Instructors:
Dr. Henry Baker
Dr. Luciano Brocchieri
Dr. Michele Tennant
Dr. Lei Zhou
http://159.178.28.30/GMS6014/home.htm
![Page 2: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/2.jpg)
Application of Bioinformatics in Genetic Research
Time and location:
Monday: 12:00-12:50 in CGRC-291.
Wednesday: 12:00-12:50 or 11:40-12:30 in CGRC-291.
Fridays (11/18, 12/2): 12:00-12:50 in CGRC-391 or 11:40-12:30 in CGRC-291.
![Page 3: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/3.jpg)
Evaluation
• 50% classroom participation
• 50% homework
![Page 4: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/4.jpg)
History of bioinformatics – sequence analysis
• Sequence comparison
• Similarity search
• Phylogenetic analysis
• Structure prediction
• Gene prediction
![Page 5: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/5.jpg)
Bioinformatics in the post-genome era
• Information Representation
- Many new types of data, such as Function, Location, Interaction, Regulatory pathway, Expression profile, etc., need to be recorded.
• Data Management
- Infrastructure for inputting, managing, accessing, and retrieving relevant information in a “sea of databases”. Cloud computing.
• Systematics
The opportunity provided by genome sequence and genomic / proteomic technology is matched by the challenge to bioinformatics / computational biology.
![Page 6: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/6.jpg)
Bioinformatics in the post-genome era
• SNP and genome-wide association studies.
• Genomic expression profiling (RNA and protein levels).
• Comparative genomics, epigenomics …
• Individual genomes, epigenomes, transcriptomes.
• Regulatory pathway simulation – systems biology.
$1,000 genome and … $500,000 analysis?
![Page 7: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/7.jpg)
Objectives of GMS6014
• Basic skills for retrieving and storing data, using web-based applications.
• Ability to install and run standalone local applications.
• Understanding the basis of bioinformatics applications using sequence similarity search as the example.
• A brief survey of available bioinformatics tools and introduction to functional genomics and systems biology.
![Page 8: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/8.jpg)
Sequence Representation - nucleotide
N G R C W T G Y C Y
A G A C A T G C C C
C   G   T     T   T
G
T
For the complete list, see Table 2.1, Mount 2nd Ed.,
or http://www.ncbi.nlm.nih.gov/blast/fasta.shtml
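The expansion shown on this slide can be sketched in Python. This is a minimal illustration, not course material: the function name `expand` is hypothetical, and the table holds the standard IUPAC nucleotide ambiguity codes.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def expand(pattern):
    """Return every concrete sequence matched by an ambiguous pattern."""
    choices = [IUPAC[base] for base in pattern.upper()]
    return ["".join(combo) for combo in product(*choices)]

variants = expand("NGRCWTGYCY")
print(len(variants))   # 4 * 2 * 2 * 2 * 2 = 64 concrete sequences
print(variants[0])     # AGACATGCCC
```

The number of concrete sequences is the product of the choices at each position, so long patterns with many ambiguity codes grow quickly.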
![Page 9: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/9.jpg)
Sequence Representation - amino acids
Q:
What is the common property of the amino acids in each group?
1. D, E
2. I, L, V, M, F
3. A, S, P
![Page 10: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/10.jpg)
Sequence Representation - amino acids
Example:
Coloring based on aa property.
W D L L A Q I L C Y A L R I Y
W R F L A T V V L E T L R Q Y
W K F L A I T M C K V L K Q F
R C L L C N K L Y Y L L R K V
L N R L L A E L Y E V L C H I
L R L L Q Q Q Q M V L Q R Q Y
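Since the colors are lost in plain text, property-based coloring can be approximated by mapping each residue to a class symbol. The sketch below assumes one possible grouping (several schemes are in use, and the slide's own scheme is not recorded here); the names `PROPERTY_SYMBOL` and `property_string` are hypothetical.

```python
# One possible property grouping (an assumption; other schemes exist).
GROUPS = {
    "-": "DE",      # acidic
    "+": "KRH",     # basic
    "h": "ILVMF",   # hydrophobic
    "a": "WY",      # aromatic
    "s": "ASPGT",   # small
    "p": "NQC",     # other polar
}
PROPERTY_SYMBOL = {aa: symbol for symbol, residues in GROUPS.items() for aa in residues}

def property_string(sequence):
    """Replace each residue with the symbol of its property class ('?' if unknown)."""
    return "".join(PROPERTY_SYMBOL.get(aa, "?") for aa in sequence.upper())

alignment = [
    "WDLLAQILCYALRIY",
    "WRFLATVVLETLRQY",
    "WKFLAITMCKVLKQF",
    "RCLLCNKLYYLLRKV",
    "LNRLLAELYEVLCHI",
    "LRLLQQQQMVLQRQY",
]
for row in alignment:
    print(row, property_string(row))
```

Columns that look variable at the residue level often share one symbol in the second string, which is the point of coloring by property.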
![Page 11: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/11.jpg)
Representation of sequence – sequence file format
1.) FASTA – simple and clean
> gene_name, (other info)
MASASASKJHKLJLKJLDSDFSF
SSDSASFSFD…
Practice / DIY: retrieve a sequence in FASTA format and save the file on the local computer.
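Reading a saved FASTA file back in takes only a few lines. The following is a minimal sketch, assuming the whole header line (after ">") is used as the record name; `parse_fasta` and the example records are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping header line -> sequence string."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()     # drop the ">" marker
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}

example = """>gene_one example record
MASASAS
KDSDF
>gene_two another record
MKKLV
"""
for name, seq in parse_fasta(example).items():
    print(name, len(seq), seq)
```

To use it on a saved file, read the file as text first, e.g. `parse_fasta(open("my_sequence.txt").read())`, where the file name is a placeholder.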
![Page 12: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/12.jpg)
How to store sequence files
• .txt format is clean and allows downstream sequence analysis.
• .doc or .rtf allows formatting during annotation. However, extra information is inserted, so these formats are NOT suitable for computational analysis.
![Page 13: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/13.jpg)
Practice – file types
• Open Windows Explorer (on your own computer), or IE with “C:\” in the address bar.
• Change the “Tools → Folder Options” setting so that the file extensions (.xxx) are revealed.
• Edit the downloaded sequence file in MS Word, highlight a section of the sequence with bold font or color, and save as .doc.
• Open the .doc file in NotePad and observe the inserted characters.
![Page 14: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/14.jpg)
Practice – file types (Cont.)
• Load the .doc file to Webcutter using “Browse” and then “Upload sequence file”.
- Notice that the “sequence” in the sequence box consists of nonsense characters.
• Clear input; Browse and then load the .txt file. Run an analysis.
Always keep your sequences in .txt files for downstream analysis.
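The same check can be made before uploading. The sketch below is a rough heuristic rather than a definitive validator: the function name and the allowed character set (letters plus "*" and "-") are assumptions. It relies on two known facts: a legacy .doc file begins with non-ASCII binary bytes, and an .rtf file contains markup characters such as "{" and "\".

```python
import string

# Characters assumed legal in a sequence line: letters, "*" (stop), "-" (gap).
ALLOWED = set(string.ascii_letters + "*-")

def is_plain_sequence_text(data):
    """Return True if raw file bytes look like plain-text FASTA/sequence data."""
    try:
        text = data.decode("ascii")
    except UnicodeDecodeError:
        return False                       # binary content, e.g. a .doc file
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue                       # blank lines and headers are fine
        if not set(line) <= ALLOWED:
            return False                   # markup or other stray characters
    return True

print(is_plain_sequence_text(b">gene_name\nMASASASK\n"))            # True
print(is_plain_sequence_text(b"{\\rtf1\\ansi MASASASK}"))           # False
print(is_plain_sequence_text(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"))  # False
```

In practice the bytes would come from `open(path, "rb").read()`, with `path` pointing at the saved file.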
![Page 15: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/15.jpg)
Representation of sequence
The need to include annotations and functional information with each sequence.
• Structured data entry
• GenBank
• EMBL / SwissProt
Observe: The differences in data structure among SwissProt, NCBI Protein, and NCBI Gene.
![Page 16: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/16.jpg)
Representation of sequence
The need to represent associated info with sequence
• Structured data entry
• Specialized databases
- 3-D Structure
- Mutation / Diseases
- Protein family / Protein domain
- Interaction
- Pathway …
![Page 17: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/17.jpg)
Representation of sequence
The need to represent associated info with sequence
• Structured data entry
• Specialized databases
• Complex / customized data structure
- Object-oriented data representation (Mount, p44-45)
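As a rough illustration of object-oriented representation, a sequence and its annotations can be bundled into one object. This is a minimal sketch with hypothetical class and field names; it is not taken from Mount.

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    kind: str    # e.g. "domain" or "signal peptide"
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive

@dataclass
class SequenceRecord:
    identifier: str
    description: str
    sequence: str
    features: list = field(default_factory=list)

    def feature_sequence(self, feature):
        """Return the part of the sequence covered by a feature."""
        return self.sequence[feature.start - 1:feature.end]

record = SequenceRecord("example_1", "hypothetical protein", "MASASASK")
record.features.append(Feature("domain", 2, 4))
print(record.feature_sequence(record.features[0]))   # ASA
```

The data and the operations that make sense on it travel together, so a new annotation type becomes another class or field rather than a new file format.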
![Page 18: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/18.jpg)
XML – Extensible Markup Language
Define highly structured data for sharing and exchange.
Observe:
1.) The differences between the XML format and the GenPept format.
2.) The differences among XML, TinySeqXML, and INSDXML.
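Because XML is highly structured, a program can pull out individual fields by name. The sketch below uses Python's standard `xml.etree.ElementTree` module on a made-up record; the element names (`records`, `record`, `description`, `sequence`) are hypothetical and are not the TinySeqXML or INSDXML schema.

```python
import xml.etree.ElementTree as ET

# A made-up record; real NCBI XML formats use different element names.
XML_TEXT = """<records>
  <record id="example_1">
    <description>hypothetical protein</description>
    <sequence>MASASASK</sequence>
  </record>
</records>"""

def parse_records(xml_text):
    """Return a list of dicts with the id, description, and sequence of each record."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": rec.get("id"),
            "description": rec.findtext("description"),
            "sequence": rec.findtext("sequence"),
        }
        for rec in root.findall("record")
    ]

for rec in parse_records(XML_TEXT):
    print(rec["id"], rec["description"], len(rec["sequence"]))
```

Compare this with a flat-file format such as GenPept, where a program has to rely on line prefixes and column positions to find the same fields.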
![Page 19: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/19.jpg)
![Page 20: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/20.jpg)
Bioinformatics / Computational biology
• Bioinformatics - Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.
• Computational Biology - The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.
(Working Definition of Bioinformatics and Computational Biology - July 17, 2000). NIH / BISTI
![Page 21: Application of Bioinformatics in Genetic Research Instructors: Dr. Henry Baker Dr. Luciano Brocchieri Dr. Michele Tennant Dr. Lei Zhou](https://reader031.vdocuments.net/reader031/viewer/2022020417/5697bf981a28abf838c91433/html5/thumbnails/21.jpg)
Genetic code
• Codon usage
• Special codes – mitochondrial genes
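Both points can be sketched in a few lines of Python. The standard table is built from the usual compact 64-letter string (bases ordered T, C, A, G); the vertebrate mitochondrial table differs from it at four codons (TGA = Trp, ATA = Met, AGA and AGG = stop). The function names and the test sequence are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code: codons in TCAG order paired with the string above.
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Vertebrate mitochondrial code: four codons are reassigned.
VERTEBRATE_MITO = dict(STANDARD, TGA="W", ATA="M", AGA="*", AGG="*")

def codons(dna):
    """Split a DNA string into complete codons, ignoring a trailing partial codon."""
    dna = dna.upper()
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]

def translate(dna, table=STANDARD):
    """Translate DNA in frame 1; '*' marks a stop codon."""
    return "".join(table[codon] for codon in codons(dna))

def codon_usage(dna):
    """Count how often each codon occurs in frame 1."""
    return Counter(codons(dna))

dna = "ATGTGAATAAGA"
print(translate(dna))                    # M*IR
print(translate(dna, VERTEBRATE_MITO))   # MWM*
print(codon_usage("CTGCTGTTA"))
```

Translating a mitochondrial gene with the standard table introduces false stop codons, which is why the choice of table matters in practice.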