![Page 1: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/1.jpg)
Genomics and Personalized Care
Lab Session
Leming Zhou, PhD
School of Health and Rehabilitation Sciences
Department of Health Information Management
![Page 2: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/2.jpg)
Outline
• Nucleotide, protein, genetic variation, gene and disease association databases
– NCBI
• GenBank; protein structure; dbSNP; OMIM
• Pairwise sequence alignment
• BLAST search
• UCSC genome browser
![Page 3: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/3.jpg)
NCBI
• Created as a part of the National Library of Medicine in 1988
– Establish public databases
– Perform research in computational biology
– Develop software tools for sequence analysis
– Disseminate biomedical information
• Databases
– Sequence, such as GenBank, RefSeq, dbSNP
– Literature, such as PubMed, OMIM
• Tools
– Entrez, BLAST, Cn3D, etc.
![Page 4: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/4.jpg)
NCBI Homepage
![Page 5: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/5.jpg)
GenBank
• Nucleotide-only sequence database
• GenBank data
– Direct submissions of individual records (BankIt, Sequin)
– Batch submissions via email (EST, GSS, STS)
– FTP accounts established for sequencing centers
• Data shared nightly among three collaborating databases:
– GenBank
– DNA Data Bank of Japan (DDBJ)
– European Molecular Biology Laboratory Database (EMBL)
![Page 6: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/6.jpg)
GenBank Record (Header)
LOCUS NM_001963 4913 bp mRNA linear PRI 20-SEP-2009
DEFINITION Homo sapiens epidermal growth factor (beta-urogastrone) (EGF), mRNA.
ACCESSION NM_001963
VERSION NM_001963.3 GI:166362727
KEYWORDS .
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 4913)
AUTHORS Hosgood,H.D. III, Menashe,I., He,X., Chanock,S. and Lan,Q.
TITLE PTEN identified as important risk factor of chronic obstructive pulmonary disease
JOURNAL Respir Med (2009) In press
PUBMED 19625176
REMARK GeneRIF: Observational study of gene-disease association.
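As a rough illustration of how the header can be read programmatically, the sketch below splits the LOCUS line of this record into named fields. The `Locus` class and `parse_locus` function are hypothetical names introduced here, and the parser assumes the whitespace-separated layout shown on the slide (name, length, "bp", molecule type, topology, division, date); other records may need a more tolerant parser.

```python
from dataclasses import dataclass


@dataclass
class Locus:
    """Fields of a GenBank LOCUS line (hypothetical container for this sketch)."""
    name: str
    length_bp: int
    molecule: str
    topology: str
    division: str
    date: str


def parse_locus(line: str) -> Locus:
    """Split a whitespace-separated LOCUS line into named fields."""
    fields = line.split()
    # Expected layout: LOCUS <name> <length> bp <molecule> <topology> <division> <date>
    if len(fields) != 8 or fields[0] != "LOCUS" or fields[3] != "bp":
        raise ValueError("not a LOCUS line in the expected layout")
    return Locus(
        name=fields[1],
        length_bp=int(fields[2]),
        molecule=fields[4],
        topology=fields[5],
        division=fields[6],
        date=fields[7],
    )


locus = parse_locus("LOCUS NM_001963 4913 bp mRNA linear PRI 20-SEP-2009")
print(locus)
```

Here the division code PRI marks a primate sequence, and the length matches the "bases 1 to 4913" range in the REFERENCE line.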
![Page 7: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/7.jpg)
GenBank Record (Features)
FEATURES Location/Qualifiers
source 1..4913 /organism="Homo sapiens" /mol_type="mRNA" /db_xref="taxon:9606" /chromosome="4" /map="4q25"
gene 1..4913 /gene="EGF" /gene_synonym="HOMG4; URG" /note="epidermal growth factor (beta-urogastrone)" /db_xref="GeneID:1950" /db_xref="HGNC:3229" /db_xref="HPRD:00578" /db_xref="MIM:131530"
exon 1..579 /gene="EGF" /gene_synonym="HOMG4; URG" /inference="alignment:Splign" /number=1
CDS 453..4076 /gene="EGF" /gene_synonym="HOMG4; URG" /note="beta-urogastrone" /codon_start=1 /product="epidermal growth factor precursor" /protein_id="NP_001954.2" /db_xref="GI:166362728" /db_xref="CCDS:CCDS3689.1" /db_xref="GeneID:1950" /db_xref="HGNC:3229" /db_xref="HPRD:00578" /db_xref="MIM:131530" /translation="MLLTLIILLPVVSKFSFVSLSAPQHWSCPEGTLAGNGNSTCVGP APFLIFSHGNSIFRIDTEGTNYEQLVVDAGVSVIMDFHYNEKRIYWVDLERQLLQRVF
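The CDS location can be checked with a little arithmetic. The sketch below, using hypothetical helper names, computes the coding length for the simple `start..end` form shown here. It assumes coordinates are 1-based and inclusive and that the final codon is a stop codon; compound locations such as `join(...)` or `complement(...)` are not handled.

```python
def parse_location(location: str) -> tuple[int, int]:
    """Parse a simple 'start..end' feature location (1-based, inclusive)."""
    start, end = location.split("..")
    return int(start), int(end)


def cds_summary(location: str) -> tuple[int, int, int]:
    """Return (bases, codons, amino acids) for a simple CDS location.

    Assumes the last codon is a stop codon, so the protein is one
    residue shorter than the codon count.
    """
    start, end = parse_location(location)
    n_bases = end - start + 1
    if n_bases % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    n_codons = n_bases // 3
    return n_bases, n_codons, n_codons - 1


bases, codons, residues = cds_summary("453..4076")
print(bases, codons, residues)
```

For this record the CDS spans 3624 bases, or 1208 codons, which under the stop-codon assumption corresponds to a 1207-residue precursor protein.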
![Page 8: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/8.jpg)
GenBank Record (Sequence)
ORIGIN
1 aaaaagagaa actgttggga gaggaatcgt atctccatat ttcttctttc agccccaatc
61 caagggttgt agctggaact ttccatcagt tcttcctttc tttttcctct ctaagccttt
121 gccttgctct gtcacagtga agtcagccag agcagggctg ttaaactctg tgaaatttgt
181 cataagggtg tcaggtattt cttactggct tccaaagaaa catagataaa gaaatctttc
241 ctgtggcttc ccttggcagg ctgcattcag aaggtctctc agttgaagaa agagcttgga
301 ggacaacagc acaacaggag agtaaaagat gccccagggc tgaggcctcc gctcaggcag
361 ccgcatctgg ggtcaatcat actcaccttg cccgggccat gctccagcaa aatcaagctg
421 ttttcttttg aaagttcaaa ctcatcaaga ttatgctgct cactcttatc attctgttgc
481 cagtagtttc aaaatttagt tttgttagtc tctcagcacc gcagcactgg agctgtcctg
541 aaggtactct cgcaggaaat gggaattcta cttgtgtggg tcctgcaccc ttcttaattt
601 tctcccatgg aaatagtatc tttaggattg acacagaagg aaccaattat gagcaattgg
661 tggtggatgc tggtgtctca gtgatcatgg attttcatta taatgagaaa agaatctatt
721 gggtggattt agaaagacaa cttttgcaaa gagtttttct gaatgggtca aggcaagaga
781 gagtatgtaa tatagagaaa aatgtttctg gaatggcaat aaattggata aatgaagaag
841 ttatttggtc aaatcaacag gaaggaatca ttacagtaac agatatgaaa ggaaataatt
901 cccacattct tttaagtgct ttaaaatatc ctgcaaatgt agcagttgat ccagtagaaa
961 ggtttatatt ttggtcttca gaggtggctg gaagccttta tagagcagat ctcgatggtg
![Page 9: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/9.jpg)
FASTA: Sequence Format
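A FASTA record is a header line beginning with `>` followed by one or more lines of sequence. As a minimal sketch, the function below converts the first two ORIGIN lines of the GenBank record above into FASTA. The function name, the header text, the uppercase output, and the 60-column line width are choices made for this illustration rather than requirements of the format.

```python
def origin_to_fasta(origin_lines, header, width=60):
    """Convert GenBank ORIGIN lines into a FASTA-formatted string."""
    blocks = []
    for line in origin_lines:
        parts = line.split()
        # The first token is the running base position; the rest are 10-base blocks.
        blocks.append("".join(parts[1:]))
    sequence = "".join(blocks).upper()
    wrapped = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([">" + header] + wrapped)


origin = [
    "1 aaaaagagaa actgttggga gaggaatcgt atctccatat ttcttctttc agccccaatc",
    "61 caagggttgt agctggaact ttccatcagt tcttcctttc tttttcctct ctaagccttt",
]
# The header text is illustrative; only the leading '>' is fixed by the format.
fasta = origin_to_fasta(origin, "NM_001963.3 Homo sapiens EGF mRNA, first 120 bases")
print(fasta)
```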
![Page 10: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/10.jpg)
![Page 11: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/11.jpg)
Protein Structure
![Page 12: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/12.jpg)
![Page 13: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/13.jpg)
Crystal Structure of a Protein
![Page 14: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/14.jpg)
Protein Structure Databases
• Proteins take on 3D structure
• 3D data for some proteins are available from techniques such as NMR and X-ray crystallography
– PDB http://www.pdb.org/
– SCOP http://scop.mrc-lmb.cam.ac.uk/scop
– MMDB http://www.ncbi.nlm.nih.gov/Structure/
![Page 15: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/15.jpg)
Genetic Variations
![Page 16: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/16.jpg)
Polymorphisms
• Genomic sequences from two unrelated individuals are 99.9% identical.
• The 0.1% difference is due to genetic variations, most of which (~90%) are single-base variations called Single Nucleotide Polymorphisms (SNPs).
![Page 17: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/17.jpg)
Importance of Genetic Variations
• Genetic variations underlie phenotypic differences among individuals
• Genetic variations determine our predisposition to diseases and our responses to drugs, therapies, and environmental insults such as bacteria, viruses, and chemicals
• Genetic variations reveal clues to ancestral human migration history
![Page 18: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/18.jpg)
Major Types of Genetic Variations
• Single nucleotide mutation
– Majority of SNPs do NOT directly contribute to any phenotypes
• Insertion or deletion of one or more nucleotides
– Tandem repeat polymorphisms (genomic regions consisting of sequence motifs, usually 1-100 bases long, repeating in tandem with variable copy number)
• Used as genetic markers for DNA fingerprinting (forensic, parentage testing)
• Many cause genetic diseases
– Insertion/deletion polymorphisms (often resulting from localized rearrangements between homologous tandem repeats)
• Gross chromosomal aberration
– Deletions, inversions, or translocations of large DNA fragments
– Often causing serious genetic diseases
![Page 19: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/19.jpg)
The Effect of SNPs
• The phenotypic consequence of a SNP depends strongly on the location where it occurs (gene or non-gene), as well as on the nature of the mutation (synonymous or non-synonymous)
– No consequence
– Affect gene transcription quantitatively or qualitatively
– Affect gene translation quantitatively or qualitatively
– Change protein structure and functions
– Change gene regulation at different steps
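To make the synonymous versus non-synonymous distinction concrete, here is a minimal sketch that classifies a single-base change within one codon using the standard genetic code. The function name is hypothetical, the codon is given in DNA letters, and positions are 0-based within the codon. The example codon CTG (leucine) is the second codon of the EGF coding sequence shown earlier.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(codon, position, alt_base):
    """Classify a single-base substitution inside one codon.

    Returns (label, reference amino acid, altered amino acid), where
    '*' denotes a stop codon.
    """
    codon = codon.upper()
    mutated = codon[:position] + alt_base.upper() + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutated]
    label = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return label, ref_aa, alt_aa


# Third-position change CTG -> CTA keeps leucine.
print(classify_coding_snp("CTG", 2, "A"))
# Second-position change CTG -> CCG replaces leucine with proline.
print(classify_coding_snp("CTG", 1, "C"))
```

A synonymous change can still matter through effects on transcription, translation, or regulation, as the slide notes; this sketch only looks at the encoded amino acid.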
![Page 20: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/20.jpg)
Simple/Complex Genetic Diseases and SNPs
• Simple genetic diseases (Mendelian diseases) are often caused by mutations in a single gene
– e.g. Huntington’s, cystic fibrosis, etc.
• Many complex diseases result from mutations in multiple genes, the interactions among those genes, and their interactions with environmental factors
– e.g. cancers, heart disease, Alzheimer's, diabetes, asthma, obesity, etc.
![Page 21: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/21.jpg)
Genetic Variations Databases
• dbSNP
– http://www.ncbi.nlm.nih.gov/SNP/
• Online Mendelian Inheritance in Man (OMIM)
– http://www.ncbi.nlm.nih.gov/omim
• International HapMap Project
– http://www.hapmap.org/
• Genome Variation Server (Seattle SNPs)
– http://gvs.gs.washington.edu/GVS/
![Page 22: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/22.jpg)
dbSNP
• The Single Nucleotide Polymorphism database (dbSNP) is a public-domain archive for a broad collection of simple genetic variations
• This collection of polymorphisms includes:
– Single-base nucleotide substitutions (single nucleotide polymorphisms, SNPs)
• Roughly 10 million in the human population, or on average 1 per 300 bp
• Fewer than half of these SNPs have been identified and stored in the database
– Microsatellite repeat variations (short tandem repeats, STRs)
• In silico estimates put the number of potentially polymorphic variable number tandem repeats (VNTRs) at over 100,000 across the human genome
– Small-scale multi-base deletions or insertions
• Short insertions/deletions are difficult to quantify; their number is likely to fall between those of SNPs and VNTRs
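The figures on these slides can be sanity-checked with a short calculation. The sketch below assumes a haploid human genome of roughly 3 billion bases, a round number that is not stated on the slides; the variable names are illustrative.

```python
# Assumption: about 3 billion bases in the haploid human genome (round figure).
GENOME_SIZE_BP = 3_000_000_000

# One SNP per 300 bp on average, as stated for dbSNP.
expected_snps = GENOME_SIZE_BP // 300

# 0.1% of positions differ between two unrelated individuals (1 in 1000).
differing_positions = GENOME_SIZE_BP // 1000

print(f"Expected SNPs in the population: {expected_snps:,}")
print(f"Differences between two individuals: {differing_positions:,}")
```

Under that assumption, one SNP per 300 bp gives 10 million SNPs, which agrees with the "roughly 10 million" figure, and a 0.1% difference corresponds to about 3 million differing positions between two individuals.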
![Page 23: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/23.jpg)
A dbSNP Record
>gnl|dbSNP|ss5586300|allelePos=214|len=475|taxid=9606|alleles='A/G'|mol=Genomic
ATAAACATGG ACTTTTACAA AACCCATATC GTATACCACC ACTTTTTCCCATCAAGTCAT YTGTTAAAAC TAAATGTAAG AAAAATCTGC TAGAGGAAAACTTTGAGGAA CATTCAATRT CACCTGAAAG AGAAATGGGA AATGAGAACATTCCAAGTAC AGTGAGCACA ATTAGCCGTA ATAACATTAG AGAAAATGTT TTTAAAGRAG CCA R CTCAAGCAAT ATTAATGAAG TAGGTTCCAG TACTAATGAA GTGGGCTCCAGTATTAATGA AATAGGTTCC AGTGATGAAA ACATTCAAGC AGAACTAGGT AGAAACAGAG GGCCAAAATT GAATGCTATG CTTAGATTAG GGGTTTTGCA ACCTGAGGTC TATAAACAAA GTCTTCCTGG AAGTAATTGT AAGCATCCTGAAATAAAAAA GCAAGAATAT GAAGAAGTAG TTCAGACTGT TAATACAGAT TTCTCTCCAT A
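The header of this record packs several fields between `|` separators. As a minimal sketch, the hypothetical `parse_dbsnp_header` function below turns the header into a dictionary, and a small table of IUPAC ambiguity codes shows how two-allele variants are written as single letters inside the flanking sequence. The parser assumes exactly the layout shown in this record.

```python
# IUPAC ambiguity codes for two-base alternatives.
IUPAC_TWO_BASE = {
    "R": "A/G", "Y": "C/T", "S": "C/G",
    "W": "A/T", "K": "G/T", "M": "A/C",
}


def parse_dbsnp_header(header):
    """Parse a dbSNP FASTA header of the form shown on the slide."""
    parts = header.lstrip(">").split("|")
    record = {"database": parts[1], "id": parts[2]}
    for part in parts[3:]:
        if "=" in part:
            key, value = part.split("=", 1)
            record[key] = value.strip("'")
    # Numeric fields in this layout.
    for key in ("allelePos", "len", "taxid"):
        record[key] = int(record[key])
    return record


header = (">gnl|dbSNP|ss5586300|allelePos=214|len=475|"
          "taxid=9606|alleles='A/G'|mol=Genomic")
snp = parse_dbsnp_header(header)
print(snp)
print("IUPAC code for these alleles:",
      [code for code, alleles in IUPAC_TWO_BASE.items() if alleles == snp["alleles"]])
```

The alleles `A/G` correspond to the IUPAC code R, which is consistent with the stand-alone `R` in the sequence above. The other `Y` and `R` letters in the flanking sequence mark further ambiguous positions.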
![Page 24: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/24.jpg)
Different Ways to Search SNPs in dbSNP
• dbSNP web site
– Direct search of SS records; batch search; allows SNP record submission; no search limits
• Entrez SNP
– http://www.ncbi.nlm.nih.gov/sites/entrez?db=Snp
– Search limit options allow precise retrieval
• Entrez Gene record’s SNP Links Out feature
– Direct links to corresponding SNP records; access to genotype and linkage disequilibrium data
• NCBI’s MapViewer
– Visualize SNPs in the genomic context along with other types of genetic data
![Page 25: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/25.jpg)
Search SNPs from dbSNP Web Page
• http://www.ncbi.nlm.nih.gov/SNP/index.html
![Page 26: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/26.jpg)
Search SNPs from Entrez SNP Web Page
• http://www.ncbi.nlm.nih.gov/sites/entrez?db=Snp
• dbSNP is part of the Entrez integrated information retrieval system and may be searched using either qualifiers or a combination of search limits from 14 different categories
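Entrez databases can also be queried programmatically through NCBI's E-utilities, which the slides do not cover. The sketch below only builds an ESearch URL for the SNP database and makes no network request. The function name and the field qualifiers in the example term (`[Gene Name]`, `[Organism]`) are assumptions for illustration and should be checked against the search fields currently documented for dbSNP.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_snp_search_url(term, retmax=20):
    """Build an ESearch URL for the SNP database (no request is sent)."""
    params = {"db": "snp", "term": term, "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)


# Assumed qualifiers: restrict by gene name and organism.
url = build_snp_search_url("EGF[Gene Name] AND Homo sapiens[Organism]")
print(url)
```

Combining qualified terms with AND plays the same role as combining search limits on the web page.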
![Page 27: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/27.jpg)
Gene and Disease
![Page 28: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/28.jpg)
Disease Causing Genes
Disease-centric databases:
• OMIM: http://www.ncbi.nlm.nih.gov/omim/
• CDC HuGE Navigator: http://hugenavigator.net/
• HGMD: https://portal.biobase-international.com/hgmd/pro/start.php
• A Catalog of Published Genome-Wide Association Studies: http://www.genome.gov/26525384
![Page 29: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/29.jpg)
Online Mendelian Inheritance in Man (OMIM)
• http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
• OMIM is a human genetic disorders database built and curated using results from published studies
• Each OMIM record provides a summary of the current state of knowledge of the genetic basis of a disorder, which contains the following information:
– description and clinical features of a disorder or a gene involved in genetic disorders;
– biochemical and other features;
– cytogenetics and mapping;
– molecular and population genetics;
– diagnosis and clinical management;
– animal models for the disorder;
– allelic variants.
• OMIM is searchable via NCBI Entrez, and its records are cross-linked to other NCBI resources.
![Page 30: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/30.jpg)
OMIM: Variant
• The OMIM database includes genetic disorders caused by various mutations/variations, from SNPs to large-scale chromosomal abnormalities
• Variants are represented by a 10-digit OMIM number, and can be searched in two ways
– Search for a gene or a disease; when it is retrieved, view its variants
![Page 31: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/31.jpg)
Variants in OMIM Records
• For most genes, only selected mutations are included
– Criteria for inclusion include: the first mutation to be discovered, high population frequency, distinctive phenotype, historic significance, unusual mechanism of mutation, unusual pathogenetic mechanism, and distinctive inheritance.
• Most of the variants represent disease-producing mutations, NOT polymorphisms.
• A few polymorphisms are included, many of which show a positive statistical correlation with particular common disorders.
• Few neutral polymorphisms are included in OMIM
• Some SNPs in the dbSNP records are not linked to the corresponding OMIM records.
![Page 32: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/32.jpg)
Similarity Search
• Find statistically significant matches to a protein or DNA sequence of interest.
• Obtain information on inferred function of the gene
• Sequence identity/similarity is a quantitative measurement of the number of nucleotides/amino acids which are identical/similar in two aligned sequences
– Calculated from a sequence alignment
– Can be expressed as a percentage
– In proteins, some residues are chemically similar but not identical
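As a minimal sketch of the identity calculation, the function below counts identical columns in two already-aligned sequences and expresses the result as a percentage. Dividing by the full alignment length, gaps included, is an assumption made here; other conventions, such as dividing by the shorter sequence length, give different values. The two toy sequences are invented for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    # Assumption: the denominator is the alignment length, gaps included.
    return 100.0 * identical / len(aligned_a)


# Toy alignment with one mismatch in seven columns.
identity = percent_identity("GATTACA", "GACTACA")
print(f"{identity:.1f}% identity")
```

Similarity for proteins would additionally need a table of which residues count as chemically similar, which this sketch does not attempt.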
![Page 33: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/33.jpg)
Sequence Alignment
• A linear, one-to-one correspondence between some of the symbols in one sequence and some of the symbols in another sequence
– Four possible outcomes in aligning two sequences
• Identity; mismatch; gap in one sequence; gap in the other sequence
• May be DNA or protein sequences.
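The four outcomes can be tallied column by column. The sketch below is illustrative only: the function name and the toy alignment are invented here, and `-` is assumed to be the gap symbol.

```python
from collections import Counter


def column_outcomes(first, second, gap="-"):
    """Count the four possible outcomes over the columns of an alignment."""
    if len(first) != len(second):
        raise ValueError("aligned sequences must have the same length")
    counts = Counter()
    for x, y in zip(first, second):
        if x == gap and y == gap:
            raise ValueError("a column cannot be a gap in both sequences")
        if x == gap:
            counts["gap in first sequence"] += 1
        elif y == gap:
            counts["gap in second sequence"] += 1
        elif x == y:
            counts["identity"] += 1
        else:
            counts["mismatch"] += 1
    return counts


# Toy alignment: 5 identities, 1 mismatch, and one gap in each sequence.
outcomes = column_outcomes("GATTACA-", "GAC-ACAT")
print(dict(outcomes))
```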
![Page 34: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/34.jpg)
Alignment Algorithms
• Sequences often contain highly conserved regions
• These regions can be used for an initial alignment
![Page 35: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/35.jpg)
Alignments
• Two sequences
Seq 1: ACGGACT
Seq 2: ATCGGATCT
• There may be multiple ways of creating the alignment. Which alignment is the best?

A - C - G G - A C T
|   |   |       | |
A T C G G A T - C T

A T C G G A T C T
|   | | |     | |
A - C G G - A C T
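One way to answer "which alignment is the best?" is to pick a scoring scheme and compare scores. The sketch below scores the two alignments shown on this slide and then computes the best achievable global score with the Needleman-Wunsch dynamic programming recurrence. The scheme of +1 for a match and -1 for a mismatch or a gap is an assumption for illustration, not taken from the slides; a different scheme can change the ranking.

```python
# Assumed scoring scheme (not from the slides).
MATCH, MISMATCH, GAP = 1, -1, -1


def score_alignment(top, bottom, gap="-"):
    """Sum column scores for two aligned rows of equal length."""
    total = 0
    for x, y in zip(top, bottom):
        if x == gap or y == gap:
            total += GAP
        elif x == y:
            total += MATCH
        else:
            total += MISMATCH
    return total


def best_global_score(s, t):
    """Needleman-Wunsch optimal global score (no traceback)."""
    previous = [j * GAP for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        current = [i * GAP]
        for j in range(1, len(t) + 1):
            diagonal = previous[j - 1] + (MATCH if s[i - 1] == t[j - 1] else MISMATCH)
            current.append(max(diagonal, previous[j] + GAP, current[j - 1] + GAP))
        previous = current
    return previous[-1]


first = score_alignment("A-C-GG-ACT", "ATCGGAT-CT")
second = score_alignment("ATCGGATCT", "A-CGG-ACT")
optimal = best_global_score("ACGGACT", "ATCGGATCT")
print(first, second, optimal)
```

Under this scheme the first alignment scores 0 (5 matches, 1 mismatch, 4 gaps) and the second scores 3 (6 matches, 1 mismatch, 2 gaps), so the second is preferred. The dynamic programming result is 5, reached for instance by aligning A-CGGA-CT against ATCGGATCT, so an algorithm can find a better alignment than either of the two drawn by hand.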
![Page 36: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/36.jpg)
BLAST
• BLAST (Basic Local Alignment Search Tool): a sequence comparison algorithm, optimized for speed, used to search sequence databases for optimal local alignments to a query
• Most widely used and referenced computational biology resource
• The central idea of the BLAST algorithm is to confine attention to segment pairs that contain a word pair of length W with a score of at least T when compared to the query using a substitution matrix
• Word hits are then extended in both directions to generate an alignment with score exceeding a given threshold S
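The seed-and-extend idea can be sketched in a few lines. This is a deliberately simplified illustration, not the BLAST implementation: it uses exact word matches of length W instead of neighborhood words scoring at least T under a substitution matrix, extends without gaps, and stops extending when the running score falls more than `x_drop` below the best seen. All function names and the parameter values W = 3, S = 4, match = +1, mismatch = -2 are assumptions for the example.

```python
def find_word_hits(query, subject, w):
    """Return (query_pos, subject_pos) pairs where a length-w word matches exactly."""
    index = {}
    for i in range(len(query) - w + 1):
        index.setdefault(query[i:i + w], []).append(i)
    hits = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            hits.append((i, j))
    return hits


def _extend(query, subject, qi, sj, step, match, mismatch, x_drop):
    """Walk in one direction; return (best score gain, length at that best)."""
    gain = best = 0
    length = best_length = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        gain += match if query[qi] == subject[sj] else mismatch
        length += 1
        if gain > best:
            best, best_length = gain, length
        elif best - gain > x_drop:
            break
        qi += step
        sj += step
    return best, best_length


def extend_hit(query, subject, i, j, w, match=1, mismatch=-2, x_drop=3):
    """Ungapped extension of a word hit; returns (score, q_start, q_end, s_start)."""
    right_gain, right_len = _extend(query, subject, i + w, j + w, 1, match, mismatch, x_drop)
    left_gain, left_len = _extend(query, subject, i - 1, j - 1, -1, match, mismatch, x_drop)
    score = w * match + left_gain + right_gain
    return score, i - left_len, i + w + right_len, j - left_len


query, subject = "ACGGACT", "ATCGGATCT"
W, S = 3, 4  # assumed word length and score threshold
hits = find_word_hits(query, subject, W)
segments = {extend_hit(query, subject, i, j, W) for i, j in hits}
reported = {seg for seg in segments if seg[0] >= S}
for score, q_start, q_end, s_start in sorted(reported):
    print(score, query[q_start:q_end], "at query", q_start, "subject", s_start)
```

With these inputs both word hits (CGG and GGA) extend to the same segment CGGA, which illustrates why overlapping hits are merged before reporting.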
![Page 37: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/37.jpg)
Four Steps of a BLAST search
• Enter query sequence
• Select one BLAST program
• Choose the database to search
• Set optional parameters
![Page 38: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/38.jpg)
Enter Query Sequence
• Sequence can be pasted into a text field in FASTA format or as an accession number
• Sequence can also be uploaded as a file (FASTA format)
• Users may indicate a sequence range of the query sequence instead of using the whole query sequence
• Job title will be automatically generated from the sequence header
![Page 39: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/39.jpg)
Select one BLAST Program
• BLAST Programs:
– BLASTN: DNA query sequence against a DNA database
– BLASTP: protein query sequence against a protein database
– BLASTX: DNA query sequence, translated into all six reading frames, against a protein database
– TBLASTN: protein query sequence against a DNA database, translated into all six reading frames
– TBLASTX: DNA query sequence, translated into all six reading frames, against a DNA database, translated into all six reading frames
• Choose the right one according to the purpose of the search
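The choice among the five programs follows from the query type, the database type, and whether a DNA-to-DNA search should be compared at the protein level. The sketch below encodes that decision; the function name and the `compare_as_protein` flag are hypothetical and exist only for this illustration.

```python
def choose_blast_program(query, database, compare_as_protein=False):
    """Pick a BLAST program from sequence types ('dna' or 'protein')."""
    if query == "protein" and database == "protein":
        return "blastp"
    if query == "dna" and database == "protein":
        return "blastx"   # query translated in all six reading frames
    if query == "protein" and database == "dna":
        return "tblastn"  # database translated in all six reading frames
    if query == "dna" and database == "dna":
        # Both sides are translated only when a protein-level comparison is wanted.
        return "tblastx" if compare_as_protein else "blastn"
    raise ValueError("query and database must each be 'dna' or 'protein'")


print(choose_blast_program("dna", "dna"))
print(choose_blast_program("dna", "protein"))
print(choose_blast_program("dna", "dna", compare_as_protein=True))
```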
![Page 40: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/40.jpg)
![Page 41: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/41.jpg)
Choose the Database to Search
• BLASTN
![Page 42: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/42.jpg)
Optional Parameters
• Specify the organism to search or exclude
– Common name, taxonomy id, …
• Exclude certain sequences
– Exclude predicted sequences or sequences from metagenomics
• Use Entrez query to select a subset of the BLAST database
![Page 43: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/43.jpg)
BLASTN Output (header)
![Page 44: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/44.jpg)
BLASTN Output (Graphic Summary)
matches to itself
probable homologs
distantly related homologs
distant homolog with shared domain or motif
![Page 45: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/45.jpg)
BLASTN Output (Descriptions)
![Page 46: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/46.jpg)
BLASTN Output (Sequence Alignments)
![Page 47: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/47.jpg)
Genome Browser
• A genome browser is a computer program that helps to display gene maps, browse the chromosomes, align genes or gene models with ESTs or contigs, etc.
• Big Three:
– UCSC Genome Browser
– NCBI MapViewer
– Ensembl
![Page 48: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/48.jpg)
UCSC Genome Browser
• http://genome.ucsc.edu
![Page 49: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/49.jpg)
Organization of Genomic Data
Genome backbone: base position number, sequence
Annotation Tracks
chromosome band
known genes
predicted genes
evolutionary conservation
SNPs
STS sites
microarray/expression data
repeated regions
more…
Links out to more data
![Page 50: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/50.jpg)
UCSC Genome Browser
![Page 51: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/51.jpg)
UCSC Genome Browser
Annotation Tracks
sequence
STS sites
Known gene
SNP
Evolutionary conservation
Repeated regions
Expression
![Page 52: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/52.jpg)
A Sample of the UCSC Genome Browser
gene details
Annotation Tracks
sequence
comparisons
SNPs
![Page 53: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/53.jpg)
Genome Browser Gateway
• Use this Gateway to search by:
– Gene names, symbols, IDs
– Chromosome number: chr7, or region: chr11:1038475-1075482
– Keywords: kinase, receptor
• See lower part of page for help with format
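Position strings such as the ones above are easy to take apart in code. The sketch below parses either a whole chromosome (`chr7`) or a region (`chr11:1038475-1075482`) and computes the region length, assuming the coordinates are 1-based and inclusive. The function names are hypothetical, and gene names or keywords are outside what this parser handles.

```python
import re

# chrom alone, or chrom:start-end (commas in numbers are tolerated).
POSITION = re.compile(r"^(chr\w+)(?::([\d,]+)-([\d,]+))?$")


def parse_position(text):
    """Return (chromosome, start, end); start and end are None for a whole chromosome."""
    match = POSITION.match(text.strip())
    if not match:
        raise ValueError(f"not a chromosome or region: {text!r}")
    chrom, start, end = match.groups()
    if start is None:
        return chrom, None, None
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))


def region_length(start, end):
    """Length of a region, assuming 1-based inclusive coordinates."""
    return end - start + 1


chrom, start, end = parse_position("chr11:1038475-1075482")
print(chrom, start, end, region_length(start, end))
print(parse_position("chr7"))
```

Under the inclusive-coordinate assumption, the example region on chr11 spans 37,008 bases.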
![Page 54: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/54.jpg)
Genome Browser Gateway
Helpful search examples
samples provided
text/ID searches
![Page 55: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/55.jpg)
The Genome Browser Gateway
Make your Gateway choices:
1. Select Clade
2. Select genome = species: search 1 species at a time
3. Assembly: the official backbone DNA sequence
4. Position: location in the genome to examine
5. Image width: how many pixels in display window; 5000 max
6. Configure: make fonts bigger + other choices
![Page 56: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/56.jpg)
Different Species, Different Tracks
• Species may have different data tracks
• Layout, software, functions are the same
![Page 57: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/57.jpg)
Sample Genome Viewer Image, TP53
base position
UCSC genes
RefSeq genes
mRNAs & ESTs
repeats
many species compared
SNPs
single species compared
MGC clones
![Page 58: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/58.jpg)
Visual Cues on the Genome Browser
Track colors may have meaning—for example, UCSC Gene track:
• If there is a corresponding PDB entry = black
• If there is a corresponding reviewed/validated seq = dark blue
• If there is a non-RefSeq seq = lightest blue
Tick marks: a single location (STS, SNP)
For some tracks, the height of a bar indicates an increased likelihood of an evolutionary relationship (conservation track)
Intron and direction of transcription: <<< or >>>
[Figure: gene diagram with labels exon, 5' UTR, 3' UTR and < < < direction arrows]
Alignment indications (Conservation pairs: “chain” or “net” style)
• Alignments = boxes, Gaps = lines
![Page 59: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/59.jpg)
Options for Changing Images: Upper Section
• Change your view or location with controls at the top
• Use “base” to get right down to the nucleotides
• Configure: to change font, window size, more…
– Next item, next exon navigation assistance can be turned on
Specify a position
Fonts, window, next item, more
Walk left or right
Zoom in
Zoom out
Click to zoom 3x and re-center
![Page 60: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/60.jpg)
Annotation Track Display Options
• Some data is ON or OFF by default
• Menu links to info about the tracks: content, methods
• You change the view with pulldown menus
• After making changes, REFRESH to enforce the change
Enforce changes
Change track view
Links to info and/or filters
![Page 61: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/61.jpg)
Annotation Track Options Defined
• Hide: removes a track from view
• Dense: all items collapsed into a single line
• Squish: each item = separate line, but 50% height
• Pack: each item separate, but efficiently stacked (full height)
• Full: each item on separate line
![Page 62: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/62.jpg)
Mid-page Options to Change Settings
• You control the views
• Use pulldown menus
• Configure options page
Reset, back to defaults
Start from scratch
Enforce any changes (hide, full, squish…)
Flip display to Genomic 3’5’
![Page 63: Genomics and Personalized Care Lab Session Leming Zhou, PhD School of Health and Rehabilitation Sciences Department of Health Information Management](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649eda5503460f94be9581/html5/thumbnails/63.jpg)
Base Level and Protein Sequences