Sequence Based Analysis Tutorial
NIH Proteomics Workshop
Cecilia Arighi, Ph.D.
Protein Information Resource at Georgetown University Medical Center
Retrieval, Sequence Search & Classification Methods
• Retrieve protein info by text / UID
• Sequence Similarity Search: BLAST, FASTA, Dynamic Programming
• Family Classification: Patterns, Profiles, Hidden Markov Models, Sequence Alignments, Neural Networks
• Integrated Search and Classification System
Sequence Similarity Search (I)
Based on Pair-Wise Comparisons
• Dynamic Programming Algorithms
  • Global Similarity: Needleman-Wunsch
  • Local Similarity: Smith-Waterman
• Heuristic Algorithms
  • FASTA: Based on K-Tuples (2-Amino Acid)
  • BLAST: Triples of Conserved Amino Acids
  • Gapped-BLAST: Allow Gaps in Segment Pairs
  • PHI-BLAST: Pattern-Hit Initiated Search
  • PSI-BLAST: Position-Specific Iterated Search
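The two dynamic programming approaches named on this slide can be sketched in a few lines. This is a minimal illustration, not the workshop's implementation: the function names and the flat match/mismatch/gap scores are assumptions chosen for brevity, whereas real searches use a substitution matrix and affine gap penalties.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch: best score for aligning all of a with all of b."""
    # Row 0: aligning a prefix of b against nothing costs one gap per residue.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman: best score over all pairs of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere (local similarity).
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best
```

The only structural difference between the two is the clamp at zero and the use of the best cell anywhere in the table, which is what makes Smith-Waterman local.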
Sequence Similarity Search (II)
• Similarity Search Parameters
  • Scoring Matrices – Based on Conserved Amino Acid Substitution
    • Dayhoff Mutation Matrix, e.g., PAM250 (~20% Identity)
    • Henikoff Matrix from Ungapped Alignments, e.g., BLOSUM 62
  • Gap Penalty
• Search Time Comparisons
  • Smith-Waterman: 10 Min
  • FASTA: 2 Min
  • BLAST: 20 Sec
Feature Representation
• Features of Amino Acids: Physicochemical Properties, Context (Local & Global) Features, Evolutionary Features
• Alternative Amino Acids: Classification of Amino Acids To Capture Different Features of Amino Acid Residues
Substitution Matrix
• Likelihood of One Amino Acid Mutated into Another Over Evolutionary Time
• Negative Score: Unlikely to Happen (e.g., Gly/Trp, -7)
• Positive Score: Conservative Substitution (e.g., Lys/Arg, +3)
• High Score for Identical Matches: Rare Amino Acids (e.g., Trp, Cys)
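To make the scoring idea concrete, the sketch below scores an ungapped pair of sequences with a tiny lookup table. Only the Gly/Trp (-7) and Lys/Arg (+3) entries come from the slide; the identity scores and the names `TOY_SCORES`, `pair_score` and `ungapped_score` are illustrative assumptions, not a published matrix.

```python
# Toy substitution table keyed by residue pairs (one-letter codes).
TOY_SCORES = {
    ("G", "W"): -7,  # from the slide: unlikely substitution
    ("K", "R"): 3,   # from the slide: conservative substitution
    # Illustrative identity scores (assumed): rare residues score highest.
    ("W", "W"): 17,
    ("C", "C"): 12,
    ("K", "K"): 5,
    ("R", "R"): 6,
    ("G", "G"): 5,
}


def pair_score(x, y, default=0):
    """Look up a symmetric score; pairs missing from the toy table get `default`."""
    return TOY_SCORES.get((x, y), TOY_SCORES.get((y, x), default))


def ungapped_score(a, b):
    """Sum the pair scores of two equal-length sequences, position by position."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(x, y) for x, y in zip(a, b))
```

For example, `ungapped_score("KWG", "RWW")` adds a conservative substitution, an identical rare residue and an unlikely substitution.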
Secondary Structure Features
• Helix: Patterns of Hydrophobic Residue Conservation Showing an I, I+3, I+4, I+7 Pattern Are Highly Indicative of an α-Helix (Amphipathic)
• Strands That Are Half Buried in the Protein Core Will Tend to Have Hydrophobic Residues at Positions I, I+2, I+4, I+6
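The two periodicities can be checked mechanically on a single sequence. This is a rough sketch only: the hydrophobic residue set and the function name `pattern_starts` are assumptions, and the slide refers to conservation across aligned sequences, which this single-sequence version does not model.

```python
# Assumed hydrophobic set (one-letter codes); other definitions exist.
HYDROPHOBIC = set("AILMFVWY")

HELIX_OFFSETS = (0, 3, 4, 7)   # I, I+3, I+4, I+7
STRAND_OFFSETS = (0, 2, 4, 6)  # I, I+2, I+4, I+6


def pattern_starts(seq, offsets):
    """Return every start index I where all offset positions are hydrophobic."""
    span = max(offsets)
    return [
        i
        for i in range(len(seq) - span)
        if all(seq[i + o] in HYDROPHOBIC for o in offsets)
    ]
```

A hit is only a hint that a segment could be amphipathic; it is not a secondary structure prediction on its own.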
BLAST
BLAST (Basic Local Alignment Search Tool)
• Extremely fast
• Robust
• Most frequently used
It finds very short segment pairs (“seeds”) between the query and the database sequence
These seeds are then extended in both directions until the maximum possible score for extensions of this particular seed is reached
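The seed-and-extend idea can be illustrated with a deliberately simplified sketch. It assumes exact word matches of length 3 as seeds, flat match/mismatch scores, and a hypothetical drop-off threshold `xdrop`; BLAST itself scores words with a substitution matrix, so this is not the program's actual algorithm.

```python
def find_seeds(query, subject, w=3):
    """Return (query_pos, subject_pos) for every exact shared word of length w."""
    index = {}
    for i in range(len(query) - w + 1):
        index.setdefault(query[i:i + w], []).append(i)
    return [
        (qi, sj)
        for sj in range(len(subject) - w + 1)
        for qi in index.get(subject[sj:sj + w], [])
    ]


def extend(query, subject, qi, sj, w=3, match=1, mismatch=-1, xdrop=2):
    """Extend a seed without gaps; return (best_score, q_start, q_end_exclusive)."""
    score = best = w * match
    best_right = k = 0
    # Extend to the right until the score falls xdrop below the best seen.
    while qi + w + k < len(query) and sj + w + k < len(subject):
        score += match if query[qi + w + k] == subject[sj + w + k] else mismatch
        k += 1
        if score > best:
            best, best_right = score, k
        elif best - score >= xdrop:
            break
    score = best
    best_left = k = 0
    # Extend to the left from the best right-hand extension.
    while qi - 1 - k >= 0 and sj - 1 - k >= 0:
        score += match if query[qi - 1 - k] == subject[sj - 1 - k] else mismatch
        k += 1
        if score > best:
            best, best_left = score, k
        elif best - score >= xdrop:
            break
    return best, qi - best_left, qi + w + best_right
```

With `query = "MKTAYIAKQR"` and `subject = "GGTAYIAKQW"`, the first seed is the shared word `TAY`, and extension grows it to the segment `TAYIAKQ`.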
BLAST Search
From BLAST Search Interface
Table-Format Result with BLAST Output and SSEARCH (Smith-Waterman) Pair-Wise Alignment
Link to NCBI taxonomy
Click to see alignment
Links to iProClass and UniProtKB reports
Link to PIRSF report
Click to see SSearch alignment
Blast Result & Pairwise Alignment
BLAST Alignment
Classification
• What is classification?
• Why do we need protein classification?
• Different levels of classification
• Basis for functional protein classification
• How to classify a protein of unknown function?
Classification Databases
• Protein motif: C - x(2,4) - C - x(3) - [LIVMFYWC] - x(8) - H - x(3,5) - H (the 2 C's and the 2 H's are zinc ligands)
• Protein domain: Group proteins according to the presence of a common domain
• 3-D structure: Group proteins according to common 3D structure
• Whole-protein: Group proteins according to common domain architecture and length
Family Classification Methods
Based on Other Classification Information
Multiple Sequence Alignment (ClustalW)
ProSite Pattern Search
Profile Search
Hidden Markov Models (HMMs): Domain (Pfam); Whole protein (PIRSF)
Neural Networks
How do you build a tree?
• Pick sequences to align
• Align them
• Verify the alignment
• Keep the parts that are aligned correctly
• Build and evaluate a phylogenetic tree
Integrated Analysis
Multiple Sequence Alignment: CLUSTALW
• Pairwise alignment: Calculate distance matrix
  • Mean number of differences per residue
• Unrooted Neighbor-Joining Tree
  • Branch length drawn to scale
• Rooted NJ Tree (guide tree)
  • Root placed at a position where the means of the branch lengths on either side of the root are equal
• Progressive Alignment guided by the tree
  • Alignment starts from the tips of the tree towards the root
Thompson et al., NAR 22, 4675 (1994).
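The first CLUSTALW step, a distance matrix of mean differences per residue, can be sketched as below. As a simplification, this assumes the sequences are already aligned to equal length with `-` for gaps, and it ignores gapped positions; the function names are hypothetical and this is not CLUSTALW's own code.

```python
def p_distance(a, b):
    """Fraction of differing residues over positions where neither has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(aligned):
    """Return {(name1, name2): distance} for every ordered pair of sequences."""
    names = list(aligned)
    return {
        (p, q): p_distance(aligned[p], aligned[q])
        for p in names
        for q in names
    }
```

A matrix like this is the input that a neighbor-joining procedure would use to build the guide tree.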
PIR Multiple Alignment and Tree
From Text/Sequence Search Result or CLUSTAL W Alignment Interface
Signature Patterns for Functional Motifs
PIR Pattern Search
From Text/Sequence Search Result or Pattern Search Interface
A. Create pattern and search in database:
Alignment of a region involved in catalytic activity
P-[IV]-[WY]-x(3)-H-[MR]-V-x(3,4)-Q-x(1,2)-D-x(4,5)-G-A-N
B. Test sequence against PROSITE database:
O05689
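A PROSITE-style pattern like the one on this slide maps almost directly onto a regular expression. The converter below is a minimal sketch covering only the syntax used in these slides plus `{...}` exclusions; `prosite_to_regex` is a hypothetical helper, and the test sequence in the usage note is made up.

```python
import re

# One pattern element: a residue, x, [allowed] or {excluded}, with optional (n) or (n,m).
_ELEMENT = re.compile(r"([A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a dash-separated PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = m.groups()
        if core == "x":
            core = "."                       # x means any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # {AB} means any residue except A or B
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)
```

For instance, `re.search(prosite_to_regex("P-[IV]-[WY]-x(3)-H-[MR]-V-x(3,4)-Q-x(1,2)-D-x(4,5)-G-A-N"), "AAPIWGGGHMVTTTQSDLLLLGANKK")` finds a match in the made-up sequence starting at the proline.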
Pattern Search Result (I)
A. One Query Pattern Against UniProtKB or UniRef100 DBs
Display the query pattern
Links to iProClass and UniProtKB reports
Link to NCBI taxonomy
Link to PIRSF report
Indicate pattern sequence region(s)
Pattern Search Result (II)
B. One Query Sequence Against PROSITE Pattern Database
Profile Method
• Profile: A Table of Scores to Express Family Consensus Derived from Multiple Sequence Alignments
  • Num of Rows = Num of Aligned Positions
  • Each row contains a score for the alignment with each possible residue.
• Profile Searching
  • Summation of Scores for Each Amino Acid Residue along Query Sequence
  • Higher Match Values at Conserved Positions
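A profile of this kind can be sketched as one row of scores per alignment column. The version below is an assumption-laden illustration: it uses log-odds scores against a uniform background with a pseudocount of 1, handles no gaps in the query, and its function names are hypothetical; PROSITE profiles use a more elaborate scoring scheme.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(msa, pseudocount=1.0):
    """Return one {residue: log-odds score} row per column of the alignment."""
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background frequency
    profile = []
    for column in zip(*msa):
        residues = [r for r in column if r in AMINO_ACIDS]  # skip gap characters
        total = len(residues) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2((residues.count(aa) + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return profile


def window_score(profile, segment):
    """Sum the row scores for each residue of a segment as long as the profile."""
    return sum(row.get(aa, 0.0) for row, aa in zip(profile, segment))


def best_window(profile, query):
    """Slide the profile along the query; return (start, score) of the best window."""
    n = len(profile)
    if len(query) < n:
        raise ValueError("query is shorter than the profile")
    score, start = max(
        (window_score(profile, query[i:i + n]), i)
        for i in range(len(query) - n + 1)
    )
    return start, score
```

In a profile built from `["ACD", "ACE", "ACD", "GCD"]`, the fully conserved middle column gives `C` a higher score than the first column gives `A`, which mirrors the point about conserved positions.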
Prosite PS50157 profile for Zinc finger C2H2
PIRSF scan
Search One Query Protein Against all the Full-length and Domain HMM models for the fully curated PIRSFs by HMMER
The matched regions and statistics will be displayed.
• Shows PIRSF that the query belongs to
• Statistical data for all domains
• Statistical data per domain
• Alignment with consensus sequence
Creation and Curation of PIRSFs
Integrated Bioinformatics System for Function and Pathway Discovery
Data Integration Associative Analysis
Sequence Analysis Pipeline
(Family Classification & Feature Identification)
Data Mining Tools
(Retrieval, Visualization, Analysis, Correlation)
Data Warehouse
(Gene, Protein, Family, Function, Structure, Pathway, Interaction)
Graphical User Interface
(Browsing, Querying, Navigation)
Input
(Gene/Protein Expression Data)
Output
(Analysis Results, Biological Interpretation)
Integrated Bioinformatics System
User
Input
(Local Data, Search Criteria, Report Format)
Analytical Pipeline
Query Sequence
UniProt
Top-Matched Superfamilies/Domains
BLAST Search HMM Domain Search
Predicted Superfamilies/Domains/Motifs/Sites/SignalPeptides/TMHs
SSEARCH CLUSTALW
Superfamily/Domain/Motif Alignments
Family Relationships & Functional Features
Family Classification & Functional Analysis
HMM Motif Search Pattern Search SignalP/TMHMM
Integrated Bioinformatics System
Global Bioinformatics Analysis of 1000’s of Genes and Proteins
Pathway Discovery,
Target Identification
Gene Expression Data Proteomic Data
Clustering
Expression Pattern
Visualization & Statistical Analysis
Clustered Matrix, Pathway Map, Process Hierarchy, Clustered Graph
Gene/Peptide-Protein Mapping
Pathway Discovery (Browsing, Sorting, Visualization & Statistical Analysis)
Functional Analysis (Sequence Analysis & Information Retrieval)
Integrated Protein Knowledge System
Comprehensive Protein
Information Matrix
Protein List
Lab Section
Rat eye lens phosphoproteomics in normal and cataract
Kamei et al., Biol. Pharm. Bull., 2005.
[Figure: Normal and Cataract gel panels; horizontal axis pI, (-) to (+); vertical axis Mw]
More phosphorylated spots in cataract sample.
Digestion and MS from Spot 16 gave these peptides:
MDVTIQHPWFKRALGPFYPSRCSLSADGMLTFSGYRLPSNVDQSALS
We want to identify the protein(s) that contain these peptides
Use Peptide Search
MDVTIQHPWFKR
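Conceptually, a peptide search is an exact substring lookup across a protein set. The sketch below shows that idea only; it is not the PIR Peptide Search service, and the function name, accessions `TOY1`/`TOY2` and their sequences are invented for illustration. Only the peptide `MDVTIQHPWFKR` comes from the slide.

```python
def peptide_search(peptides, proteins):
    """Return {accession: {peptide: 0-based start}} for proteins containing a peptide."""
    hits = {}
    for accession, sequence in proteins.items():
        found = {p: sequence.find(p) for p in peptides if p in sequence}
        if found:
            hits[accession] = found
    return hits


# Hypothetical toy database; real searches run against UniProtKB.
toy_proteins = {
    "TOY1": "MDVTIQHPWFKRALGPF",
    "TOY2": "MKTAYIAKQR",
}
matches = peptide_search(["MDVTIQHPWFKR"], toy_proteins)
```

Restricting the search to one organism, as the next slide does, amounts to filtering the protein set before the lookup.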
Peptide Search
Restrict search to an organism
Links to iProClass and UniProtKB reports
Link to NCBI taxonomy
Link to PIRSF report
Matching peptide highlighted in the sequence
Sorting arrows
Peptide Search & Results
Species restricted search
Search in UniProtKB, 23 proteins
Batch Retrieval Results (I)
Retrieve more sequences
• Retrieve multiple proteins from iProClass using a specific identifier or a combination of them
• Provides a means to easily retrieve and analyze proteins when the identifiers come from different databases
Blast Similarity Search
>P24623
• Perform sequence similarity search
What proteins are related to rat CRYAA?
http://pir.georgetown.edu/pirwww/search/blast.shtml
Pairwise Alignment
UniProtKB database and unique UniParc sequences
PIR protein family classification database
PIR Text Search (http://pir.georgetown.edu/search/textsearch.shtml)
Let’s search for human crystallins
Refine your search or start over
Display PDB ID
Let’s look for crystallins which have 3D structure
Domain Display allows simultaneous comparison of the Pfam domains present in multiple proteins
Let’s perform a multiple alignment on the sequences containing PF00030
Share same domain architecture
Multiple Alignment
Interactive Phylogenetic Tree and Alignment
Beta B1 and gamma crystallins share the same domains and SCOP fold and show significant sequence similarity, suggesting that they are related
Pattern Search (I)
Search for proteins containing this pattern (PS00225) in rat
Select P07320 and perform a pattern search
Pattern Search Result
Beta and gamma Crystallins have multiple copies of this pattern
PIRSF provides a single platform where all the previous analysis has been done by curators
Represents extent of manual curation
Pfam domains assigned with high confidence
Link to PIRSF report
Validation tag
Alpha-crystallin is exclusively found in metazoans
Taxonomic Distribution
Multiple Alignment
Domain Architecture
PIRSF scan
PIRSF report (I): a single platform to study proteins
Subfamily level
Cross-links to other databases
PIRSF report (II)
http://www.geneontology.org/
alpha-Crystallin and Related Proteins
Alpha crystallin alpha chain
Alpha crystallin beta chain
HSPs