Extracting quantitative information from proteomic 2-D gels
Lecture in the bioinformatics course ”Gene expression and cell models”
April 20, 2005
John Gustafsson
Mathematical Statistics
Chalmers
Proteomics lectures: starting points
• Anders’ starting point this Monday:
  – Let’s say that we want to study life at the protein level – what technologies do we have at hand?
• Today’s lecture:
  – How can we get (large-scale) quantitative measurements of protein amounts, so that we can do statistics and bioinformatics?
Content and structure
• Proteomics
• The 2-D gel technology
• Extracting quantitative information
  – Image analysis of 2-D gels
• Comparison with microarrays
• Statistical analysis of quantitative 2-D gel data
Proteomics
[Figure labels: DNA; mRNA; Production, Modification, Degradation; Localisation; Interaction; ACTIVITY; P; TDP; Co-factors]
2-D gels
2-D gel electrophoresis: Protein separation and quantification
[Figure: a “protein soup” separated on a 2-D gel. Axes: molecular charge (acidic, alkaline) and molecular size (small, large).]
spot volume ∝ protein quantity
A typical 2-D gel experiment
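[Figure: workflow. Experimental design; biological experiment (control, treatment) → protein extracts → 2-D gel electrophoresis → 2-D gel images → image analysis → quantified data → statistical analysis → conclusions]
Example: quantified data as a matrix with spot volume data
– rows: proteins (many)
– columns: gels (few)
\[
\begin{pmatrix}
z_{111} & \cdots & z_{115} & z_{121} & \cdots & z_{125} \\
z_{211} & \cdots & z_{215} & z_{221} & \cdots & z_{225} \\
\vdots  &        & \vdots  & \vdots  &        & \vdots  \\
z_{m11} & \cdots & z_{m15} & z_{m21} & \cdots & z_{m25}
\end{pmatrix}
\]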
The image analysis task
• The task
  1. In each gel image: find and quantify the protein spots
  2. In the group of gel images: match protein spots in different images that correspond to the same protein
• Issues
  – automation
  – time
Pseudo-color superposition 1(3)
[Figure panels: 0M NaCl, 1M NaCl]
Pseudo-color superposition 2(3)
[Figure panels: 0M NaCl, 1M NaCl]
Pseudo-color superposition 3(3)
[Figure: superposition (red: 0M NaCl, blue: 1M NaCl)]
The standard solution – workflow
In each gel image
1. Background subtraction
2. Spot detection
3. Spot quantification
In the group of gel images
4. Spot pattern matching
1. Background subtraction
[Figure: Before - background = After]
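A minimal sketch of the "image minus background" idea, in pure Python. The slides do not say how the background is estimated; here it is assumed to be the local minimum in a square window, which is only one simple choice. The function name `subtract_background` and the `radius` parameter are hypothetical.

```python
def subtract_background(image, radius=2):
    """Subtract a local-minimum background estimate from a grayscale image.

    image: list of rows, each a list of non-negative intensities.
    The local-minimum background is an assumption, not the lecture's method.
    """
    n_rows, n_cols = len(image), len(image[0])
    result = []
    for r in range(n_rows):
        row_out = []
        for c in range(n_cols):
            # Window around (r, c), clipped at the image border.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(n_cols, c + radius + 1))
            ]
            background = min(window)
            row_out.append(image[r][c] - background)
        result.append(row_out)
    return result


# Toy image: flat background of 10 with one bright pixel (a "spot").
toy = [[10] * 5 for _ in range(5)]
toy[2][2] = 60
cleaned = subtract_background(toy, radius=2)
```

The window has to be larger than the spots; otherwise the local minimum falls inside a spot and part of the spot is subtracted away.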
2. Spot detection / image segmentation
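One simple way to segment an image into spots is to threshold it and group neighbouring pixels. The sketch below assumes a background-subtracted image and a fixed, hand-picked threshold; `detect_spots` is a hypothetical name, and real 2-D gel software uses more elaborate segmentation.

```python
def detect_spots(image, threshold):
    """Return a list of spots; each spot is a sorted list of (row, col) pixels.

    A spot is a 4-connected group of pixels with intensity above threshold.
    """
    n_rows, n_cols = len(image), len(image[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    spots = []
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Flood fill from this seed pixel.
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < n_rows and 0 <= nx < n_cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            spots.append(sorted(pixels))
    return spots


toy = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0, 0],
    [0, 7, 0, 0, 5, 0],
    [0, 0, 0, 0, 6, 0],
]
spots = detect_spots(toy, threshold=2)
```

With a fixed threshold, spots fainter than the threshold are not detected, and two spots that touch are reported as one.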
3. Spot quantification
spot volume ∝ protein quantity
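A small sketch of quantification, assuming that the spot volume is taken as the sum of background-subtracted pixel intensities inside the detected spot region. The helper names `spot_volume` and `spot_centroid` are hypothetical; the centroid is included because a spot position is needed later for matching.

```python
def spot_volume(image, pixels):
    """Sum the intensities over the pixels of one spot."""
    return sum(image[r][c] for r, c in pixels)


def spot_centroid(image, pixels):
    """Intensity-weighted centre (row, col) of one spot."""
    volume = spot_volume(image, pixels)
    row = sum(r * image[r][c] for r, c in pixels) / volume
    col = sum(c * image[r][c] for r, c in pixels) / volume
    return row, col


toy = [
    [0, 0, 0],
    [0, 9, 8],
    [0, 7, 0],
]
pixels = [(1, 1), (1, 2), (2, 1)]  # one spot, e.g. from a detection step
volume = spot_volume(toy, pixels)
centre = spot_centroid(toy, pixels)
```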
4. Spot pattern matching
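As an illustration of matching point patterns, the sketch below pairs spot centres from two gels greedily by distance. This is a deliberately naive stand-in: it assumes the gels are already roughly aligned, and `match_spots` and `max_dist` are hypothetical names. Real gels are distorted relative to each other, which is part of what makes matching hard.

```python
import math


def match_spots(spots_a, spots_b, max_dist):
    """Greedily pair spots from two gels, closest pairs first.

    spots_a, spots_b: lists of (x, y) spot centres.
    Returns sorted (index_a, index_b) pairs; each spot is used at most once.
    """
    candidates = []
    for i, (xa, ya) in enumerate(spots_a):
        for j, (xb, yb) in enumerate(spots_b):
            d = math.hypot(xa - xb, ya - yb)
            if d <= max_dist:
                candidates.append((d, i, j))
    candidates.sort()
    used_a, used_b, pairs = set(), set(), []
    for _, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            pairs.append((i, j))
    return sorted(pairs)


gel_a = [(10.0, 10.0), (30.0, 12.0), (50.0, 40.0)]
gel_b = [(31.0, 13.0), (11.0, 9.0), (80.0, 80.0)]
pairs = match_spots(gel_a, gel_b, max_dist=5.0)
```

The third spot in each gel has no partner within `max_dist` and stays unmatched, which leaves a missing value for that protein in the data matrix.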
The typical 2-D gel experiment
[Figure: the workflow and spot volume matrix from “A typical 2-D gel experiment”, shown again]
Limitations
• Technological
  – hydrophobic proteins don’t dissolve
  – limited pI/size coverage
  – limited labeling/staining
• Image analytical
  – Limited global matching efficiency of automatic algorithms
  – Need for time-consuming manual guidance
  – “The image analysis bottleneck”
Limited global matching efficiency
Voss and Haberl (2000)
Incomplete spot detection: Faint spots
[Figure labels: Detected, Not detected]
Incomplete spot detection: Close spots
Content and structure – revisited
• Proteomics
• The 2-D gel technology
• Extracting quantitative information
  – Image analysis of 2-D gels
• Comparison with microarrays
• Statistical analysis of quantitative 2-D gel data
Comparison with microarrays
                        2-D gels                 Microarrays
Labeling                one channel*             one or two-color
Background subtraction  yes                      yes
Spot detection          HARD                     easy
Spot quantitation       can be difficult         quite easy
Spot matching           HARD                     known
Identification          MS or reference atlas    known
*) recently also two-color
Variability
[Figure: gel images arranged by growth condition (normal, 1M NaCl) and biological replications]
Variance versus mean dependence
• A dot in the plot:
  – the measurement of one protein
• The quadratic dependence indicates a multiplicative error structure
(2x5 gel set; normal growth condition)
[Plot annotation: slope = 2, variance ∝ mean²]
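Why a slope of 2 points to multiplicative errors: if a measurement is the true level times an error factor that does not depend on the level, the variance scales with the square of the mean, so log(variance) against log(mean) is a line with slope 2. The simulation below is a sketch with made-up numbers (300 proteins, 5 replicate gels, log-normal error factors); none of these values come from the lecture data.

```python
import math
import random
import statistics


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = statistics.fmean(lx), statistics.fmean(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx


rng = random.Random(0)
n_proteins, n_gels, sigma = 300, 5, 0.2  # illustrative values only

means, variances = [], []
for i in range(n_proteins):
    # True levels spread over three orders of magnitude.
    true_level = 10 ** (1 + 3 * i / (n_proteins - 1))
    # Multiplicative error: each replicate is the true level times a factor.
    replicates = [true_level * rng.lognormvariate(0.0, sigma)
                  for _ in range(n_gels)]
    means.append(statistics.fmean(replicates))
    variances.append(statistics.variance(replicates))

slope = loglog_slope(means, variances)
print(f"estimated slope: {slope:.2f}")
```

With additive errors of constant size the same plot would instead be flat (slope near 0).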
Why transform the data?
• A mathematical data transformation can be used to
  – Make errors more normally distributed
  – Stabilize variance versus mean dependence
• Then the model on the transformed scale is simpler than on the original scale
• This simplifies the subsequent analysis
Logarithmic data transformation
• Stabilized variance versus mean dependence after a logarithmic data transformation
(2x5 gel set; normal growth condition)
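The stabilizing effect follows from log(level × factor) = log(level) + log(factor): on the log scale a multiplicative error becomes an additive one whose spread no longer depends on the level. A small check with hypothetical error factors:

```python
import math
import statistics

# Hypothetical multiplicative error factors for five replicate gels.
error_factors = [0.8, 0.9, 1.0, 1.1, 1.25]

raw_vars, log_vars = [], []
for true_level in (10.0, 1000.0, 100000.0):
    raw = [true_level * f for f in error_factors]
    logged = [math.log(v) for v in raw]
    raw_vars.append(statistics.variance(raw))
    log_vars.append(statistics.variance(logged))

# Raw variances grow with the square of the level;
# log-scale variances are the same at every level.
for rv, lv in zip(raw_vars, log_vars):
    print(f"raw variance {rv:.3g}   log variance {lv:.4f}")
```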
Statistical analysis of quantitative 2-D gel data
Examples:
• Test of differential expression
• Cluster analysis
  – cluster proteins
  – cluster cell/tissue samples
• Classification
  – classify tissue samples (e.g. tumor classes)
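As one possible test of differential expression, the sketch below computes Welch's two-sample t statistic for a single protein on log-transformed spot volumes. The slides do not name a specific test, so the choice of Welch's t is an assumption, and the spot volumes are invented for illustration. Turning t into a p-value needs the t distribution, which is not in the Python standard library.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's two-sample t statistic and approximate degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    va = statistics.variance(group_a) / na
    vb = statistics.variance(group_b) / nb
    t = (statistics.fmean(group_b) - statistics.fmean(group_a)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical spot volumes for one protein on a 2x5 gel set.
control = [100.0, 120.0, 90.0, 110.0, 105.0]
treatment = [210.0, 190.0, 240.0, 200.0, 220.0]

t, df = welch_t([math.log2(v) for v in control],
                [math.log2(v) for v in treatment])
print(f"t = {t:.2f}, df = {df:.1f}")
```

In a real analysis the same test is run for every row of the spot volume matrix, so many tests are performed at once.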
Summary
• Proteomics
• The 2-D gel technology
• Extracting quantitative information
  – Image analysis of 2-D gels
• Comparison with microarrays
• Statistical analysis of quantitative 2-D gel data
An alternative approach to the matching problem
• The standard solution
  – First spot detection
  – Then matching of point patterns
• An alternative, recent approach
  – Matching at the pixel level
  – Computationally heavy
Gel matching at the pixel level
[Figure: image warping aligns the original image to the reference image, giving the aligned image]
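To give a feel for matching at the pixel level, the sketch below searches for the integer shift that minimizes the mean squared pixel difference between an image and the reference. This is a strong simplification: it handles translation only, whereas image warping also corrects local, non-rigid distortions. The function name `best_shift` and the `max_shift` parameter are hypothetical.

```python
def best_shift(reference, image, max_shift=3):
    """Find the integer (dy, dx) that best aligns image to reference.

    Minimizes the mean squared pixel difference over the overlapping region.
    Translation only: a stand-in for true (non-rigid) image warping.
    """
    n_rows, n_cols = len(reference), len(reference[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0.0, 0
            for r in range(n_rows):
                for c in range(n_cols):
                    # Reference pixel (r, c) is compared with image pixel (rr, cc).
                    rr, cc = r + dy, c + dx
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        total += (reference[r][c] - image[rr][cc]) ** 2
                        count += 1
            score = total / count
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]


# Toy reference with two "spots", and a copy shifted 1 row down, 2 columns right.
reference = [[0] * 8 for _ in range(8)]
reference[2][3], reference[5][5] = 9, 6
shifted = [[0] * 8 for _ in range(8)]
shifted[3][5], shifted[6][7] = 9, 6

dy, dx = best_shift(reference, shifted)
```

Even this translation-only search visits every pixel for every candidate shift, which hints at why pixel-level matching is computationally heavy.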
Future alternatives to quantitative 2-D gels?
• Quantitative mass spectrometry
• Protein arrays