Intrinsically Disordered Proteins: from lack of structure to pleiotropy of functions
Lilia Iakoucheva
University of California, San Diego
OUTLINE
Characterization and properties of IDPs
Functional repertoire of IDPs
Post-translational modifications and disorder
Importance for molecular recognition
Disorder and diseases
Historical perspective
1894 – Emil Fischer: “lock-and-key” hypothesis
1950 – Fred Karush: “configurational adaptability”
1958 – Daniel Koshland: “induced fit” theory
Protein Structure-Function Paradigm
Amino Acid Sequence → 3D Structure → Function
Disorder examples
First examples of disorder:
• Tail of histone H5 (Aviles et al, Eur. J. Biochem. 1978) … and later tails of other histones
• 95-residue long disordered segment of calcineurin (Kissinger et al, Nature, 1995)
• Cyclin-dependent kinase inhibitor p21Waf1/Cip1/Sdi1 (Kriwacki et al, PNAS, 1996)
• Etc…
Some proteins/regions could function without being folded… = disordered
Re-assessing structure-function paradigm
Amino Acid Sequence → 3D Structure → Function
Amino Acid Sequence → Order or Disorder → Function
What is disorder?
Protein regions (or entire proteins) that lack stable secondary and tertiary structure and exist as an ensemble of conformations with dynamically changing Ramachandran angles
Disorder is experimentally detected by:
• X-ray crystallography
• NMR spectroscopy
• circular dichroism (CD)
• limited proteolysis (LP)
• hydrodynamic methods
Bracken et al, Curr Opin Struct Biol. 2004, 570; Receveur-Bréchot et al, Proteins, 2006, 24
“I don’t know about hair care, Rapunzel, but I’m thinking a good cream rinse plus PROTEIN conditioner might just solve both our problems.”
DISORDER
Properties of IDRs and IDPs
Compositional bias
[Figure: (DisProt - Order)/Order per residue, for DisProt 4.9 (2009) and DisProt 3.4 (2006); y-axis from -0.8 to 0.6; residues on the x-axis: C W Y I F V L H T N A G D M K R S Q P E; panel labels: Order-promoting, Disorder-promoting]
↓ Aromatic, hydrophobic
↑ Charged, hydrophilic
Dunker et al, 2001, JMGM; Radivojac et al, 2007, Biophys J
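The compositional-bias plot can be reproduced in outline with a few lines of Python. This is a minimal sketch, assuming the plotted quantity is the per-residue fractional difference (disorder - order)/order. The two sequence sets and the function names are made-up placeholders, not DisProt or PDB data.

```python
from collections import Counter

# Residue order as it appears on the plot's x-axis
AMINO_ACIDS = "CWYIFVLHTNAGDMKRSQPE"

def composition(sequences):
    """Fraction of each of the 20 amino acids over all sequences combined."""
    counts = Counter("".join(sequences))
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}

def fractional_difference(disordered, ordered):
    """(disorder - order) / order per residue; None if the ordered set lacks the residue."""
    dis = composition(disordered)
    ordr = composition(ordered)
    return {
        aa: (dis[aa] - ordr[aa]) / ordr[aa] if ordr[aa] > 0 else None
        for aa in AMINO_ACIDS
    }

# Toy sequences invented for illustration only
toy_disordered = ["SPEEKKSQPEEGSPK", "EEKRSPQQPSEEDK"]
toy_ordered = ["CWYIFVLHTNAGDMKRSQPE", "VILFWYCVILAGTN"]

bias = fractional_difference(toy_disordered, toy_ordered)
for aa in AMINO_ACIDS:
    print(aa, bias[aa])
```

Negative values mark residues depleted in the disordered set, positive values mark residues enriched in it.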
Charge-hydropathy plot
Disordered proteins: ↑ net charge, ↓ hydrophobicity
Ordered proteins: ↓ net charge, ↑ hydrophobicity
Uversky et al, 2000, Proteins 41:415-427
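A charge-hydropathy classification can be sketched as follows. The assumptions here are: the Kyte-Doolittle hydropathy scale rescaled to 0..1, mean net charge counted from K, R, D and E only, and the commonly quoted boundary line ⟨R⟩ = 2.785⟨H⟩ - 1.151. The function names are hypothetical and the whole-sequence averaging is a simplification, so treat this as an illustration rather than the published procedure.

```python
# Kyte-Doolittle hydropathy values
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(seq):
    """Mean hydropathy with the Kyte-Doolittle scale rescaled from [-4.5, 4.5] to [0, 1]."""
    return sum((KD[aa] + 4.5) / 9.0 for aa in seq) / len(seq)

def mean_net_charge(seq):
    """Absolute mean net charge, counting K/R as +1 and D/E as -1 (assumption: His ignored)."""
    charge = sum(seq.count(aa) for aa in "KR") - sum(seq.count(aa) for aa in "DE")
    return abs(charge) / len(seq)

def ch_classify(seq):
    """Call a sequence 'disordered' if it lies above the assumed boundary line."""
    boundary = 2.785 * mean_hydropathy(seq) - 1.151
    return "disordered" if mean_net_charge(seq) > boundary else "ordered"

# Toy homopolymers chosen to sit far from the boundary on either side
print(ch_classify("E" * 20))
print(ch_classify("I" * 20))
```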
Disorder prediction
AA sequence codes for protein structure…
Does AA sequence code for the lack of structure?
Keith Dunker group – first Predictor Of Natural Disordered Regions (PONDR)
Features used:
• amino acid composition
• sequence complexity
• net charge
• hydrophobicity
• flexibility
• …and other features
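One of the listed features, sequence complexity, can be illustrated with a sliding-window Shannon entropy. This is only a sketch of how a per-residue feature might be computed: the entropy measure, the window size, and the function names are assumptions, and it does not reproduce PONDR or its trained model.

```python
import math
from collections import Counter

def shannon_entropy(window):
    """Shannon entropy (bits) of the residue distribution in a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def sliding_entropy(seq, window=11):
    """Entropy of a window centred on each residue; windows are truncated at the termini."""
    half = window // 2
    return [
        shannon_entropy(seq[max(0, i - half): i + half + 1])
        for i in range(len(seq))
    ]

# Toy sequence: a low-complexity serine run followed by a mixed segment
toy_seq = "SSSSSSSSSSACDEFGHIKL"
for residue, value in zip(toy_seq, sliding_entropy(toy_seq, window=5)):
    print(residue, round(value, 2))
```

Low-entropy windows correspond to low-complexity sequence; a real predictor would combine such a profile with the other features in the list.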
Protein Disorder Predictors
PONDR-FIT™: a meta-predictor that combines several methods. Xue B, Dunbrack RL, Williams RW, Dunker AK, Uversky VN. "PONDR-FIT: A meta-predictor of intrinsically disordered amino acids." Biochim. Biophys. Acta. 2010 (in press), doi:10.1016/j.bbapap.2010.01.011
DisEMBL™: Linding R, Jensen LJ, Diella F, Bork P, Gibson TJ, Russell RB. "Protein disorder prediction: implications for structural proteomics." Structure. 2003;11(11):1453-9, PMID: 14604535
DISOPRED2: Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT. "Prediction and functional analysis of native disorder in proteins from the three kingdoms of life." J Mol Biol. 2004;337(3):635-45, PMID: 15019783
DRIPPRED: MacCallum B. "Order/Disorder Prediction With Self Organising Maps." CASP 6 meeting, online paper
DISpro: Cheng J, Sweredoski M, Baldi P. "Accurate Prediction of Protein Disordered Regions by Mining Protein Structure Data." Data Mining and Knowledge Discovery. 2005;11(3):213-222, online paper
FoldIndex©: Prilusky J, Felder CE, Zeev-Ben-Mordehai T, Rydberg EH, Man O, Beckmann JS, Silman I, Sussman JL. "FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded." Bioinformatics. 2005;21(16):3435-8, PMID: 15955783
GlobPlot 2: Linding R, Russell RB, Neduva V, Gibson TJ. "GlobPlot: Exploring protein sequences for globularity and disorder." Nucleic Acids Res. 2003;31(13):3701-8, PMID: 12824398
IUPred: Dosztanyi Z, Csizmok V, Tompa P, Simon I. "IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content." Bioinformatics. 2005;21(16):3433-4, PMID: 15955779
PONDR®: Romero P, Obradovic Z, Li X, Garner EC, Brown CJ, Dunker AK. "Sequence complexity of disordered protein." Proteins. 2001;42(1):38-48, PMID: 11093259
PreLink: Coeytaux K, Poupon A. "Prediction of unfolded segments in a protein sequence based on amino acid composition." Bioinformatics. 2005;21(9):1891-900, PMID: 15657106
RONN: Yang ZR, Thomson R, McNeil P, Esnouf RM. "RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins."
These and other predictors are listed at http://www.disprot.org/predictors.php
PONDRing XPA
XPA-MBD (Ikegami et al, 1998, Nat. Struct. Biol.)
Structure of the full-length XPA ???
PONDR in action
Iakoucheva et al, Prot Science 2001
Functional importance
• Protein-protein interaction sites are mapped to disordered XPA termini
• XPA’s phosphorylation site is located in its disordered C-terminus
• Putative XPA nuclear localization signals (NLS) are located in disordered regions
[Figure labels: DNA BD; NLS, NLS; P-site; binding partners RPA, TFIIH, RPA, ERCC1, DDB2]
Disorder and Functions
Function | Description | Examples
Protein modification | Phosphorylation, acetylation, glycosylation, methylation, ubiquitination, fatty acylation | histones, 4-E BP, CFTR, Bcl-2, neuromodulin, HMG-I(Y), p53
Molecular recognition | Protein-DNA, protein-RNA, protein-protein, protein-ligand interactions | p53, max, fos, jun, myc, α-synuclein, CDK inhibitors p21, p57, p27, TF
Macromolecular assembly | Phages, viruses, bacterial flagellum, ribosome, spliceosome, nuclear pore | flagellin, SR proteins, ribosomal prot, Nups
Entropic chains | Flexible linkers, entropic springs, bristles | fd g3p, RPA, titin, neurofilament H
Dunker et al, 2002, Biochemistry
Advantages of being disordered
• Low-affinity/high-specificity binding
• Broad binding diversity
• Ability to form large interaction surfaces
• Greater capture radius (“fly-casting” mechanism)
• Facilitate alternative splicing
• Facilitate post-translational modifications
Phos-sites prefer IDRs
DisPhos: http://core.ist.temple.edu/pred/
Gsponer et al, 2008, Science. 322(5906):1365-8
More kinases that target IDPs!
More kinase targets are IDPs!
Ubiquitination and disorder
IDPs are susceptible to proteasomal degradation
Unstructured initiation site is required for degradation (Prakash et al, 2004, Nat Struct Mol Biol.)
PEST motifs are disordered (Singh et al, 2006, Proteins)
Low coverage of known Ub sites by PDB
Examples of Ub sites in IDRs (p53, c-myc, cyclin B, securin, p21, p27, p57, α-synuclein, IκBα etc, various authors)
Ub ligases
β-catenin peptide: 15 out of 26 aa are disordered (Wu et al, 2003, Molecular Cell, Vol. 11, 1445–1456)
p27 peptide: 14 out of 24 aa are disordered (Hao et al, 2005, Molecular Cell, Vol. 20, 9-19)
~60 Å gap
Ub sites properties
[Figure: mean value (y-axis from -0.2 to 0.8) of net charge, disorder, B-factor, D and E content, and hydrophobics for Ub sites vs. non-Ub sites]
Ub sites: negative charge; D and E; K and hydrophobics; disorder; predicted B-factors
Identified 145 new Ub sites using MudPIT, SILAC mass spectrometry, and mutant (grr1Δ and cdc34tm) yeast strains to target short-lived proteins
UbPred: http://www.UbPred.org
Radivojac et al, Proteins, 2010
Dynamic disorder of Sic1 bound to Cdc4
Structural Model of the Dynamic pSic1-Cdc4 Complex
• Sic1 contains 9 phosphorylation sites, which interact with Cdc4 in a dynamic equilibrium
• Directly interacting residues are transiently ordered, whereas the rest of Sic1 remains disordered even in the complex
• The disorder of Sic1 helps to bridge the 64 Å gap between E2 (Cdc34) and the Sic1 bound to Cdc4 for ubiquitin transfer
Mittag et al, Structure, 2010
Molecular recognition
Disordered regions are commonly used for binding to multiple partners…
Examples: C-terminus of p53; NCBD domain of CBP/p300
Oldfield et al, 2008, BMC Genomics. 9 Suppl 1:S1; Wright and Dyson, 2009, Curr Opin Struct Biol.
Mechanisms of binding for IDPs
How do disordered proteins bind to their targets?
• Induced folding (binding, … then folding)
• Conformational selection (folding, … then binding)
• Coupled/synergistic (folding and binding, … or even binding without folding) (CFTR R and NBD1 domains, Baker et al, 2007, Nat Struct Mol Biol, 14:738)
Hubs and disorder
Are disordered proteins network hubs?
[Figure: proteins (%, y-axis 0 to 80) vs. length of predicted disordered region (aa, from >=30 to >=100) for hubs, ends, and ordered proteins]
Yeast PPI: cytoskeletal hubs subnetwork from the S. cerevisiae interactome
Haynes et al, 2006, PLoS CB
Ordered hubs – disordered partners
14-3-3 proteins: signal transduction, apoptosis, cell cycle, cancer
• >200 binding (mostly phosphorylated) targets
• Three different predictors indicate that 14-3-3 TARGETS are highly disordered (Bustos and Iglesias, 2006, Proteins, 63:35–42)
• Peptides bind to essentially the same region of 14-3-3
• Differences in 14-3-3 side-chain conformations (e.g. induced fit mechanism)
• Peptides are highly hydrated in the bound state (e.g. likely disordered in the unbound state)
Oldfield et al, BMC Genomics, 2008, 9(Suppl 1):S1
Disorder and disease
Individual examples of IDPs/IDRs involved in human diseases:
p53 (cancer), BRCA1 (cancer), α-synuclein (PD, AD, dementia, Down syndrome), amyloid β (AD), tau (AD), prion (TSEs), amylin (Type II diabetes), hirudin and thrombin (CVD), HPV (cancer)…
Human Papillomavirus (HPV)
Increased amount of disorder in E6 proteins from high-risk HPVs
Uversky et al, 2006, JPR, 5 (8), 1829-1842
Are disease proteins more disordered in general?
BRCA1
Mark et al, J Mol Biol. 2005, 345(2):275-87
CD and NMR of fragments: all disordered
[Figure: CH plot of BRCA1 fragments and full-length BRCA1, shown against disordered proteins and ordered proteins]
Disorder and disease
[Figure: proteins (%, y-axis 0 to 80) vs. length of disordered region (from >=30 to >=100) for cancer-associated proteins, disease-related proteins, typical eukaryotic proteins, and ordered proteins]
Disease-related Swiss-Prot (SW) keywords are strongly associated with predicted disorder (p>0.95)
Xie et al, 2007, JPR
Disease-associated mutations
Disease mutations impact protein
Structure:
- Folding
- Oligomerization
- Stability
- Activity
- …
Function:
- Post-translational modifications
- Binding to partners
- Intracellular localization
- …
Many predictors of the functional impact of SNPs are available (SIFT, PolyPhen, SNP3D, etc.)
The majority rely on known protein 3D structure and evolutionary conservation
Do disease mutations even occur in the regions of disorder?
Disordered regions:
• Do not fold into 3D structure
• Are generally less evolutionarily conserved than ordered regions
Do current predictors make errors in predicting the impact of disease mutations in IDRs?
Disease-associated mutations
Dataset | Total mutations number | IDR Mutations | OR Mutations
DM | 15459 | 3356 (21.7%) | 12103 (78.3%)
Poly | 24220 | 9790 (40.4%) | 14430 (59.6%)
NES | 60339 | 26927 (44.6%) | 33372 (55.3%)
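The percentages in this table follow directly from the counts. The sketch below recomputes the IDR share per dataset and adds a simple odds ratio comparing DM with Poly. The odds ratio is an illustrative addition, not a statistic reported on the slide, and the variable and function names are hypothetical.

```python
# (total, IDR mutations, OR mutations) as transcribed from the table.
# Note: for NES the transcribed IDR and OR counts sum to 60299, not 60339.
datasets = {
    "DM": (15459, 3356, 12103),
    "Poly": (24220, 9790, 14430),
    "NES": (60339, 26927, 33372),
}

def idr_percent(total, idr):
    """Share of mutations that fall in intrinsically disordered regions, in percent."""
    return 100.0 * idr / total

def odds_ratio(idr_a, or_a, idr_b, or_b):
    """Odds of an IDR location in dataset A relative to dataset B."""
    return (idr_a / or_a) / (idr_b / or_b)

for name, (total, idr, ordered) in datasets.items():
    print(name, round(idr_percent(total, idr), 1))

dm_vs_poly = odds_ratio(3356, 12103, 9790, 14430)
print("DM vs Poly odds ratio:", round(dm_vs_poly, 2))
```

An odds ratio below 1 is consistent with disease mutations falling in disordered regions less often than polymorphisms do.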
Disease-associated mutations
Disease mutations are prevalent in ORDERED regions
Dataset | IDR Mutations | D->D | D->O | OR Mutations | O->O | O->D
DM | 3356 | 80.0% | 20.0% | 12103 | 95.1% | 4.9%
Poly | 9790 | 88.5% | 11.5% | 14430 | 95.1% | 4.6%
NES | 26927 | 92.7% | 7.3% | 33372 | 94.4% | 5.6%
Disorder-to-Order transition
Some disease mutations in disordered regions cause Disorder->Order transition
(may disrupt disordered structure? induce order?)
p=1.06E-32; p=5.47E-105
D→O and O→D
Substitution | D→O disease mutations, % | Substitution | O→D disease mutations, %
R→W | 13.1 | L→P | 11.9
R→C | 10.3 | C→R | 6.6
R→H | 7.6 | G→R | 6.1
E→K | 6.7 | W→R | 4.1
R→Q | 6.3 | F→S | 3.6
Total | 44% | Total | 32.2%
[Figure: % of all mutated residues (y-axis 0 to 17) by residue (C W Y I F V L H T N A G D M K R S Q P E) for SW_DAMU, SW_POLY, SW_CONTROL, and ALL_SW]
Arginine is often mutated
Hypothetical mechanism?
CpG methylation
Codon for Arginine | Mutated codon | Substitution
CGG | TGG | R->W
CGT | TGT | R->C
CGC | TGC | R->C
CGA | TGA | R->Stop
AGA | AGA | N/A
AGG | AGG | N/A
R->W and R->C are among the most frequent mutations in the disease dataset
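The codon table can be checked with a short script. This is a sketch assuming the mechanism is C→T deamination at a CpG within the codon; the second function, G→A at the same CpG (C→T on the complementary strand), goes beyond what the slide lists. Function names are hypothetical, and CpGs that span codon boundaries are ignored.

```python
ARG_CODONS = ["CGG", "CGT", "CGC", "CGA", "AGA", "AGG"]

# Only the codons needed for this illustration, not a full genetic code
CODON_TABLE = {
    "CGG": "R", "CGT": "R", "CGC": "R", "CGA": "R", "AGA": "R", "AGG": "R",
    "TGG": "W", "TGT": "C", "TGC": "C", "TGA": "Stop",
    "CAG": "Q", "CAT": "H", "CAC": "H", "CAA": "Q",
}

def deaminate_coding_strand(codon):
    """C->T at the CpG inside the codon; None if the codon has no CpG."""
    i = codon.find("CG")
    return None if i == -1 else codon[:i] + "T" + codon[i + 1:]

def deaminate_complementary_strand(codon):
    """G->A at the CpG (C->T on the opposite strand); None if there is no CpG."""
    i = codon.find("CG")
    return None if i == -1 else codon[:i + 1] + "A" + codon[i + 2:]

def substitution(codon, mutate):
    """Amino acid encoded after the mutation, or 'N/A' when the codon has no CpG."""
    mutated = mutate(codon)
    return "N/A" if mutated is None else CODON_TABLE[mutated]

coding = {c: substitution(c, deaminate_coding_strand) for c in ARG_CODONS}
complementary = {c: substitution(c, deaminate_complementary_strand) for c in ARG_CODONS}
for codon in ARG_CODONS:
    print(codon, "R->" + coding[codon], "R->" + complementary[codon])
```

Under these assumptions the coding-strand change gives R->W, R->C and R->Stop as in the table, and the complementary-strand change gives R->Q and R->H.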
Disease Models
Disorder-centric vs Structure-centric view of disease mutations
IDRs summary
• Proteins can carry intrinsically disordered regions
• These regions can be predicted from sequence
• IDRs perform important functional roles: PTMs, molecular recognition, involvement in diseases
• Disease mutations could occur in IDRs, and OR and IDR mutations could lead to diseases via different mechanisms
Acknowledgements
Rockefeller University: Jurg Ott, Chad Haynes, Fei Ji
PNNL: Eric Ackerman, Richard D. Smith
Columbia University: Vladimir Vacic
Indiana University: Predrag Radivojac, Mark Goebl, Keith Dunker
Funding:
Disordered Proteins Database DisProt: http://www.DisProt.org
List of Disorder Predictors: http://www.disprot.org/predictors.php
Phos Sites Predictor DisPhos: http://www.ist.temple.edu/disphos/
Ub Sites Predictor UbPred: http://www.ubpred.org/
[email protected] – Lilia Iakoucheva
Prevalence of IDPs in nature
Kingdom | # Genomes | % Sequences L > 40* | % Sequences L > 40**
Bacteria | 22 | 7 - 33 | 16 - 45
Archaea | 7 | 9 - 37 | 21 - 51
Eukaryota | 5 | 35 - 51 | 52 - 67
* VL-XT Predictor, order ~ 78%, disorder ~ 65%
** VL2 Predictor, order ~ 83%, disorder ~ 75%