Bioinformatics Practical for Biochemists
Andrei Lupas, Birte Höcker, Steffen Schmidt WS 2013/14
03. Sequence Features
Günter Blobel, 1999, nobelprize.org
Targeting proteins
• signal peptide
• targets proteins to the secretory pathway
• N-terminal sequence, recognized while the peptide is still being synthesized on the ribosome
Nielsen et al. (2007)
Signal Peptide Prediction
• Sequence Logo of eukaryotic signal peptides
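A sequence logo scales each alignment column by its information content, so conserved positions appear tall. The sketch below shows that calculation under simplifying assumptions: a 20-letter alphabet, no small-sample correction, and a hypothetical toy alignment that is not taken from real signal peptides. The function names are illustrative only.

```python
import math
from collections import Counter

def column_information(column, alphabet_size=20):
    """Information content of one alignment column in bits: log2(N) - H."""
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total)
                   for n in Counter(column).values())
    return math.log2(alphabet_size) - entropy

def logo_heights(aligned_seqs):
    """Per column, map each residue to its letter height in bits."""
    heights = []
    for column in zip(*aligned_seqs):
        info = column_information(column)
        counts = Counter(column)
        heights.append({aa: info * n / len(column) for aa, n in counts.items()})
    return heights

# Hypothetical toy alignment, NOT real signal peptides.
toy_alignment = ["MKLLA", "MRLLA", "MKLVA", "MKALS"]
for position, column in enumerate(logo_heights(toy_alignment), start=1):
    letters = ", ".join(f"{aa}={h:.2f}" for aa, h in sorted(column.items()))
    print(position, letters)
```

A fully conserved column reaches the maximum of log2(20) bits, while a column with many different residues shrinks towards zero.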
Signal Peptide Prediction - SignalP
http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi?jobid=53329BB900004A2504297A29&wait=20
http://www.cbs.dtu.dk/services/SignalP/
PDB-id: 1c3w
Transmembrane Helices
• unusually long stretch of hydrophobic residues
• >18 hydrophobic amino acids
• hydrophobic interaction with lipids in membrane
• orientation of helix / topology of the protein
• looking at the “loops”: R & K are mainly found on the cytoplasmic side (“positive inside rule”)
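The idea of scanning for a long hydrophobic stretch can be sketched with a sliding-window hydropathy average. This is only an illustration, not how TMHMM works (TMHMM uses a hidden Markov model). The Kyte-Doolittle scale, the 19-residue window and the 1.6 cutoff are assumptions chosen as commonly quoted defaults, and the function names are hypothetical. The test sequence is the fragment shown on the slide.

```python
# Kyte-Doolittle hydropathy scale (assumed here; other scales exist).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every window of the given length."""
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]

def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Merge overlapping windows above the threshold into (start, end) spans.

    Coordinates are 0-based with an exclusive end.
    """
    segments = []
    for i, mean in enumerate(hydropathy_profile(seq, window)):
        if mean <= threshold:
            continue
        start, end = i, i + window
        if segments and start <= segments[-1][1]:
            segments[-1] = (segments[-1][0], end)  # extend previous span
        else:
            segments.append((start, end))
    return segments

def basic_residues(seq):
    """Count R and K, the residues relevant for the positive inside rule."""
    return sum(aa in "RK" for aa in seq)

fragment = "TGRPEWIWLALGTALMGLGTLYFLVKGMGVSDPDAKKFYAITTLVPAIAFTMYLSMLLGYGL"
for start, end in candidate_tm_segments(fragment):
    print(start + 1, end, fragment[start:end])
```

To apply the positive inside rule, `basic_residues` could be run on the loops on either side of each candidate segment; the side richer in R and K would be the likely cytoplasmic one.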
TGRPEWIWLALGTALMGLGTLYFLVKGMGVSDPDAKKFYAITTLVPAIAFTMYLSMLLGYGL
N
C
Sonnhammer et al. (1998)
Transmembrane Helices – TMHMM
• http://www.cbs.dtu.dk/services/TMHMM/
• Accuracy of predicting TM helices is high (> 90%)
• Accuracy of predicting the topology: > 75%
http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi?jobid=5332A5E70000681F58EBAC3B&wait=20
William (1987) Biochim Biophys Acta
Secondary Structure – amino acid preferences
Residue   α-helix   β-strand   β-turn
Glu       1.59      0.52       1.01
Ala       1.41      0.72       0.92
Leu       1.34      1.22       0.57
Met       1.30      1.14       0.52
Gln       1.27      0.98       0.84
Lys       1.23      0.69       1.07
Arg       1.21      0.84       0.90
His       1.05      0.80       0.81
Val       0.90      1.87       0.41
Ile       1.09      1.67       0.47
Tyr       0.74      1.45       0.76
Cys       0.66      1.40       0.54
Trp       1.02      1.35       0.65
Phe       1.16      1.33       0.59
Thr       0.76      1.17       0.90
Gly       0.43      0.58       1.77
Asn       0.76      0.48       1.34
Pro       0.34      0.31       1.32
Ser       0.57      0.96       1.22
Asp       0.99      0.39       1.24
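The preference values above can be turned into a very rough predictor by averaging them over a short peptide and picking the state with the highest mean. The sketch below does exactly that with the values from the table. It is a simplified illustration in the spirit of propensity-based methods, not the algorithm used by the Toolkit predictors, and the function names are hypothetical.

```python
# (helix, strand, turn) preferences copied from the table above.
PROPENSITY = {
    "E": (1.59, 0.52, 1.01), "A": (1.41, 0.72, 0.92), "L": (1.34, 1.22, 0.57),
    "M": (1.30, 1.14, 0.52), "Q": (1.27, 0.98, 0.84), "K": (1.23, 0.69, 1.07),
    "R": (1.21, 0.84, 0.90), "H": (1.05, 0.80, 0.81), "V": (0.90, 1.87, 0.41),
    "I": (1.09, 1.67, 0.47), "Y": (0.74, 1.45, 0.76), "C": (0.66, 1.40, 0.54),
    "W": (1.02, 1.35, 0.65), "F": (1.16, 1.33, 0.59), "T": (0.76, 1.17, 0.90),
    "G": (0.43, 0.58, 1.77), "N": (0.76, 0.48, 1.34), "P": (0.34, 0.31, 1.32),
    "S": (0.57, 0.96, 1.22), "D": (0.99, 0.39, 1.24),
}
STATES = ("helix", "strand", "turn")

def mean_propensity(peptide):
    """Average helix, strand and turn preference over a peptide."""
    totals = [0.0, 0.0, 0.0]
    for aa in peptide:
        for k, value in enumerate(PROPENSITY[aa]):
            totals[k] += value
    return {state: total / len(peptide) for state, total in zip(STATES, totals)}

def preferred_state(peptide):
    """State with the highest mean preference (ties resolved by order)."""
    means = mean_propensity(peptide)
    return max(STATES, key=lambda state: means[state])

for peptide in ("EALEAL", "VIVIVI", "GNPGNP"):
    print(peptide, preferred_state(peptide), mean_propensity(peptide))
```

A value above 1 means the residue occurs in that state more often than average, so a window dominated by Val and Ile leans towards β-strand, while Gly, Asn and Pro push towards turns.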
PDB: 1kgs, Buckler et al. (2002), Structure
Secondary Structure – buried β-sheet
PDB: 1kgs, Buckler et al. (2002), Structure
Secondary Structure – amphiphilic, partially buried α-helix
PDB: 1jat, VanDemark et al. (2001), Cell
Secondary Structure – amphiphilic β-strand
Secondary Structure – collagen
Secondary Structure Prediction
• Toolkit – Quick2D
Secondary Structure Prediction
• Toolkit – Ali2D
http://www.nature.com/news/2011/110309/full/471151a/box/1.html
Disordered Regions
• Today, programs predict that about 40% of all human proteins contain at least one intrinsically disordered segment of 30 amino acids or more, and that some 25% are likely to be disordered from beginning to end.
• lack of hydrophobic residues
• often with overrepresentation of a few amino acids
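The two sequence hallmarks listed above, few hydrophobic residues and a composition dominated by a few amino acids, can be measured per window. The sketch below computes the hydrophobic fraction and the Shannon entropy of the composition in 30-residue windows, matching the segment length quoted above. It is not the method used by IUPred or DISOPRED2; the set of residues counted as hydrophobic and the toy sequence are assumptions made for illustration.

```python
import math
from collections import Counter

# Assumed hydrophobic set; other definitions are equally defensible.
HYDROPHOBIC = set("AVILMFWC")

def shannon_entropy(segment):
    """Composition entropy in bits; low values mean few residue types dominate."""
    total = len(segment)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(segment).values())

def window_features(seq, window=30):
    """(start, hydrophobic fraction, entropy) for every window, 0-based start."""
    features = []
    for i in range(len(seq) - window + 1):
        segment = seq[i:i + window]
        fraction = sum(aa in HYDROPHOBIC for aa in segment) / window
        features.append((i, fraction, shannon_entropy(segment)))
    return features

# Hypothetical toy sequence: a mixed stretch followed by an acidic,
# low-complexity tail.
toy = "MKVLAIFGWCTLVAQHIMLYVAFNRLIGAW" + "EEDEEEESEEDEEEEKEEEDSEEEEDEEEE"
for start, fraction, entropy in window_features(toy)[::10]:
    print(start + 1, round(fraction, 2), round(entropy, 2))
```

Windows with both a low hydrophobic fraction and a low entropy are the ones that resemble the disordered segments described above.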
Secondary Structure Prediction
• Toolkit - Ali2D
Disorder Prediction
• IUPRED - http://iupred.enzim.hu/
Short Linear Motifs
• Eukaryotic Linear Motifs (ELM) / Short Linear Motifs (SLiM)
• Hunt T (1990) “These motifs are linear, in the sense that three-dimensional organization is not required to bring distant segments of the molecule together to make the recognizable unit. The conservation of these motifs varies: some are highly conserved while others allow substitutions that retain only a certain pattern of charge across the motif.”
Short Linear Motifs – Characteristics
• 3-11 amino acids long
• poorly conserved / evolve fast
• 1-3 amino acids in the motifs are “hot spots”
• ~80% in disordered regions
• relatively low affinity to interacting partner (1-150 µM)
• interaction via induced fit
Short Linear Motifs – Function
• protein-protein interactions
• post-translational modifications
• e.g. phosphorylation
• proteolytic cleavage/processing sites
• KEN / D box in cell cycle - degradation signals
• subcellular targeting sites
• NES - nuclear export signal
➡ modulation of interactions - fine tuning
David Goodsell, http://www.rcsb.org/pdb/101/motm.do?momID=85
Short Linear Motifs – Nuclear Localization Signal (NLS)
• Importin-beta (1qgk; blue) recognizes nuclear pores and moves through them. It wraps around the end of importin-alpha (1ee5; green), an adaptor molecule that connects importin-beta with the cargo, here nucleoplasmin (1k5j; yellow), a chaperone important in nucleosome assembly. All interactions are mediated by linear motifs in unstructured segments (bipartite nuclear localization signals).
ELM Resources
• elm.eu.org
• NUPL_XENLA
NLS in nucleoplasmin
• Quick2D secondary structure prediction for nucleoplasmin, showing the unstructured C-terminal tail and the bipartite nuclear localization motif
[Quick2D output for nucleoplasmin, with position markers at 50, 100 and 150. Prediction tracks: SS PSIPRED, SS JNET, DO DISOPRED2, DO IUPRED, SO Prof (Rost), SO JNET. The column alignment of the tracks was lost in extraction, so only the sequence and the legend are reproduced here. In the second block, DISOPRED2 and IUPRED show long uninterrupted runs of D (disordered).]
MASTVSNTSKLEKPVSLIWGCELNEQNKTFEFKVEDDEEKCEHQLALRTVCLGDKAKDEFHIVEIVTQEEGAEKSVPIATLKPSILPMATMVGIELTPPVTFRLKAGSG
PLYISGQHVAMEEDYSWAEEEDEGEAEGEEEEEEEEDQESPPKAVKRPAATKKAGQAKKKKLDKEDESSEEDSPTKKGKGAGRGRKPAAKK
SS = secondary structure (H = alpha-helix, E = beta-sheet)
DO = disorder
SO = solvent accessibility (a buried residue has at most 25% of its surface exposed to the solvent)
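Because a linear motif needs no three-dimensional context, a first search for the bipartite NLS can be written as a regular expression over the sequence. The sketch below uses a deliberately simplified pattern (two basic residues, a spacer of 10 to 12 residues, then at least three basic residues). This pattern is an assumption made for illustration and is not the curated expression from the ELM resource. The sequence is the nucleoplasmin sequence as transcribed above.

```python
import re

# Nucleoplasmin sequence as it appears in the Quick2D output above.
NUCLEOPLASMIN = (
    "MASTVSNTSKLEKPVSLIWGCELNEQNKTFEFKVEDDEEKCEHQLALRTVCLGDKAKDEF"
    "HIVEIVTQEEGAEKSVPIATLKPSILPMATMVGIELTPPVTFRLKAGSG"
    "PLYISGQHVAMEEDYSWAEEEDEGEAEGEEEEEEEEDQESPPKAVKRPAATKKAGQAKKKK"
    "LDKEDESSEEDSPTKKGKGAGRGRKPAAKK"
)

# Simplified, assumed pattern: basic pair, 10-12 residue spacer, basic cluster.
# The lookahead lets overlapping candidates be reported as well.
BIPARTITE_NLS = re.compile(r"(?=([KR]{2}.{10,12}[KR]{3}))")

def find_bipartite_nls(seq):
    """Return (1-based start, matched text) for every candidate motif."""
    return [(m.start() + 1, m.group(1)) for m in BIPARTITE_NLS.finditer(seq)]

for start, motif in find_bipartite_nls(NUCLEOPLASMIN):
    print(start, start + len(motif) - 1, motif)
```

On the sequence as transcribed, this sketch should report a single candidate, KRPAATKKAGQAKKKK, starting at residue 155, which lies in the C-terminal region that the disorder predictors mark as unstructured. A pattern this loose would give many false positives on a whole proteome, which is why motif hits are usually filtered by disorder and conservation.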
DDX6 & Scd6 / EDC3 Interaction
DDX6/Me31B: DEAD-box helicase
EDC3: LSM, FDF…FDK, YjeF
Scd6/Tral: LSM, FDF
DDX6 & Scd6 / EDC3 Interaction
PDB: 2wax, Tritschler et al. (2009), Mol Cell
DDX6 & Scd6 / EDC3 Interaction