Introduction to IntAct. Pablo Porras Millán, IntAct
Session outline
• Introduction to protein-protein interactions (PPIs)
  What are PPIs?
  Representing PPIs
  PPI databases
• IntAct: the molecular interactions database at the EBI
  Data structure and curation model
  Using the IntAct website
Introduction to protein-protein interactions (PPIs)
EMBL-EBI
A definition…
Protein-protein interactions (PPIs): physical and selective contacts that happen between pairs of proteins, in certain molecular regions and in a defined biological context.
Interactome: the totality of PPIs that happen in a cell / in an organism / in a specific biological context...
Proteasome image from Hook, B. and Schagat, T. [Internet] 2011. Available from: www.promega.com/resources/articles/pubhub/functional-proteomics-techniques-to-isolate-and-characterize-the-human-proteasome/
Why protein-protein interactions?
1. To predict a protein's biological function
• “guilt by association”
• proteins with similar functions should cluster together
2. To improve characterization of protein complexes and pathways
• interaction networks work as a draft map that brings detail to biological processes and pathways
Gene level: DNA, RNA
Protein level
1 protein = 1 function: WRONG!
1 protein = n functions = n networks!
Yeast two-hybrid (Y2H)
High-throughput
X-ray diffraction studies
Low-throughput
Tandem affinity purification + mass spectrometry (TAP-MS)
Protein-protein interaction detection methods
No single method can accurately reproduce a true binary interaction observed under physiological
conditions – every interaction detected experimentally is fundamentally artefactual.
interaction domains
Overlap in sequence ranges:
Representing PPIs: interaction domains
• Some experimental methods generate complex data, e.g. tandem affinity purification (TAP)
• There are two algorithms to transform this information into binary data:
Representing PPIs: The problem with complexes
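The slide does not name the two algorithms in the text. A minimal sketch follows, assuming they are the commonly used spoke and matrix expansion models (MITAB 2.7 has an "Expansion method(s)" column for recording this). The function names and the bait/prey labels are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Spoke model: pair the bait with each prey; preys are not paired with each other."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(bait, preys):
    """Matrix model: pair every co-purified protein with every other one."""
    return list(combinations([bait, *preys], 2))


# Hypothetical purification: bait A pulls down preys B, C and D.
pairs_spoke = spoke_expansion("A", ["B", "C", "D"])
pairs_matrix = matrix_expansion("A", ["B", "C", "D"])
print(len(pairs_spoke), len(pairs_matrix))
```

Under these assumptions the spoke model yields one pair per prey, while the matrix model yields every possible pair in the complex, so it reports more binary interactions, some of which may not be direct contacts.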
De Las Rivas & Fontanillo, PLoS Computational Biology, PMID: 20589078.
Interactions databases: types
Primary databases: coverage and biases
Rolland et al., Cell, PMID: 25416956. De Las Rivas & Fontanillo, PLoS Computational Biology, PMID: 20589078.
Human PPI coverage in the main public primary databases (Dec 2009)
Popularity bias in publicly available databases (2013)
A standard for PPIs representation: the IMEx consortium
www.imexconsortium.org
Orchard et al., Nature Methods, PMID: 22453911.
IntAct: The molecular interactions database at the EBI
1. Publicly available repository of molecular interactions (mainly PPIs) - >530K binary interaction evidences taken from >13,800 publications (September 2015)
2. Data are standards-compliant and available via our website, for download from our FTP site, or via PSICQUIC:
www.ebi.ac.uk/intact
ftp://ftp.ebi.ac.uk/pub/databases/intact
www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml
3. Provide open-access versions of the software to allow installation of local IntAct nodes.
IntAct goals & achievements
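As an illustration of the PSICQUIC access route mentioned above, here is a minimal sketch that only builds a REST query URL and makes no network request. The base URL, the helper name `psicquic_query_url`, and the example query are assumptions that do not come from the slides. Check the PSICQUIC registry for the current IntAct endpoint before relying on it.

```python
from urllib.parse import quote

# Assumed base URL of the IntAct PSICQUIC REST service (not given in the slides).
INTACT_PSICQUIC = (
    "https://www.ebi.ac.uk/Tools/webservices/psicquic/intact/webservices/current/search"
)


def psicquic_query_url(miql, fmt="tab25", first=0, max_results=50):
    """Build a PSICQUIC REST URL for a MIQL query (hypothetical helper)."""
    # Percent-encode the query so spaces and colons are safe inside the URL path.
    encoded = quote(miql)
    return (
        f"{INTACT_PSICQUIC}/query/{encoded}"
        f"?format={fmt}&firstResult={first}&maxResults={max_results}"
    )


url = psicquic_query_url("brca2 AND species:human")
print(url)
```

The resulting URL could then be fetched with any HTTP client. With `format=tab25` the service is expected to return tab-delimited MITAB text, one binary interaction evidence per line.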
Entry (Publication)
  Experiment 1
    Interaction 1
      Participant 1: Features
      Participant 2: Features
    Interaction 2
  Experiment 2
    Interaction 3
    Interaction 4
  …
[A] Publication level (entry)
[B] Experiment level
[C] Interaction level
[D] Participant level
[E] Feature level
IntAct: Data storage schema
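The five levels above can be sketched as nested records. This is an illustrative model only, not the actual IntAct database schema. All class names, field names, and example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Feature:  # [E] feature level, e.g. a binding region or a tag
    description: str


@dataclass
class Participant:  # [D] participant level
    interactor_id: str
    features: List[Feature] = field(default_factory=list)


@dataclass
class Interaction:  # [C] interaction level
    participants: List[Participant] = field(default_factory=list)


@dataclass
class Experiment:  # [B] experiment level
    detection_method: str
    interactions: List[Interaction] = field(default_factory=list)


@dataclass
class Publication:  # [A] publication level (entry)
    publication_id: str
    experiments: List[Experiment] = field(default_factory=list)


# Hypothetical entry: one experiment reporting one interaction between two participants.
entry = Publication(
    "hypothetical-publication-id",
    [
        Experiment(
            "two hybrid",
            [Interaction([Participant("P1", [Feature("binding region")]), Participant("P2")])],
        )
    ],
)

# Walk the hierarchy from publication down to participants.
n_participants = sum(
    len(interaction.participants)
    for experiment in entry.experiments
    for interaction in experiment.interactions
)
print(n_participants)
```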
IntAct: PSI-MI ontology
“Lifecycle of an Interaction”
IntAct curation pipeline
[Figure: curation workflow. Labels: publication (full text); curator (annotate), with curation manual and controlled vocabularies (CVs); super curator (check: accept or reject); sanity checks (nightly) with reports; IMEx (MatrixDB, MINT, DIP); public web site; FTP site.]
CROSS-REFERENCES
• FAMILIES AND DOMAINS: InterPro
• SMALL MOLECULES: ChEBI
• FUNCTION: Gene Ontology
• GENOME SEQUENCES: Ensembl
• PROTEIN SEQUENCES: UniProtKB
• Others: STRUCTURES, ORGANISM, TISSUE...
LARGE DATASETS FROM HIGH-THROUGHPUT PROJECTS
PUBLISHED MOLECULAR INTERACTIONS DATA
CURATION
DIRECT SUBMISSION
IntAct: the role of the curator
UniProt Knowledge Base
www.uniprot.org
Interactions can be mapped to the canonical sequence…
... to splice variants...
... or to post-processed chains
Common curation platform
Specific Data Dissemination Platforms
General curation, large scale
General curation, domain int.
UniProt entry related
Extracellular matrix
Model organisms
Immune system
Commercial curation
Cellular mechanics
Regulatory interactions
Specific curation focus/expertise
Other DBs
Host-pathogen interactions
Cardiovascular proteins
IntAct as a common curation platform
IntAct webpage-based search
Details of interaction
Link to external resource (UniProtKB)
Details about controlled vocabulary term describing interaction detection method
IntAct: changing the layout
IntAct: download formats
MITAB 2.7 specific columns (+27):
• Expansion method(s)
• Biological role(s) of interactors
• Experimental role(s) of interactors
• Type(s) of interactors
• Properties (CrossReference) of interactors / interaction
• Annotation(s) of interactors / interaction
• Host organism(s)
• Parameters of interaction
• Creation and update dates
• Checksum(s) of interactors / interaction
• Negative
• Feature(s) of interactors
• Stoichiometry(s) of interactors
• Participant(s) identification method(s)
MITAB 2.5 standard columns (15):
• ID(s) interactor A & B
• Alt. ID(s) interactor A & B
• Alias(es) interactor A & B
• Interaction detection method(s)
• Publication 1st author(s)
• Publication identifier(s)
• Taxid interactor A & B
• Interaction type(s)
• Source database(s)
• Interaction identifier(s)
• Confidence value(s)
PSIMITAB Columns
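To make the column layout concrete, here is a minimal sketch of reading one MITAB 2.5 line into its 15 standard columns. The short column names, the helper `parse_mitab25`, and the example record are hypothetical. The identifiers in the example are placeholders, not real database entries. Splitting on "|" is a simplification that assumes no quoted value contains that character.

```python
# Short names for the 15 standard MITAB 2.5 columns, in file order (names are illustrative).
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def parse_mitab25(line):
    """Split one tab-delimited MITAB line into a dict of value lists."""
    cells = line.rstrip("\n").split("\t")
    if len(cells) < len(MITAB25_COLUMNS):
        raise ValueError(f"expected at least 15 columns, got {len(cells)}")
    record = {}
    for name, cell in zip(MITAB25_COLUMNS, cells):
        # "-" marks an empty cell; multiple values in one cell are separated by "|".
        record[name] = [] if cell == "-" else cell.split("|")
    return record


# Hypothetical record with placeholder identifiers.
example_line = "\t".join([
    "uniprotkb:P12345", "uniprotkb:Q67890", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2000)", "pubmed:00000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', 'psi-mi:"MI:0469"(IntAct)',
    "intact:EBI-0000000", "intact-miscore:0.50",
])
record = parse_mitab25(example_line)
print(record["id_a"], record["detection_method"])
```

Because `zip` stops at the shorter sequence, the same function also accepts MITAB 2.7 lines and ignores the 27 extra columns.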
Interaction detail in IntAct
Filtering results
IntAct: visualizing results as a network
IntAct: browse menu
IntAct: Other searches
IntAct: Advanced search
IntAct: MIQL syntax search
More about IntAct: on-line EBI courses
www.ebi.ac.uk/training/online/course/intact-molecular-interactions-ebi
Acknowledgements
MI Team leader: Sandra Orchard
Developing team: Max Koch
Curation team: Margaret Duesbury, Birgit Meldal, Mariaestela Ortiz