Java framework for molecular interactions
PSI-MI standard formats
2 different formats – different needs
PSI-XML 2.5
- complex relationships, hierarchy
- more complete
- binary/n-ary interactions and complexes
- schema
- can only be used by developers and software
- complex schema, sometimes confusing => difficult parsing
PSI-MITAB 2.5, 2.6 and 2.7
- simple format
- easy to read (though not for everyone)
- easy to index
- cannot be used for complex relationships
- too many columns if trying to add all the information of XML
Segmentation of the tools/software
PSI-XML 2.5
- MI-XML validator
- can be used to exchange fully MIMIx/IMEx compliant data
PSI-MITAB 2.5, 2.6 and 2.7
- PSICQUIC
- data best practices
- clustering and scoring
- can be easily used for visualization/networking
Endless conversions
Need to unify our tools/software
Common Java framework?
Inputs: PSI-XML 2.5, PSI-MITAB 2.7, databases and other formats
Common API/framework (interfaces)
Tools on top of the common API:
- PSICQUIC and indexing
- Semantic validator
- Data enricher
- Protein update (and others)
- Clustering and scoring
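A minimal sketch of what "common API" means in practice: tools are written once against an interface, and each format only provides an implementation of it. Every name below (InteractionSource, TabSource, countSelfInteractions) is hypothetical and chosen for illustration, not taken from the framework; the accessions are made-up sample values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a "common API": none of these names come from JAMI.
final class CommonApiSketch {

    /** Format-independent view of a binary interaction. */
    static final class BinaryInteraction {
        final String interactorA;
        final String interactorB;

        BinaryInteraction(String interactorA, String interactorB) {
            this.interactorA = interactorA;
            this.interactorB = interactorB;
        }
    }

    /** Every format (XML, MITAB, database) would expose the same interface. */
    interface InteractionSource {
        List<BinaryInteraction> interactions();
    }

    /** Toy tab-separated source: only the first two columns are read. */
    static final class TabSource implements InteractionSource {
        private final List<String> lines;

        TabSource(List<String> lines) {
            this.lines = lines;
        }

        @Override
        public List<BinaryInteraction> interactions() {
            List<BinaryInteraction> result = new ArrayList<>();
            for (String line : lines) {
                String[] cols = line.split("\t", -1);
                if (cols.length >= 2) {
                    result.add(new BinaryInteraction(cols[0], cols[1]));
                }
            }
            return result;
        }
    }

    /** A tool written once against the interface, reusable for any source. */
    static int countSelfInteractions(InteractionSource source) {
        int count = 0;
        for (BinaryInteraction interaction : source.interactions()) {
            if (interaction.interactorA.equals(interaction.interactorB)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        InteractionSource source = new TabSource(Arrays.asList(
                "uniprotkb:P12345\tuniprotkb:P12345",
                "uniprotkb:P12345\tuniprotkb:Q99999"));
        System.out.println(countSelfInteractions(source));
    }
}
```

An XML-backed or database-backed source would implement the same interface, so countSelfInteractions would not need to change.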
Simplified view of the PSI-XML 2.5 schema
[Diagram: EntrySet, Entry, Source, Experiment, Interaction, Participant, Interactor, Feature and Range, linked by 1-to-n relationships]
JAMI model interfaces
Different interactors
Interaction with participants
Set of interactors
How is the interactor described?
Legend: MITAB and XML / XML only / MITAB only
Interactor extensions
● More specific fields
● Shortcuts
● Utility methods
● Sequence only for proteins and nucleic acids (Polymer)
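The extension idea can be sketched with a small interface hierarchy: a general interactor, a polymer that adds the sequence, and a protein that adds a shortcut field. This is an illustrative sketch only; the interface and method names are assumptions and are not copied from the JAMI source.

```java
import java.util.Optional;

// Hypothetical interfaces for illustration; not copied from the JAMI source.
final class InteractorSketch {

    interface Interactor {
        String getShortName();
    }

    /** Only polymers (proteins, nucleic acids) carry a sequence. */
    interface Polymer extends Interactor {
        String getSequence();
    }

    /** A more specific interactor with a shortcut to one identifier. */
    interface Protein extends Polymer {
        String getUniprotkb();
    }

    static final class SimpleInteractor implements Interactor {
        private final String shortName;

        SimpleInteractor(String shortName) {
            this.shortName = shortName;
        }

        @Override
        public String getShortName() {
            return shortName;
        }
    }

    static final class SimpleProtein implements Protein {
        private final String shortName;
        private final String uniprotkb;
        private final String sequence;

        SimpleProtein(String shortName, String uniprotkb, String sequence) {
            this.shortName = shortName;
            this.uniprotkb = uniprotkb;
            this.sequence = sequence;
        }

        @Override
        public String getShortName() {
            return shortName;
        }

        @Override
        public String getUniprotkb() {
            return uniprotkb;
        }

        @Override
        public String getSequence() {
            return sequence;
        }
    }

    /** Utility method: the sequence, if the interactor is a polymer. */
    static Optional<String> sequenceOf(Interactor interactor) {
        if (interactor instanceof Polymer) {
            return Optional.ofNullable(((Polymer) interactor).getSequence());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Sample values are made up for the demonstration.
        System.out.println(sequenceOf(new SimpleProtein("prot", "P12345", "MKT")).isPresent());
        System.out.println(sequenceOf(new SimpleInteractor("atp")).isPresent());
    }
}
```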
How is the organism described?
Legend: MITAB and XML / XML only
Controlled vocabulary terms
Shortcuts
Basic objects
Different interactions
Based on experimental details
Interaction evidence
[Diagram: Interaction Evidence, Experiment, Publication, Source, Participant Evidence, Interactor, Feature Evidence, Range, Binding Feature and inferred interactions, linked by 1-to-n relationships]
Differences with XML
● Interaction evidence
➢ One experiment
➢ One interaction type
● Experiment
➢ One host organism
➢ No participant identification method
➢ No feature detection method
● Participant evidence
➢ One experimental role
➢ One expressed in organism
➢ One participant identification method
● Feature
➢ One feature detection method
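In PSI-XML 2.5 an interaction can list several experiments, whereas an interaction evidence carries exactly one. One possible way to bridge the two, shown here purely as an assumption and not as the framework's actual behaviour, is to emit one evidence per experiment. The class names and sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: splitting an XML-like interaction into evidences.
final class EvidenceSketch {

    /** XML-like interaction: may reference several experiments. */
    static final class XmlLikeInteraction {
        final String name;
        final String interactionType;
        final List<String> experiments;

        XmlLikeInteraction(String name, String interactionType, List<String> experiments) {
            this.name = name;
            this.interactionType = interactionType;
            this.experiments = experiments;
        }
    }

    /** Simplified evidence: exactly one experiment and one interaction type. */
    static final class InteractionEvidence {
        final String name;
        final String interactionType;
        final String experiment;

        InteractionEvidence(String name, String interactionType, String experiment) {
            this.name = name;
            this.interactionType = interactionType;
            this.experiment = experiment;
        }
    }

    /** Assumed mapping: one evidence per experiment of the source interaction. */
    static List<InteractionEvidence> split(XmlLikeInteraction interaction) {
        List<InteractionEvidence> evidences = new ArrayList<>();
        for (String experiment : interaction.experiments) {
            evidences.add(new InteractionEvidence(
                    interaction.name, interaction.interactionType, experiment));
        }
        return evidences;
    }

    public static void main(String[] args) {
        XmlLikeInteraction interaction = new XmlLikeInteraction(
                "a-b", "physical association", Arrays.asList("exp-1", "exp-2"));
        for (InteractionEvidence evidence : split(interaction)) {
            System.out.println(evidence.name + " / " + evidence.experiment);
        }
    }
}
```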
Modelled interaction
[Diagram: Modelled Interaction, Experiment, Publication, Source, Modelled Participant, Interactor, Modelled Feature, Range, Binding site and inferred interactions, linked by 1-to-n relationships]
Differences with XML
● Modelled interaction
➢ Experiment not required
● Modelled participant
➢ No experimental role
➢ No participant identification method
● Modelled feature
➢ No feature detection method
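The relation between experimental and modelled participants can be sketched as two implementations of one shared interface, where the modelled variant simply has no experimental fields. The names and the toModelled helper below are hypothetical, a sketch under that assumption rather than the framework's own classes.

```java
// Hypothetical sketch: a modelled participant drops the experimental details.
final class ModelledSketch {

    /** What both kinds of participant share. */
    interface Participant {
        String getInteractor();
    }

    /** Experimental participant: carries role and identification method. */
    static final class ParticipantEvidence implements Participant {
        private final String interactor;
        private final String experimentalRole;
        private final String identificationMethod;

        ParticipantEvidence(String interactor, String experimentalRole,
                            String identificationMethod) {
            this.interactor = interactor;
            this.experimentalRole = experimentalRole;
            this.identificationMethod = identificationMethod;
        }

        @Override
        public String getInteractor() {
            return interactor;
        }

        String getExperimentalRole() {
            return experimentalRole;
        }

        String getIdentificationMethod() {
            return identificationMethod;
        }
    }

    /** Modelled participant: no experimental role, no identification method. */
    static final class ModelledParticipant implements Participant {
        private final String interactor;

        ModelledParticipant(String interactor) {
            this.interactor = interactor;
        }

        @Override
        public String getInteractor() {
            return interactor;
        }
    }

    /** Keeps only the information that does not depend on an experiment. */
    static ModelledParticipant toModelled(ParticipantEvidence evidence) {
        return new ModelledParticipant(evidence.getInteractor());
    }

    public static void main(String[] args) {
        // Sample values are made up for the demonstration.
        ParticipantEvidence evidence = new ParticipantEvidence("P12345", "bait", "western blot");
        System.out.println(toModelled(evidence).getInteractor());
    }
}
```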
Cooperative interaction
● Cooperative mechanism (CV term)
● Effect outcome (CV term)
● Response (CV term)
● Affected interactions (Modelled interactions)
Allosteric interaction
● Allostery mechanism (CV term)
● Allostery type (CV term)
● Allosteric molecule (modelled participant not interactor?)
● Allosteric effector (modelled participant not interactor?)
● Allosteric PTM (modelled feature)
Complexes
[Diagram: Complex, Experiment, Publication, Source, Component, Interactor, Component Feature, Range, Binding site, inferred interactions and Interaction evidence, linked by 1-to-n relationships]
Differences with XML
● Complex
➢ Experiment not required
➢ Can have a list of publications
➢ Parameters and confidences?
● Component
➢ No experimental role
➢ No participant identification method
➢ No biological role?
● Component feature
➢ No feature detection method
interactor only
Datasources
● Equivalent to EntrySet/Entry
● Embedded parser
● Parsing events and list of Errors (line number, col number)
● MITAB column
● Object id
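A datasource that reports parsing events instead of failing on the first problem can be sketched as follows. The listener interface, the error class and the "empty field" rule are all assumptions made for this illustration; the point is only that each error carries a line number and a column number and that parsing continues.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a datasource that emits parsing events.
final class DataSourceSketch {

    /** One syntax problem, located by line and (1-based) column number. */
    static final class ParseError {
        final int line;
        final int column;
        final String message;

        ParseError(int line, int column, String message) {
            this.line = line;
            this.column = column;
            this.message = message;
        }

        @Override
        public String toString() {
            return "line " + line + ", col " + column + ": " + message;
        }
    }

    /** Callback notified for every problem found while parsing. */
    interface ParseListener {
        void onError(ParseError error);
    }

    /** Parses tab-separated lines; invalid lines are reported and skipped. */
    static List<String[]> parse(List<String> lines, ParseListener listener) {
        List<String[]> rows = new ArrayList<>();
        for (int n = 0; n < lines.size(); n++) {
            String[] cols = lines.get(n).split("\t", -1);
            boolean valid = true;
            for (int c = 0; c < cols.length; c++) {
                // Simplified rule for the sketch: a field must not be empty.
                if (cols[c].isEmpty()) {
                    listener.onError(new ParseError(n + 1, c + 1, "empty field"));
                    valid = false;
                }
            }
            if (valid) {
                rows.add(cols);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<ParseError> errors = new ArrayList<>();
        List<String[]> rows = parse(Arrays.asList("a\tb", "c\t"), errors::add);
        System.out.println(rows.size() + " valid row(s)");
        for (ParseError error : errors) {
            System.out.println(error);
        }
    }
}
```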
JAMI core and modules
● JAMI core: https://psimi.googlecode.com/svn/trunk/psi-jami
● JAMI MITAB parser: https://psimi.googlecode.com/svn/branches/psimitab-parser-2.0.0-SNAPSHOT
● JAMI PSI-XML 2.5 parser: https://psimi.googlecode.com/svn/branches/psi25-xml-2.0.0-SNAPSHOT
MITAB/PSI-XML parsing
PSI-XML 2.5 PSI-MITAB 2.7
Common API/framework (interfaces)
Current Java XML model
Current Java MITAB model
JAMI goals and benefits
● Utility methods
➢ Extract all uniprot cross references
➢ Compare two interactors
● Unit testing (in progress)
● Same interfaces for XML and MITAB
➢ Do not duplicate code
➢ Share code and tools
➢ Help to develop faster
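The two utility methods named above could look roughly like this. It is a sketch under stated assumptions: the Xref and Interactor classes are hypothetical stand-ins, the database name "uniprotkb" is used as the matching key, and the comparison rule (shared UniProt accession, else same short name) is just one plausible choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of shared utility methods; not the JAMI implementation.
final class UtilitySketch {

    static final class Xref {
        final String database;
        final String id;

        Xref(String database, String id) {
            this.database = database;
            this.id = id;
        }
    }

    static final class Interactor {
        final String shortName;
        final List<Xref> xrefs;

        Interactor(String shortName, List<Xref> xrefs) {
            this.shortName = shortName;
            this.xrefs = xrefs;
        }
    }

    /** Extracts the ids of all cross references whose database is uniprotkb. */
    static List<String> uniprotIds(Interactor interactor) {
        List<String> ids = new ArrayList<>();
        for (Xref xref : interactor.xrefs) {
            if ("uniprotkb".equalsIgnoreCase(xref.database)) {
                ids.add(xref.id);
            }
        }
        return ids;
    }

    /** Assumed rule: same if they share a UniProt id, else if names match. */
    static boolean sameInteractor(Interactor a, Interactor b) {
        List<String> idsA = uniprotIds(a);
        List<String> idsB = uniprotIds(b);
        if (!idsA.isEmpty() && !idsB.isEmpty()) {
            for (String id : idsA) {
                if (idsB.contains(id)) {
                    return true;
                }
            }
            return false;
        }
        return a.shortName.equals(b.shortName);
    }

    public static void main(String[] args) {
        // Accessions are made-up sample values.
        Interactor a = new Interactor("prot-a", Arrays.asList(
                new Xref("uniprotkb", "P12345"), new Xref("refseq", "NP_000001")));
        Interactor b = new Interactor("prot-b", Arrays.asList(
                new Xref("uniprotkb", "P12345")));
        System.out.println(uniprotIds(a));
        System.out.println(sameInteractor(a, b));
    }
}
```

Because both methods only depend on the shared model, they would work unchanged whether the interactors were read from XML or from MITAB.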
Community effort
Application example: PSI-MI validator
Validator updated source code
● MI schema validator: https://psimi.googlecode.com/svn/tags/psimi-schema-validator-2.1.0-SNAPSHOT
● MI schema validator in command line: https://psimi.googlecode.com/svn/branches/psimi-schema-validator-cli-2.1.0-SNAPSHOT
● PSI-MI validator: http://wwwdev.ebi.ac.uk/intact/validator/
● JAMI-HTML writer: https://psimi.googlecode.com/svn/trunk/jami-html
CV rules (1)
XML path/ XML model
CV rules (2)
JAMI model
Syntax validation
● Managed by each datasource
➢ SAX validation (syntax and grammar => schema)
➢ MITAB syntax validation based on events
● MITAB syntax
➢ 15, 36 or 42 columns
➢ Invalid fields
  - Missing database or database accession
  - Special characters not properly escaped
  - Missing alias db source or name
  - Missing annotation topic
  - ...
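Two of the MITAB checks listed above, the column count and the "database:accession" shape of identifier fields, can be sketched in a few lines. This is a simplified illustration, not the validator's code: the class and method names are hypothetical, only the first two columns are inspected, and quoted values containing ":" or "|" are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of two MITAB syntax checks; not the validator's code.
final class MitabSyntaxSketch {

    /** Returns the list of syntax problems found in one MITAB line. */
    static List<String> check(String line) {
        List<String> errors = new ArrayList<>();
        String[] cols = line.split("\t", -1);
        int n = cols.length;
        // MITAB 2.5, 2.6 and 2.7 have 15, 36 and 42 columns respectively.
        if (n != 15 && n != 36 && n != 42) {
            errors.add("expected 15, 36 or 42 columns but found " + n);
        }
        // Columns 1 and 2 hold the identifiers of interactors A and B.
        for (int c = 0; c < Math.min(2, n); c++) {
            checkIdentifiers(cols[c], c + 1, errors);
        }
        return errors;
    }

    private static void checkIdentifiers(String field, int column, List<String> errors) {
        if (field.equals("-")) {
            return; // "-" marks an empty field in MITAB
        }
        for (String entry : field.split("\\|", -1)) {
            int colon = entry.indexOf(':');
            if (colon <= 0) {
                errors.add("column " + column + ": missing database in '" + entry + "'");
            } else if (colon == entry.length() - 1) {
                errors.add("column " + column + ": missing database accession in '" + entry + "'");
            }
        }
    }

    public static void main(String[] args) {
        // Three columns only, no database in column 1, no accession in column 2.
        for (String error : check("P12345\tuniprotkb:\t-")) {
            System.out.println(error);
        }
    }
}
```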
XML syntax validation
XML syntax error: missing interaction detection method
MITAB syntax validation
MITAB syntax error: wrong number of columns and missing database
XML CV validation
File context / Error message
MITAB CV validation
File context / Error message
XML MIMIx validation
MITAB MIMIx validation
Common HTML view
Next steps
● Performance review
● More unit testing
● Rules can be obsolete in MITAB
● Current rules are outdated and need to be reviewed
● PSI enricher
PSI-XML schema 2.5 issues and next steps
Minor issues: confidences
● Need a unit?
● Add a confidence type
Minor issues: interaction types
• Interaction type without experiment ref
Minor issues: publications
• No concept of a publication element?
• Should be able to describe the publication in the BibRef element of an ExperimentDescription with both an Xref (publication id) and a ListOfAttribute (publication date, journal, authors, etc.)
<bibref>
  <xref>
    <primaryRef db="pubmed" dbAc="MI:0446" id="2556388" refTypeAc="MI:0685" refType="primary-reference"></primaryRef>
  </xref>
  <attributeList>
    <attribute name="author-list" nameAc="MI:0636">Valentin-Ranc C, Carlier MF</attribute>
    ....
  </attributeList>
</bibref>
Complexes
• InteractionRef : need to define experiment?
Experiment confidence?
Participant experimental interactor?
Two flavours make parsing more difficult
• Mix of compact/expanded XML?
• Some elements allow experimentRef but not experimentDescription
Next steps
● Define better way to represent complexes
● Dynamic interactions?
● List of interactors as an interactor?
● Should we use namespaces => modules?