TRANSCRIPT
The Open PHACTS Project: Progress and Future Sustainability
Lee Harland & Bryn Williams-Jones (Open PHACTS / ConnectedDiscovery)
Tom Plasterer (AstraZeneca / Open PHACTS Rep)
Fundamental issue:
• There is a *lot* of science outside your walls
• It’s a chaotic space
• Scientists want to find information quickly and easily
• Often they just “can’t get there” (or don’t even know where “there” is)
• And you have to manage it all (or not)
Pre-competitive Informatics: Pharma are all accessing, processing, storing & re-processing external research data
[Diagram: Literature, PubChem, Genbank, Patents, Databases and Downloads feed Data Integration, Data Analysis and Firewalled Databases; repeat @ each company]
Lowering industry firewalls: pre-competitive informatics in drug discovery. Nature Reviews Drug Discovery (2009) 8, 701-708. doi:10.1038/nrd2944
The Innovative Medicines Initiative
• EC funded public-private partnership for pharmaceutical research
• Focus on key problems
– Efficacy, Safety, Education & Training, Knowledge Management

The Open PHACTS Project
• Create a semantic integration hub (“Open Pharmacological Space”)…
• Delivering services to support on-going drug discovery programs in pharma and public domain
• Not just another project; Leading academics in semantics, pharmacology and informatics, driven by solid industry business requirements
• 23 academic partners, 8 pharmaceutical companies, 3 biotechs
• Work split into clusters:
• Technical Build
• Scientific Drive
• Community & Sustainability
The Project
“What is the selectivity profile of known p38 inhibitors?”
“Let me compare MW, logP and PSA for known oxidoreductase inhibitors”
“Find me compounds that inhibit targets in NFkB pathway assayed in only functional assays with a potency <1 μM”
ChEMBL, DrugBank, Gene Ontology, Wikipathways, GeneGo, UniProt, UMLS, ChEBI, GVKBio, ChemSpider, ConceptWiki, TrialTrove, TR Integrity
Business Question Driven Approach
Number | sum | Nr of 1 | Question
15 | 12 | 9 | All oxidoreductase inhibitors active <100nM in both human and mouse
18 | 14 | 8 | Given compound X, what is its predicted secondary pharmacology? What are the on- and off-target safety concerns for a compound? What is the evidence and how reliable is that evidence (journal impact factor, KOL) for findings associated with a compound?
24 | 13 | 8 | Given a target find me all actives against that target. Find/predict polypharmacology of actives. Determine ADMET profile of actives.
32 | 13 | 8 | For a given interaction profile, give me compounds similar to it.
37 | 13 | 8 | The current Factor Xa lead series is characterised by substructure X. Retrieve all bioactivity data in serine protease assays for molecules that contain substructure X.
38 | 13 | 8 | Retrieve all experimental and clinical data for a given list of compounds defined by their chemical structure (with options to match stereochemistry or not).
41 | 13 | 8 | A project is considering Protein Kinase C Alpha (PRKCA) as a target. What are all the compounds known to modulate the target directly? What are the compounds that may modulate the target directly? i.e. return all cmpds active in assays where the resolution is at least at the level of the target family (i.e. PKC) both from structured assay databases and the literature.
44 | 13 | 8 | Give me all active compounds on a given target with the relevant assay data
46 | 13 | 8 | Give me the compound(s) which hit most specifically the multiple targets in a given pathway (disease)
59 | 14 | 8 | Identify all known protein-protein interaction inhibitors
http://www.sciencedirect.com/science/article/pii/S1359644613001542
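To illustrate how a business question becomes a concrete query, the first question in the table (oxidoreductase inhibitors active below 100 nM in both human and mouse) can be sketched as a filter over bioactivity records. This is a minimal sketch only: the records, the field names and the `active_in_all_species` helper are hypothetical and do not reflect the Open PHACTS API or any real dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    compound: str
    target_class: str
    species: str
    potency_nm: float  # lower value means more potent


def active_in_all_species(records, target_class, species, max_nm):
    """Return compounds with potency below max_nm on target_class in every listed species."""
    hits = {}
    for r in records:
        if r.target_class == target_class and r.species in species and r.potency_nm < max_nm:
            hits.setdefault(r.compound, set()).add(r.species)
    # Keep only compounds that passed the threshold in all requested species.
    return sorted(c for c, seen in hits.items() if seen == set(species))


# Invented example data, for illustration only.
RECORDS = [
    Activity("cmpd-A", "oxidoreductase", "human", 40.0),
    Activity("cmpd-A", "oxidoreductase", "mouse", 85.0),
    Activity("cmpd-B", "oxidoreductase", "human", 20.0),
    Activity("cmpd-B", "oxidoreductase", "mouse", 450.0),
    Activity("cmpd-C", "kinase", "human", 5.0),
    Activity("cmpd-C", "kinase", "mouse", 7.0),
]

RESULT = active_in_all_species(RECORDS, "oxidoreductase", ["human", "mouse"], 100.0)
print(RESULT)
```

Here only cmpd-A passes: cmpd-B misses the threshold in mouse and cmpd-C belongs to a different target class.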
Platform
“Provenance Everywhere”
[Architecture diagram labels: Apps; Explorer; Standards; API; Linked Data API (RDF/XML, TTL, JSON); Domain Specific Services; Core Platform; Identity Resolution Service (“Adenosine receptor 2a”); Identifier Management Service (P12374, EC2.43.4, CS4532); Semantic Workflow Engine; Chemistry Registration, Normalisation & Q/C; Indexing; Data Cache (Virtuoso Triple Store); VoID and Nanopub descriptions over source databases (Db); Public Content; Commercial; Public Ontologies; User Annotations]
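The identity resolution step in the platform diagram takes a free-text label such as “Adenosine receptor 2a” and maps it to a concept with identifiers in the underlying sources. The sketch below shows that idea with plain dictionaries. It is an assumption-laden illustration, not the Open PHACTS service: the synonym table, the concept key and the placeholder identifiers are all hypothetical.

```python
from typing import Optional


def normalise(label: str) -> str:
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(label.lower().split())


# Hypothetical synonym table: normalised free-text label -> internal concept key.
SYNONYMS = {
    "adenosine receptor 2a": "concept:adenosine-a2a-receptor",
    "adenosine a2a receptor": "concept:adenosine-a2a-receptor",
    "adora2a": "concept:adenosine-a2a-receptor",
}

# Hypothetical cross-references: concept key -> identifier per source namespace.
# The identifier values are placeholders, not real accession numbers.
XREFS = {
    "concept:adenosine-a2a-receptor": {
        "protein_source": "PROT-0001",
        "pathway_source": "PATH-0042",
    },
}


def resolve(label: str) -> Optional[dict]:
    """Map a free-text label to a concept and its source identifiers, or None if unknown."""
    concept = SYNONYMS.get(normalise(label))
    if concept is None:
        return None
    return {"concept": concept, "identifiers": XREFS.get(concept, {})}


print(resolve("  Adenosine   Receptor 2A "))
print(resolve("unknown protein"))
```

Keeping label resolution separate from identifier mapping mirrors the two services named in the diagram; a real implementation would query curated vocabularies rather than an in-memory table.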
Present Content
Data Licensing Solution
Chose John Wilbanks as consultant
A framework built around STANDARD well-understood Creative Commons licences – and how they interoperate
Deal with the problems by:
Interoperable licences
Appropriate terms
Declare expectations to users and data publishers
One size won’t fit all requirements
It’s easy to integrate, difficult to integrate well:
What Is Gleevec?
Imatinib Mesylate
PubChem, Drugbank, ChemSpider
Dynamic Equality
Strict (Analysing) to Relaxed (Browsing)
chemspider:gleevec drugbank:gleevec
LinkSet#1 {
chemspider:gleevec hasParent imatinib ...
drugbank:gleevec exactMatch imatinib ...
}
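The dynamic equality idea can be sketched as follows: the same link set yields different sets of "equal" identifiers depending on which link predicates the user's lens accepts. This is a minimal sketch under stated assumptions. The two links are simplified from the slide (the elided parts are left out), and the assignment of predicates to the `strict` and `relaxed` lenses is an assumption made for illustration, not the project's actual configuration.

```python
# Simplified link set from the slide: (subject, predicate, object).
LINKSET = [
    ("chemspider:gleevec", "hasParent", "imatinib"),
    ("drugbank:gleevec", "exactMatch", "imatinib"),
]

# Assumed lens definitions: which predicates count as "the same thing".
LENSES = {
    "strict": {"exactMatch"},                # e.g. for analysis
    "relaxed": {"exactMatch", "hasParent"},  # e.g. for browsing
}


def equivalents(start, lens, linkset=LINKSET):
    """Return every identifier reachable from start via predicates the lens allows."""
    allowed = LENSES[lens]
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for subj, pred, obj in linkset:
            if pred not in allowed:
                continue
            # Links are followed in both directions.
            for here, there in ((subj, obj), (obj, subj)):
                if here == node and there not in seen:
                    seen.add(there)
                    frontier.append(there)
    return seen


STRICT = equivalents("drugbank:gleevec", "strict")
RELAXED = equivalents("drugbank:gleevec", "relaxed")
print(sorted(STRICT))
print(sorted(RELAXED))
```

Under the strict lens the DrugBank entry is only equated with imatinib; under the relaxed lens the ChemSpider entry joins the same group because the parent-compound link is also followed.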
Play! https://dev.openphacts.org/
APPS
http://explorer.openphacts.org
Example applications
Advanced analytics
ChemBioNavigator: Navigating at the interface of chemical and biological data with sorting and plotting options
TargetDossier: Interconnecting Open PHACTS with multiple target centric services. Exploring target similarity using diverse criteria
PharmaTrek: Interactive Polypharmacology space of experimental annotations
UTOPIA: Semantic enrichment of scientific PDFs
Predictions
GARFIELD: Prediction of target pharmacology based on the Similarity Ensemble Approach
eTOX connector: Automatic extraction of data for building predictive toxicology models in eTOX project
Uptake at AstraZeneca: a Use Case
Applying BioAssay Ontology to facilitate HTS analysis
Linda Zander Balderud, Ola Engkvist
Chemistry Innovation Centre, Discovery Sciences, AstraZeneca
Assay Informatics project
Benefits in Adopting BioAssay Ontology (BAO)
• Common language for assay annotation
• Improved project success analyses based on assay technologies
• Better understand the impact of technology artifacts like frequent hitters
• Assay design and screening cascade support during assay development in early projects
• Improved capability to perform combined data mining of internal and public data
[Image: FLIPR Tetra High Throughput Cellular Screening System (from Molecular Devices)]
The BioAssay Ontology (BAO)
Computational Science, University of Miami, USA
Domain:
• Assay design
• Assay format
• Detection technology
• Meta target
• Endpoint
• Perturbagen
BioAssay Ontology imports:
• NCBI taxonomy - organism names and IDs
• Uniprot - protein target names and IDs
• Unit Ontology - concentration and time unit terms
• Ontology of Biomedical Investigation – descriptions of biological assays
• Gene Ontology - biological processes
• Cell Line Ontology - cell line names
• CL – cell types
• UBERON – anatomical entities
• PATO – cell phenotype
• SAR connect – target classifications
[Diagram labels: BioAssay Ontology Assay Information and Analysis; Protein Origin; Cell Line; Screening Cascade; Assay Information; Signalling Pathways; Assay Success; Target hits, Results; External Assays (PubChem, ChEMBL)]
Migration to BAO
Annotation of HTS assays
Manual annotation of protocols

HTS assay: reporter gene assay
• Assay method: reporter gene method: beta lactamase induction
• Detection technology: FRET
• Bioassay: beta lactamase assay
• Assay kit: LiveBLAzer FRET - B/G Loading Kit
• Wavelength: ex 405 em 460, 535
• Biological process
• Disease

HTS assay: FLIPR
• Assay method: molecular redistribution determination assay
• Detection technology: fluorescence intensity
• Bioassay: calcium redistribution assay
• Assay kit: Fluo-8 No Wash Calcium Assay Kit
• Wavelength: ex 480 em 530
• Biological process
• Disease

Over 900 PubChem assays have been annotated by the BioAssay Ontology team
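Once protocols are annotated with a common vocabulary, the annotations can be queried uniformly. The sketch below records the two example assays from the slide as plain dictionaries and filters them by annotation field. The field names and the `find_assays` helper are hypothetical, and the pairing of each wavelength entry with its assay is an interpretation of the slide layout based on the kit names.

```python
# The two annotated HTS assays from the slide, as simple records.
ANNOTATIONS = [
    {
        "hts_assay": "reporter gene assay",
        "assay_method": "reporter gene method: beta lactamase induction",
        "detection_technology": "FRET",
        "bioassay": "beta lactamase assay",
        "assay_kit": "LiveBLAzer FRET - B/G Loading Kit",
        "wavelength": {"ex": 405, "em": [460, 535]},
    },
    {
        "hts_assay": "FLIPR",
        "assay_method": "molecular redistribution determination assay",
        "detection_technology": "fluorescence intensity",
        "bioassay": "calcium redistribution assay",
        "assay_kit": "Fluo-8 No Wash Calcium Assay Kit",
        "wavelength": {"ex": 480, "em": [530]},
    },
]


def find_assays(annotations, **criteria):
    """Return the names of assays whose annotation matches every given field value."""
    return [
        a["hts_assay"]
        for a in annotations
        if all(a.get(field) == value for field, value in criteria.items())
    ]


FRET_ASSAYS = find_assays(ANNOTATIONS, detection_technology="FRET")
CALCIUM_ASSAYS = find_assays(ANNOTATIONS, bioassay="calcium redistribution assay")
print(FRET_ASSAYS)
print(CALCIUM_ASSAYS)
```

Because both in-house and public assays would carry the same fields, the same filter could in principle be run over a combined list, which is the kind of combined data mining the benefits list refers to.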
Assay Development Support
Comparison study between AstraZeneca and PubChem HTS assays
412 in-house HTS assays since 2005 have been annotated according to the BioAssay Ontology. The assay design and technology of the annotated assays were analyzed together with 239 primary assays from PubChem. The analyzed PubChem assays are biochemical assays, assays detected by luminescence and/or assays using GPCR targets.
Of the annotated assays, 515 used human targets, and in combination 311 different human targets were represented in the study.
15 of the in-house targets were also screened in at least one PubChem assay. Eight of these were GPCR targets.
[Venn diagram of human targets: AstraZeneca 194, shared 15, PubChem 102]
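The target counts on this slide are mutually consistent, which a few lines of arithmetic confirm. The sketch assumes that the Venn diagram figures 194 and 102 are the targets exclusive to AstraZeneca and PubChem respectively, with 15 shared; that reading is an assumption, supported by the fact that the three numbers sum to the stated 311.

```python
# Venn diagram figures, read as exclusive counts (an assumption).
AZ_ONLY = 194
PUBCHEM_ONLY = 102
SHARED = 15

# Assay counts stated in the text.
AZ_ASSAYS = 412
PUBCHEM_ASSAYS = 239

total_targets = AZ_ONLY + SHARED + PUBCHEM_ONLY
az_targets = AZ_ONLY + SHARED
pubchem_targets = PUBCHEM_ONLY + SHARED
total_assays = AZ_ASSAYS + PUBCHEM_ASSAYS
shared_share_of_az = SHARED / az_targets

print(f"distinct human targets: {total_targets}")
print(f"AstraZeneca targets: {az_targets}, PubChem targets: {pubchem_targets}")
print(f"assays analysed: {total_assays}")
print(f"share of AstraZeneca targets also in PubChem: {shared_share_of_az:.1%}")
```

On this reading, only about 7% of the in-house human targets were also screened in the analysed PubChem assays.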
Assay Development Support
Detection Technology of AZ and PubChem Biochemical Assays
[Charts: AstraZeneca; PubChem]
Assay Development Support
Assay design of in-house and PubChem GPCR HTS
[Chart by GPCR target class]
One explanation for the low usage of the cAMP redistribution method among the annotated PubChem assays could be that no class B GPCRs have been screened.
Sustaining The Project
The Open PHACTS Foundation
Benefits…
• Access to a wide range of interconnected data – easily jump between pharmacology, chemistry, disease, pathways and other databases without having to perform complex mapping operations
• Query by data type, not by data source (“Protein Information” not “Uniprot Information”)
• API queries that seamlessly connect data (for instance the Pharmacology query draws data from Chembl, ChemSpider, ConceptWiki and Drugbank)
• Strong chemistry representation – all chemicals reprocessed via Open PHACTS chemical registry to ensure consistency across databases
• Built using open community standards, not an ad-hoc solution. Developed in conjunction with 8 major pharma (so your app will speak their language!)
• Simple, flexible data-joining (join compound data ignoring salt forms, join protein data ignoring species)
• Provenance everywhere – every single data point tagged with source, version, author, etc
• Nanopublication-enabled. Access to a rich dataset of established and emerging biomedical “assertions”
• Professionally Hosted (Continually Monitored)
• Developer-friendly JSON/XML methods. Consistent API for multiple services
• Seamless data upgrades. We manage updates so you don’t have to
• Community-curation tools to enhance and correct content
• Access to a rich application network (many different App builders)
• Toolkits to support many different languages, workflow engines and user applications
• Private and secure, suitable for confidential analyses
• Active & still growing through a unique public-private partnership
Kick-Starting Sustainability
[Diagram labels: Open PHACTS; Apps; API Users; Industry; Grants; Collaborations]
Pfizer Limited – Coordinator
Universität Wien – Managing entity
Spanish National Cancer Research Centre
University of Manchester
Novartis
Merck Serono
Technical University of Denmark
University of Hamburg, Center for Bioinformatics
BioSolveIT GmbH
Consorci Mar Parc de Salut de Barcelona
Maastricht University
Aqnowledge
University of Santiago de Compostela
Rheinische Friedrich-Wilhelms-Universität Bonn
H. Lundbeck A/S
Eli Lilly
Netherlands Bioinformatics Centre
Swiss Institute of Bioinformatics
ConnectedDiscovery
EMBL-European Bioinformatics Institute
Leiden University Medical Centre
Royal Society of Chemistry
Vrije Universiteit Amsterdam
AstraZeneca
GlaxoSmithKline
Esteve
Janssen
OpenLink
[email protected] @Open_PHACTS Open PHACTS