analyzing user interactions with biomedical ontologies: a visual perspective
TRANSCRIPT
AnalyzingUserInteractionswithBiomedicalOntologies:AVisualPerspective
TuesdayTalk,14thNovember2017
MAU L I K K AMDA R , S IMON WA L K , T A N I A T U DO R A C H E , MA R K MU S E N
StanfordCenterforBiomedicalInforma:[email protected]
Uses: Knowledge management, decision support, semantic search, data annotation, data integration, reasoning …
…
Uses: Knowledge management, decision support, semantic search, data annotation, data integration, reasoning …
…
hGp://bioportal.bioontology.org/
hGp://bioportal.bioontology.org/
hGp://bioportal.bioontology.org/
hGp://bioportal.bioontology.org/
hGp://bioportal.bioontology.org/
Whatthistalkisabout…
• BiOnIC-CatalogofUserInteracRonswith
BiomedicalOntologies
• VisIOnapplicaRonwithembeddedvisualizaRons.
AnalysisofBioPortalWebUI
exploraRonandAPI
queryingstrategies,and
correlaRonwithusage.
Whatthistalkisabout…
• BiOnIC-CatalogofUserInteracRonswith
BiomedicalOntologies
• VisIOnapplicaRonwithembeddedvisualizaRons.
AnalysisofBioPortalWebUI
exploraRonandAPI
queryingstrategies,and
correlaRonwithusage.
BenefitsofanalyzinguserinteracRons
Ø OntologyEngineers:v IdenRfyexploraRonandqueryingpaGernsv Understandontologyusageandreusev PruneunwantedclassesandrelaRons
Ø OntologyRepositoryMaintainers:v Categorizeuserbehaviorsv Developintelligentinterfacesv ProvidetargetedrecommendaRons
Ø BiomedicalResearchers:v IdenRfytemporalresearchtrends
v IdenRfyfrequentlyaccessedclasses
BiOnIC:ACatalogofUserInteracRonswith
BiomedicalOntologies
hGp://onto-apps.stanford.edu/bionic/datasets
BiOnICdatasetscreaRon
NCBO API Usage
API R
eque
sts
per M
onth
2013−O
ct
2014−J
an
2014−A
pr
2014−J
ul
2014−O
ct
2015−J
an
2015−A
pr
2015−J
ul
2015−O
ct
2016−J
an
2016−A
pr
2016−J
ul
2016−O
ct
2M8M
32M
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
NCBO Website Traffic
Occ
urre
nces
per
Mon
th
2009−J
an
2010−J
an
2011−J
an
2012−J
an
2013−J
an
2014−J
an
2015−J
an
2016−J
an
010
0K20
0K
Page RequestsUnique IP Addresses
BiOnICdatasetscreaRon
• Removingrobot/invalid
requests
• Normalizingontology
idenRfiersandclassIRIs
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
• January2015version.
• Ontologiesshouldhave
classesthatarereusedby
othersORreuseclasses
fromotherontologies.
• Ontologiesshouldhave
minimumof10unique
usersviaWebUIandAPI
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
ClassSta:s:csDatasets
Foreachclassineachontology:
• AccessANributes:o TotalIPRequests(WebUI/API)
o UniqueIPRequests(WebUI/API)
• ReuseANributes:o Numberofontologiesreusingaclass
• StructuralANributes:o Numberofparent/child/siblingclasses
o Depthfromontologyroot
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
UserInterac:onSequencesDatasets
Ontology1
Ontology2
ClassDepth->
2a
1
3a 4a
4b
4c2b
3b
3c
1’
2a’
2b’
2c’
3a’
3b’
3c’
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
UserInterac:onSequencesDatasets
Ontology1
Ontology2
ClassDepth->
2a
1
3a 4a
4b
4c2b
3b
3c
1’
2a’
2b’
2c’
3a’
3b’
3c’
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
UserInterac:onSequencesDatasets
Ontology1
Ontology2
ClassDepth->
2a
1
3a 4a
4b
4c2b
3b
3c
1’
2a’
2b’
2c’
3a’
3b’
3c’
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
UserInterac:onSequencesDatasets
Ontology1
Ontology2
ClassDepth->
2a
1
3a 4a
4b
4c2b
3b
3c
1’
2a’
2b’
2c’
3a’
3b’
3c’
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
UserInterac:onSequencesDatasets
Ontology1
Ontology2
ClassDepth->
2a
1
3a 4a
4b
4c2b
3b
3c
1’
2a’
2b’
2c’
3a’
3b’
3c’
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
UserInterac:onSequencesDatasets
Ontology1
Ontology2
ClassDepth->
2a
1
3a 4a
4b
4c2b
3b
3c
1’
2a’
2b’
2c’
3a’
3b’
3c’
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
UserInterac:onSequencesDatasets
Ontology1
Ontology2
ClassDepth->
2a
1
3a 4a
4b
4c2b
3b
3c
1’
2a’
2b’
2c’
3a’
3b’
3c’
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
UserInterac:onSequencesDatasets
Ontology1
Ontology2
ClassDepth->
2a
1
3a 4a
4b
4c2b
3b
3c
1’
2a’
2b’
2c’
3a’
3b’
3c’
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
UserInterac:onSequencesDatasets
Ontology1
Ontology2
ClassDepth->
2a
1
3a 4a
4b
4c2b
3b
3c
1’
2a’
2b’
2c’
3a’
3b’
3c’
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
UserInterac:onSequencesDatasets
Ontology1
Ontology2
ClassDepth->
2a
1
3a 4a
4b
4c2b
3b
3c
1’
2a’
2b’
2c’
3a’
3b’
3c’
2a’ 1’ 2b’ 3b’
2a 3a 4a 3a1
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
UserInterac:onSequencesDatasets
Ontology1
Ontology2
Filtering
AccessLogs
Filtering
Ontologies
CompuRng
ClassCounts
CompuRng
Sequences
Anonymizing
Data
BiOnICdatasetscreaRon
Anonymiza:onSteps
• IPaddressesanonymizedusinguniqueSHA-224hash-encoded
useridenRfiersgeneratedfrom“user_<Random
String>_<Random_Integer>”.
• e.g.39fd4e6d569a034973g61bb392a694d4eabe1ef98c43ee68ca2fc86
• AbsoluteTime-stampsconvertedtorelaRveRme-stamps,with
respecttofirstinteracRonwithBioPortalrepository.
• e.g.0,2757,2786,3586,3618,3803,3959,4047,5111(s),…
BiOnICschematomodelstaRsRcsandsequencesdata
countStat
bionic:CountStat
bionic:ReuseCount- reuseType
- reusingOntologies
bionic:RequestCount- accessType
- year
- totalUsers
- uniqueUsers
prov:Agent bionic:Sequence- accessType
- totalTime
- uniqueClasses
bionic:SeqEn:ty- rela6veTimestamp
bionic:Ontology- skos:prefLabel
- totalClasses
- maxDepth
owl:Class-skos:prefLabel
skos:Collec:on skos:Concept
begin
end
nextEn6ty
class
class
requests
skos:member
bionic:SeqDataset- accessType
bionic:StatDatasetdcat:Dataset
sequence
classInfo
ontology
ontology
bionic:ClassInfo- siblings
- directParents
- directChildren
- classDepth
class
subClassOf
ontology
SKOS,PROVandDCATstandardsarereusedintheBiOnICschema.
hGp://onto-apps.stanford.edu/bionic/datasets
BiOnICdatasets
hGp://onto-apps.stanford.edu/bionic/datasets
BiOnICdatasets
hGp://www.rdjdt.org/
hGp://onto-apps.stanford.edu/bionic/datasets
BiOnICdatasets
hGp://www.rdjdt.org/
SPARQLServer
hGp://onto-apps.stanford.edu/bionic/datasets
BiOnICdatasets
hGp://www.rdjdt.org/
SPARQLServer
BioPortalSPARQLEndpoint
hGp://onto-apps.stanford.edu/bionic/datasets
BiOnICdatasets
hGp://www.rdjdt.org/
SPARQLServer
BioPortalSPARQLEndpoint
1. HowmanyagentsclickonthesubclassesaIerexploringthe
“ProteinTransmembraneAc6vity”classinGeneOntology?
2. Whatistheaverage6mespentbyagentsonclassessimilarto
“CellGrowth”classinGeneOntology?
VisIOn(VisualizingOntologyInteracRons)WebApplicaRon
hGp://onto-apps.stanford.edu/vision
VisualizingOntologyStructurewithCountData
Force-directedNetwork IndentedTree
VisualizingOntologyStructurewithCountData
Force-directedNetwork IndentedTree
PolygOntovisualizaRonsofinteracRonsequences
Kamdar,etal.VisualizingRequestandReuseDataacrossBiomedicalOntologies.JournalofWebSeman:cs,(underreview)
PolygOntovisualizaRonsofinteracRonsequences
Kamdar,etal.VisualizingRequestandReuseDataacrossBiomedicalOntologies.JournalofWebSeman:cs,(underreview)
PolygOntovisualizaRonsofinteracRonsequences
Kamdar,etal.VisualizingRequestandReuseDataacrossBiomedicalOntologies.JournalofWebSeman:cs,(underreview)
PolygOntovisualizaRonsofinteracRonsequences
Kamdar,etal.VisualizingRequestandReuseDataacrossBiomedicalOntologies.JournalofWebSeman:cs,(underreview)
OntologyStructure
PolygOntovisualizaRonsofinteracRonsequences
Kamdar,etal.VisualizingRequestandReuseDataacrossBiomedicalOntologies.JournalofWebSeman:cs,(underreview)
UserSequence
OntologyStructure
PolygOntovisualizaRonsofinteracRonsequences
Kamdar,etal.VisualizingRequestandReuseDataacrossBiomedicalOntologies.JournalofWebSeman:cs,(underreview)
UserSequence
SingleClick
OntologyStructure
Whatthistalkisabout…
• BiOnIC-CatalogofUserInteracRonswith
BiomedicalOntologies
• VisIOnapplicaRonwithembeddedvisualizaRons.
AnalysisofBioPortalWebUI
exploraRonandAPI
queryingstrategies,and
correlaRonwithusage.
CharacterisRcsoftheBiOnICCatalog
• WebUIAccess:5.4Mclassrequests,1Muniqueagents
• APIAccess:67.2Mclassrequests,205Kuniqueagents
• 255biomedicalontologies
• DoBioPortalWebUIexploraRonandAPI
queryingstrategiescorrelatewitheachother?
• DoBioPortalWebUIexploraRonandAPI
queryingstrategiesinformontologyusage?
– Ontologyusage,inthiscontext,meansreusein
otherontologies,andusageindataannotaRon.
InterfaceinfluencesinbrowsingandqueryingNum
bero
fUniqu
eAP
IUsers(LogScale)
NumberofUniqueWebUIUsers(LogScale)
1000
10
100
10 100 1000
1
1
Certainclassesbrowsedorqueriedsignificantlymore.
InterfaceinfluencesinbrowsingandqueryingNum
bero
fUniqu
eAP
IUsers(LogScale)
NumberofUniqueWebUIUsers(LogScale)
1000
10
100
10 100 1000
FemaleReproduc:ve
System
1
1
Certainclassesbrowsedorqueriedsignificantlymore.
Dermis
InterfaceinfluencesinbrowsingandqueryingDysmorphicSyndrome
Nightblindness
Num
bero
fUniqu
eAP
IUsers(LogScale)
NumberofUniqueWebUIUsers(LogScale)
1000
10
100
10 100 1000
FemaleReproduc:ve
System
1
1
Certainclassesbrowsedorqueriedsignificantlymore.
Dermis
ExploraRonandQueryingbehavioralpaGerns
• Certainclassesinthelowerlevelsoftheontologicalhierarchyarerarelybrowsedandqueried–thismaybeanarRfactoftheindentedtreevisualizaRon.
• Moretriangularpolygons(1parent->2childrenclasses,or2parents->1childclass)observedinWebUIAccesspolygonduetoindentedtreevisualizaRon.
ExploraRonandQueryingbehavioralpaGerns
300,000+classes200,000+users
DoBioPortalWebUIexploraRonandAPI
queryingstrategiescorrelatewitheachother?
• SmallproporRonsofontologicalcontentexploredorqueriedfor
severalontologies,andminimalSpearmanCorrelaRonand
JaccardSimilarityobservedbetweenconsumpRonstrategies.
• VaryingconsumpRonstrategies(triangularpolygons,blind
querying)acrosstheinterfaces–thesestrategiesaregenerallyuniformacrossontologies,withafewexcepRons(ChEBI).
• ClassesinthelowerlayersoftheontologicalhierarchyarerarelybrowsedorqueriedusingtheWebUIorAPI.
Ontologyusage:Reuseinotherontologies
Dis:nctclasssetsreusedinotherontologies
Ontologyusage:GWASCatalogandPubChemannotaRons
Classesinthehigherlayersoftheontological
hierarchyareveryabstractandareneverusedfor
dataannotaRons.
Commonconsensusacross12differentdatasources
inLinkedOpenDatacloud,whereChEBIontologyis
usedfordataintegraRon.
Ontologyusage:GWASCatalogandPubChemannotaRons
Classesinthehigherlayersoftheontological
hierarchyareveryabstractandareneverusedfor
dataannotaRons.
Commonconsensusacross12differentdatasources
inLinkedOpenDatacloud,whereChEBIontologyis
usedfordataintegraRon.
DoBioPortalWebUIexploraRonandAPI
queryingstrategiesinformontologyusage?
• MinimalcorrelaRonandsimilariResbetweenexploraRonand
queryingstrategies,andusage(reuseanddataannotaRons)
• Ontologyreuseoccursfromclasseslocatedinthehigherlayersof
theontologicalhierarchyforsemanRcinteroperabilitybetween
ontologies.
• However,theseclassesareveryabstractandnotveryrelevantfordataannotaRonordataintegraRon,whereclassesinthe
lowerlayersofhierarchyarehighlyused.
– Theseclassesarerarelyexploredorqueried,weneedbeGerinterfaces!
NovelresearchdirecRonsmaybeenabled
throughtheBiOnICandVisIOnresources
• CategorizeuserbrowsingbehaviorsbyincorporaRngthestructuralfeaturesoftheontologyclasses.
• DeveloppersonalizeduserinterfacesforontologynavigaRon,withpredicRonsofthenextclassthatauserislikelyto
access,ortargetedrecommendaRonsbasedonusertype
(recurrentneuralnetworks,collaboraRvefiltering,…).
• DevelopadvancedmethodsforontologysummarizaRonand
modularizaRon,usingBiOnICdatasetsasfeatures.
…
Acknowledgments
MusenLab,Stanford
BMIPhDProgram,Stanford
USNIHGrants
U54-HG004028
GM086587
[email protected]://onto-apps.stanford.edu/bionic
hGp://onto-apps.stanford.edu/vision