On our way to information overload?
On our way to
Information Overload?
Or to prevent it by Appropriate use of Technology?
C19881 0.99
C92992 0.67
C02002 0.66
C99229 0.44
C00392 0.33
C93939 0.21
consolidated knowledge
Collexis Fingerprints (CFP’s)
English
French
Spanish
People: medical researchers around the world
Activities: in electronic text, like projects, publications, Medline abstracts...
Disease: #12674
Multilingual Thesaurus Indexer: matches keywords, translates them to identical numbers and ranks them by their relevance
Maladie: #12674
Enfermedad: #12674
Malaria: #24530
Hospital: #19994
Paludisme: #24530
Paludismo: #24530
Hôpital : #19994
Hospital: #19994
...
...
...
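The indexing step on this slide can be sketched in a few lines of Python. This is only an illustration: the toy thesaurus reuses the concept numbers shown above, while the tokenizer and the frequency-based relevance score are assumptions made for the example, not the actual Collexis ranking.

```python
import re
from collections import Counter

# Toy multilingual thesaurus, using the concept numbers from the slide.
THESAURUS = {
    "disease": 12674, "maladie": 12674, "enfermedad": 12674,
    "malaria": 24530, "paludisme": 24530, "paludismo": 24530,
    "hospital": 19994, "hôpital": 19994,
}


def index_text(text):
    """Return a fingerprint {concept number: relevance}, highest first.

    Relevance is the concept's count divided by the highest count.
    This scoring is an assumed stand-in for the real ranking.
    """
    words = re.findall(r"\w+", text.lower())
    counts = Counter(THESAURUS[w] for w in words if w in THESAURUS)
    if not counts:
        return {}
    top = max(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {concept: round(n / top, 2) for concept, n in ranked}


english = index_text("Malaria is a disease. Malaria patients visit the hospital.")
spanish = index_text("El paludismo es una enfermedad. El paludismo se trata en el hospital.")
```

Because both sentences map to the same concept numbers, the English and Spanish texts end up with the same fingerprint, which is the point of the common language.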
The Common Language
Each activity is represented as a set of keyword numbers ranked by their relevance
#4256 : 1.0, #3627 : 0.8, #19994 : 0.5, #28746 : 0.3, #32874 : 0.1, #32874 : 0.1, #32874 : 0.1
#14325 : 1.0, #3627 : 0.8, #19994 : 0.5, #28746 : 0.3, #32874 : 0.1, #32874 : 0.1, #32874 : 0.1
#85643 : 1.0, #3627 : 0.8, #19994 : 0.5, #28746 : 0.3, #32874 : 0.1, #32874 : 0.1, #32874 : 0.1
#17345 : 1.0, #3627 : 0.8, #19994 : 0.5, #28746 : 0.3, #32874 : 0.1, #1c8456 : 0.1, #00356 : 0.1
„Collexion“ of activities
You:
#17345 : 1.0, #3627 : 0.8, #19994 : 0.5, #28746 : 0.3, #32874 : 0.1
Your activity as text
Submitted and indexed to keyword numbers
Find similar activities and the people behind them
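Finding similar activities amounts to comparing fingerprints. The slides do not say which similarity measure is used, so the sketch below assumes cosine similarity over the keyword weights. The fingerprints for "you" and "activity_a" are taken from the slides; "activity_b" and all names are hypothetical.

```python
import math


def cosine(fp_a, fp_b):
    """Cosine similarity between two fingerprints ({concept: weight})."""
    shared = set(fp_a) & set(fp_b)
    dot = sum(fp_a[c] * fp_b[c] for c in shared)
    norm_a = math.sqrt(sum(w * w for w in fp_a.values()))
    norm_b = math.sqrt(sum(w * w for w in fp_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, collection, top_n=3):
    """Rank the activities in `collection` by similarity to `query`."""
    scored = [(cosine(query, fp), name) for name, fp in collection.items()]
    return sorted(scored, reverse=True)[:top_n]


you = {17345: 1.0, 3627: 0.8, 19994: 0.5, 28746: 0.3, 32874: 0.1}
collection = {
    "activity_a": {4256: 1.0, 3627: 0.8, 19994: 0.5, 28746: 0.3, 32874: 0.1},
    "activity_b": {24530: 1.0, 12674: 0.6},  # hypothetical, shares no concepts
}
ranking = most_similar(you, collection)
```

Since fingerprints hold only concept numbers, the comparison is independent of the language the original text was written in.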
Cross-language networking
The Early evolution of Fingerprint Manipulation
contents fingerprints
add
people fingerprints
add
organization fingerprint
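The "add" steps above suggest that content fingerprints are summed into a person's fingerprint, and people's fingerprints into an organization's. A minimal sketch of that idea follows. The rescaling so that the strongest concept gets weight 1.0 is an assumption, and the function name is hypothetical.

```python
from collections import defaultdict


def add_fingerprints(fingerprints):
    """Sum fingerprints per concept, then rescale so the top weight is 1.0."""
    totals = defaultdict(float)
    for fp in fingerprints:
        for concept, weight in fp.items():
            totals[concept] += weight
    if not totals:
        return {}
    top = max(totals.values())
    return {concept: weight / top for concept, weight in totals.items()}


# Two content fingerprints (concept numbers taken from the slides).
doc_a = {4256: 1.0, 3627: 0.8}
doc_b = {14325: 1.0, 3627: 0.8}

person = add_fingerprints([doc_a, doc_b])      # contents -> person
organization = add_fingerprints([person])      # people -> organization
```

A concept that recurs across a person's documents, here #3627, rises to the top of the aggregated fingerprint even though it led neither document.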
Jobs, CVs, Skills
Articles, books
Emails, Word, RFPs
BIOSEMANTICS
• “Cellese”: the language that cells use to communicate internally and externally.
• The Molecular Language and its biological MEANING
• The Group
– Jan Kors PhD
– Erik van Mulligen PhD
– Bob Schijvenaars PhD
– Marc Weeber PhD
– Christiaan v.d. Eyck MSc
– Rob Jelier PhD
– Barend Mons PhD
– Johan van der Lei PhD
SERENDIP
Beyond Publication
Semantic meta-analysis of massive data and information sources for discovery
Bsik 2003
A consortium to combine State-of-the-art Information and Knowledge Mining Technologies
To support:
• Thesaurus and ontology enrichment
• Disambiguation of concepts
• Semantic meta-analysis of massive information
To enable:
• Information-based discovery
• Evidence-based policy making
Thesaurus and Ontology Enrichment
• New concepts
• Synonyms
• Homonyms
• Genes, Proteins
• Pictures
[Diagram: enrichment workflow, with steps numbered as on the slide]
1 Fingerprints (known concepts)
2 NLP
3 Validation
4 FUA
Free text; Unexplained text (XML); Potential concepts
Thesauri: MeSH, HUGO, SwissProt, SAGE, others
Partners: E-BioSci, EMBO, Elsevier, TNO, LUMC, HUGONC, Genebio, AMC, EUR, UVA
SERENDIP
Too much to read: major trends foreseen:
• From Reading to Consulting
• From Reading to Meta-analysis
• From Text to Knowledge Representations
C19881 0.99
C92992 0.67
C02002 0.66
C99229 0.44
C00392 0.33
C93939 0.21
Semantic types
Co-occurrence data
The first step: to the Conceptual Semantic Network
Calcium deposition; Pleocytosis; Basal Ganglia; Encephalopathy; Cerebrospinal Fluid; Tomography, X-Ray Computed; Parents; Family; Aicardi Goutieres syndrome; Ferrocalcinotic deposition; Spastic quadriplegia; Fahr disease; Microcephaly; AGS1
x
G-protein coupled receptors; G-substrate; Lipoid dermatoarthritis; Receptors; Complement Factor B; RNA, Complementary; Xenopus oocyte; AGS1
SwissProt: Activator of G-protein signaling 1 (AGS1)
OMIM *225750: AICARDI-GOUTIERES SYNDROME 1; AGS1
Aicardi Goutieres syndrome 1Heterogeneity Linkage (Genetics) Clinical diagnosis Family 2 AGS1 **Lod Score Genetic Heterogeneity analysis Toxoplasmosis Calcium deposition 3 Encephalopathy 4 Cadmium Genus: Human cytomegalovir... Cerebrospinal fluid abnorm. 5.. Interferon-alpha Chromosomes Viral Child Head Tricuspid Valve Stenosis
Fingerprinting
disambiguation
ACS
META-ANALYSIS
Applications
• Cross-language, jargon and cross-system matching (implemented): www.sharingpoint.shared-global.org
• Information-based discovery (Research)
• Community building (Experts, Policy Making)
• Trendwatching and Indicators (Policy Making)
Seed-Term based Conceptual Semantic Networks
Clustering of genes on-the-fly
Predicting new knowledge ?
III = Distribution over distance categories of concept pairs without co-occurrence in the learning set.
IV = Distance categories of concept pairs, related to the probability that there is no explicit relationship or co-occurrence in Medline (zero ratio). A ratio of 0 means that an automatic Medline query joining the concept pair with “AND” returns 0 hits.
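One possible reading of the "zero ratio" in panel IV is the share of concept pairs in each distance category whose "AND" query returns no hits. The sketch below computes that share under this assumed interpretation. The hit counts are invented purely for illustration, and no real Medline query is made.

```python
from collections import defaultdict


def zero_ratio_by_distance(pairs):
    """Share of concept pairs with zero co-occurrence hits, per category.

    `pairs` is an iterable of (distance_category, hit_count), where
    hit_count stands for the number of results of "concept A AND concept B".
    """
    totals = defaultdict(int)
    zeros = defaultdict(int)
    for category, hits in pairs:
        totals[category] += 1
        if hits == 0:
            zeros[category] += 1
    return {cat: zeros[cat] / totals[cat] for cat in sorted(totals)}


# Hypothetical data: (distance category, number of hits for the pair).
observed = [
    (1, 12), (1, 3), (1, 0), (1, 7),   # concepts that lie close together
    (2, 0), (2, 0), (2, 1), (2, 0),    # concepts further apart
]
ratios = zero_ratio_by_distance(observed)
```

Pairs that are close in the concept space yet have no co-occurrence would then be the candidates for not-yet-described relationships.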
New Drug discovery ?
Semantic Filtering
Knowledge Maps, Nature Biotechnology Map
Knowledge Maps: Medline Bioterrorism Map 1997
Knowledge Maps: Medline Bioterrorism Map 2001
Private Research
DC
Public
E-BioSci, Pharma etc.
ORIEL, SERENDIP, FP6 etc.
I-Research, Ministries, WHO, FAO etc.
SHARED, BIREME/VHL, EDCTP, Oxford initiative etc.