Francesca Di Donato, Dipartimento di scienze della politica, Università di Pisa
Designing a Semantic Web path to e-Science
SWAP 2005 - Semantic Web Applications and Perspectives
2nd Italian Semantic Web Workshop - Trento, Faculty of Economics, 14-15-16 December, 2005
Premises
Problems to solve…
Look! She thinks hypertextual!!
Yes… But she still publishes texts!
Science and the Web. Two models of selection
[Diagram labels: Gatekeepers; Author; Reader]
Selection and the Semantic Web
How can I find what I’m looking for?
• Selection: to find what you are looking for
Selection is user-driven
A necessary condition for selecting information
is access to information...
Conditions
Open Access to Scientific Knowledge: OAI
• 1991: ArXiv
• 1999: Santa Fe Convention
• 2001: OAI-PMH (Protocol for Metadata Harvesting)
• 2005: More than 250 repositories are connected through OAI-PMH
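A minimal sketch of what harvesting over OAI-PMH can look like, assuming a repository that exposes unqualified Dublin Core (`oai_dc`). The endpoint URL, the sample response, and the function names are hypothetical and used only for illustration; `verb=ListRecords` and `metadataPrefix` are request parameters defined by the protocol. The sample is parsed offline, so no network access is needed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"

# XML namespaces used by OAI-PMH responses carrying Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract identifier, titles and creators from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "titles": [e.text for e in record.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in record.iterfind(".//dc:creator", NS)],
        })
    return records


# Invented sample response, shaped like an OAI-PMH ListRecords reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An invented example article</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

request_url = list_records_url(BASE_URL)
records = parse_records(SAMPLE_RESPONSE)
print(request_url)
print(records)
```

Because every compliant repository answers the same kind of request, one harvester of this shape can in principle collect metadata from all of them.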
Open Access to Scientific Knowledge: policies
Berlin 3, March 2005:
Agreed Recommendation:
"In order to implement the Berlin Declaration institutions should implement a policy to:
1. require their researchers to deposit a copy of all their published articles in an open access repository
and
2. encourage their researchers to publish their research articles in open access journals where a suitable journal exists (and provide the support to enable that to happen)."
Berlin Declaration
Signature:
Open Access to Scientific Knowledge: Numbers
• More than 150 institutions have signed the Berlin Declaration
• More than 70 out of 77 Italian Rectors are among them
• More than 270 OAI-PMH-compliant repositories
• More than 1800 open access journals
A vast amount of data and metadata
and
a signed agreement on what to do with them are available
A possible path to e-Science
1. Extracting Hidden Semantics
Dr. Peter Murray-Rust
Towards a Chemical Semantic Web:
Examples of what a robot could do:
Find:
* published molecules that obey "druggability" criteria
* reactions that create carbon-halogen bonds
* phase diagrams for lipid mixtures
...more ambitious...
* read J.Med.Chem and compute geometries and energies for all new molecules
* calculate binding to HIV protease
* order the chemicals required to synthesise them and check safety
* synthesise and test them
Semantic Web demos:
* WWMM: an open, non-centralized, peer-to-peer collection of molecules and properties
* Understanding data in "free-text" (OSCAR)
* NesC:
2. Combining multiple quality criteria
The importance of usage information!!!!
• Recorded in the present (usage), not 3-4 years after the fact (citation)
• Unlimited access, unlimited sample size
• Already recorded locally at many different information resources
• Reduced “social desirability bias”
• Recorded at all stages of the scholarly process
• Applies to all units of scholarly communication
J. Bollen, H. Van de Sompel, J. Smith, and R. Luce. Toward alternative metrics of journal impact: a comparison of download and citation data. Information Processing and Management, 41(6):1419-1440, 2005.
Prof. Johan Bollen
Clickstream/data mining approach
1) When a user downloads A and B, A and B may be related.
2) The co-download frequency corresponds to the degree of relatedness (all docs)
Citation
1) When an author cites B from A, A and B are related.
2) The frequency of citation corresponds to the degree of relatedness (journals)
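A minimal sketch of the clickstream idea, assuming download logs have already been grouped into per-user sessions. The session data and the function name `co_download_counts` are invented for illustration; this is not the method of the cited paper, only a plain pair count.

```python
from collections import Counter
from itertools import combinations


def co_download_counts(sessions):
    """Count how often each pair of documents is downloaded in the same session."""
    counts = Counter()
    for docs in sessions:
        # Deduplicate within a session, then sort so that (a, b) and (b, a)
        # are counted as the same unordered pair.
        for a, b in combinations(sorted(set(docs)), 2):
            counts[(a, b)] += 1
    return counts


# Invented sessions: each list holds the documents one user downloaded.
sessions = [
    ["doc1", "doc2", "doc3"],
    ["doc1", "doc2"],
    ["doc2", "doc3"],
    ["doc1", "doc2", "doc2"],
]

counts = co_download_counts(sessions)
for pair, frequency in counts.most_common():
    print(pair, frequency)
```

A higher count is read as a stronger relation between the two documents. The same function applied to reference lists instead of sessions would give a co-citation count, which is one way to compare the two approaches on equal terms.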
3. OA + Semantic Web: one example
Other measures of research impact:
- Degree centrality: the sum of the number of relationships pointing to and from an actor, i.e., their in- and out-degree, normalized by the total number of relationships in the social network.
- Closeness centrality: the average shortest path distance of an actor to all other actors in the network.
- Betweenness centrality: the frequency with which an actor is part of the shortest path between any pair of agents in the network.
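The three measures can be sketched in plain Python on a toy network. The graph, the actor names, and the function names are assumptions made for illustration. Two choices here are also assumptions: betweenness is normalized by the number of ordered pairs of other actors, and closeness averages only over actors that are reachable.

```python
from collections import deque

# Invented network: A - B - C, each relationship stored in both directions.
graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}


def bfs(graph, source):
    """Return hop distances and numbers of shortest paths from source."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def degree_centrality(graph):
    """(in-degree + out-degree) divided by the total number of relationships."""
    total = sum(len(targets) for targets in graph.values())
    in_degree = {v: 0 for v in graph}
    for targets in graph.values():
        for t in targets:
            in_degree[t] += 1
    return {v: (in_degree[v] + len(graph[v])) / total if total else 0.0
            for v in graph}


def closeness_centrality(graph):
    """Average shortest path distance from each actor to the reachable others."""
    result = {}
    for v in graph:
        dist, _ = bfs(graph, v)
        others = [d for node, d in dist.items() if node != v]
        result[v] = sum(others) / len(others) if others else float("inf")
    return result


def betweenness_centrality(graph):
    """Share of shortest paths between other actors that pass through each actor."""
    nodes = list(graph)
    info = {s: bfs(graph, s) for s in nodes}
    score = {v: 0.0 for v in nodes}
    for s in nodes:
        dist_s, sigma_s = info[s]
        for t in nodes:
            if t == s or t not in dist_s:
                continue
            for v in nodes:
                if v in (s, t) or v not in dist_s:
                    continue
                dist_v, sigma_v = info[v]
                # v lies on a shortest s-t path when the distances add up.
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    pairs = (len(nodes) - 1) * (len(nodes) - 2)
    return {v: score[v] / pairs if pairs else 0.0 for v in nodes}


print(degree_centrality(graph))
print(closeness_centrality(graph))
print(betweenness_centrality(graph))
```

With closeness defined as an average distance, as on the slide, a lower value means a more central actor; for the other two measures a higher value does.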
Why not store metadata in RDF?
Then it’s easy to carry out
bibliometric computations, such as..
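As a rough illustration of the idea, metadata can be held as subject-predicate-object triples and queried directly. This sketch uses only the standard library rather than an RDF toolkit. The article URIs and helper names are hypothetical, and the choice of a citation count as the example computation is an assumption, not something the slides specify. `dcterms:title` and `dcterms:references` are Dublin Core terms.

```python
from collections import Counter

DCTERMS = "http://purl.org/dc/terms/"
BASE = "http://example.org/article/"  # hypothetical namespace for the articles


def uri(value):
    """Format a URI as an N-Triples term."""
    return f"<{value}>"


def literal(value):
    """Format a plain string as an N-Triples literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# Invented metadata: three articles, two of which reference article b.
triples = [
    (uri(BASE + "a"), uri(DCTERMS + "title"), literal("Article A")),
    (uri(BASE + "b"), uri(DCTERMS + "title"), literal("Article B")),
    (uri(BASE + "c"), uri(DCTERMS + "title"), literal("Article C")),
    (uri(BASE + "a"), uri(DCTERMS + "references"), uri(BASE + "b")),
    (uri(BASE + "c"), uri(DCTERMS + "references"), uri(BASE + "b")),
    (uri(BASE + "c"), uri(DCTERMS + "references"), uri(BASE + "a")),
]


def to_ntriples(triples):
    """Serialize the triples, one N-Triples statement per line."""
    return "\n".join(" ".join(t) + " ." for t in triples)


def citation_counts(triples):
    """Count incoming dcterms:references links for each article."""
    references = uri(DCTERMS + "references")
    return Counter(obj for _, pred, obj in triples if pred == references)


print(to_ntriples(triples))
print(citation_counts(triples))
```

Because the references are explicit triples, the same data can feed a graph computation without any intermediate format.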
HyperJournal team