![Page 1: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/1.jpg)
Providing useful perspectives onto massive digital collections
Mark Gahegan, Tawan Banchuen, Will Smart, Brandon Whitehead
Centre for eResearch, University of Auckland, New Zealand
BeSTGRID
![Page 2: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/2.jpg)
There are 2 rules for success in life:
1. Never share everything you know
2.
There is always missing information/knowledge…
![Page 3: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/3.jpg)
Vannevar Bush, As We May Think (1945)
“There is a growing mountain of research. But there is increased evidence that we are being bogged down today as specialization extends. The investigator is staggered by the findings and conclusions of thousands of other workers - conclusions which he cannot find time to grasp, much less to remember, as they appear.
Professionally our methods of transmitting and reviewing the results of research are generations old and by now are totally inadequate for their purpose…”
“…A record, if it is to be useful to science, must be continuously extended, it must be stored, and above all it must be consulted.”
![Page 4: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/4.jpg)
Sarah E. Fratesi, 2008. Journal of Research Practice, Volume 4, Issue 1, Article M1. Scientific Journals as Fossil Traces of Sweeping Change in the Structure and Practice of Modern Geology
The knowledge explosion: Geological Research
![Page 5: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/5.jpg)
The Pain
• Digital collections are growing at a geometric rate…
• How much useful information is never used because it cannot be found?
• How much time & money is lost trying to interpret (or re-interpret) acquired data?
• How much effort & uncertainty is involved in trying to understand the work of others?
![Page 6: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/6.jpg)
Our systems compartmentalise our understanding: this is a very bad thing
Our systems:
– Databases
– Analysis & simulation tools
– Document repositories (e.g. articles, books, theses)
– Ontologies
– Visualisations
– Organisation charts
– Calendars
– Wikis / Blogs / email
Meaning resides across ALL of these places—but it is difficult to extract and connect
![Page 7: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/7.jpg)
| Questions we might ask | Technologies to support these questions |
| --- | --- |
| Created by person X, at time T, etc. | Good metadata |
| Described by (registered to) these semantic tags: | Ontology registration / alignment service |
| Was created in this way: | Representation of workflows |
| Plays a role in these workflow(s): | Search through workflow instances |
| Has been used to fulfill these task(s): | Activity logging at the Portal |
| Has been used by these people/groups: | Social networks, organisation charts, activity logging |
| Has been used by researchers with the following interests: | Semantic associations & inference |
| Is referenced in these publications: | Integration with digital libraries |
| Is most often used with these method(s): | Data mining, recommender systems |
| Has received the following reviews: | User feedback / ranking |
| Is similar to, or differs from, dataset B in the following way(s): | Difference metrics for resources |
![Page 8: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/8.jpg)
All the resources in the collections housed in the GEON cyber-infrastructure
![Page 9: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/9.jpg)
Things to bear in mind (drivers)…
1. The sheer volume of resources in digital collections.
2. The dynamic nature of the collection catalogs in an e-Infrastructure.
3. The need to support multiple search strategies to find useful resources.
4. The need to help explain what resources mean, or to contextualise them in some way, so they can be used appropriately.
5. The need to capture new connections and evolving understanding.
![Page 10: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/10.jpg)
A knowledge gateway to an eResearch community
Examples from geoscience
![Page 11: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/11.jpg)
Conceptual Universe of GEON: now organised by themes
![Page 12: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/12.jpg)
Navigating through the themes
![Page 13: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/13.jpg)
GEON: Institutions, Personnel, PIs, Co-PIs, grad students
![Page 14: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/14.jpg)
What has person A contributed? (Kai Lin: GEON researcher)
![Page 15: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/15.jpg)
Complete conceptual neighbourhood of a resource (an article in this case)
![Page 16: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/16.jpg)
Perspectives as filters
Perspectives filter an information space according to particular situations. Perspectives A and B preferentially select different types of resources and relations; the ability to view perspectives can show how someone else made sense of a given set of resources.
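The filtering idea can be sketched in a few lines of Python. This is only an illustration, not the GEON implementation: the `Perspective` class, the `apply_perspective` helper, and the sample resources and relation names are all assumptions made for the example. A perspective here simply names the resource types and relations it preferentially selects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Perspective:
    """Hypothetical filter: the resource types and relations it selects."""
    name: str
    resource_types: frozenset
    relations: frozenset


def apply_perspective(perspective, types, edges):
    """Keep edges whose relation and both endpoint types are selected."""
    return [
        (s, rel, o)
        for (s, rel, o) in edges
        if rel in perspective.relations
        and types[s] in perspective.resource_types
        and types[o] in perspective.resource_types
    ]


# Made-up information space: typed resources and typed relations.
types = {
    "alice": "person",
    "bob": "person",
    "paper1": "article",
    "dem1": "dataset",
    "tectonics": "theme",
}
edges = [
    ("alice", "authored", "paper1"),
    ("bob", "used", "paper1"),
    ("paper1", "about", "tectonics"),
    ("bob", "used", "dem1"),
]

# Perspective A looks at authorship, perspective B at usage.
perspective_a = Perspective("A", frozenset({"person", "article"}), frozenset({"authored"}))
perspective_b = Perspective(
    "B", frozenset({"person", "article", "dataset"}), frozenset({"used"})
)

view_a = apply_perspective(perspective_a, types, edges)
view_b = apply_perspective(perspective_b, types, edges)
print(view_a)
print(view_b)
```

Because a perspective is just data, it could in principle be saved and shared, which is one way to show how someone else made sense of the same resources.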
![Page 17: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/17.jpg)
Who used a particular article? (its user community)
![Page 18: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/18.jpg)
Multiple perspectives: What did A create that B used?
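A question of this kind amounts to intersecting two perspectives. The sketch below assumes provenance and usage are available as simple (subject, relation, object) triples; the relation names `created` and `used`, the helper name, and the sample data are hypothetical.

```python
def created_by_and_used_by(triples, creator, user):
    """Resources that `creator` created and `user` used (sorted for stable output)."""
    created = {o for (s, p, o) in triples if s == creator and p == "created"}
    used = {o for (s, p, o) in triples if s == user and p == "used"}
    return sorted(created & used)


# Made-up activity log combining creation and usage records.
triples = [
    ("A", "created", "dataset1"),
    ("A", "created", "workflow1"),
    ("B", "used", "dataset1"),
    ("B", "used", "paper9"),
    ("C", "used", "workflow1"),
]

shared = created_by_and_used_by(triples, "A", "B")
print(shared)
```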
![Page 19: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/19.jpg)
Perspectives (1) as SPARQL Queries
Consider the following SPARQL query:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX theme: <http://www.geovista.psu.edu/cV/themes.owl#>
SELECT ?x
WHERE {
  ?x rdf:type rdfs:Class .
  ?x rdfs:subClassOf theme:Geosciences .
}
Here “rdf” and “rdfs” are namespace prefixes for W3C’s RDF and RDFS languages, and “theme” is the namespace prefix for an ontology that defines subjects of interest within the geosciences. Executing this query on the ontology will return all the resources that are sub-classes of the resource “theme:Geosciences”.
A precise definition of a global perspective based on the query is as follows. Let $T$ denote the original ontology (a set of RDF statements of the form [subject, predicate, object]) and let $Q$ denote an RDF query. $S(Q)$ denotes the set of subject constants in $Q$ (empty in the above example, since all subjects are variables), $P(Q)$ denotes the set of predicate constants in $Q$ (“rdf:type” and “rdfs:subClassOf” in the example), and $O(Q)$ denotes the set of object constants in $Q$ (“rdfs:Class” and “theme:Geosciences” in the example). If $R$ denotes the set of resources returned by the query, then a global perspective is a set of RDF statements, denoted by $PS$, which satisfies the following conditions. For any statement [subject, predicate, object] in $PS$:
• if predicate $\in P(Q)$ then subject $\in R \cup S(Q)$ and object $\in R \cup O(Q)$;
• if predicate $\notin P(Q)$ then subject $\in R$ and object $\in R$;
• [subject, predicate, object] $\in T$.
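The definition above can be checked on a toy ontology. The following is a minimal sketch in plain Python rather than a real SPARQL engine: the pattern matcher only handles conjunctive triple patterns, and the resources `theme:Tectonics`, `theme:Seismology`, `theme:Botany`, `person:A` and the predicates `theme:relatedTo` and `theme:studiedBy` are invented for illustration.

```python
def is_var(term):
    """Query variables are written with a leading '?'."""
    return term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if is_var(term):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


def global_perspective(T, Q, select):
    """Return (R, PS) following the three conditions of the definition."""
    R = {b[select] for b in match(Q, T)}
    S_Q = {s for (s, p, o) in Q if not is_var(s)}
    P_Q = {p for (s, p, o) in Q if not is_var(p)}
    O_Q = {o for (s, p, o) in Q if not is_var(o)}
    subjects_ok, objects_ok = R | S_Q, R | O_Q
    PS = set()
    for (s, p, o) in T:  # iterating over T enforces the third condition
        if p in P_Q:
            keep = s in subjects_ok and o in objects_ok
        else:
            keep = s in R and o in R
        if keep:
            PS.add((s, p, o))
    return R, PS


# Toy ontology (hypothetical statements).
T = {
    ("theme:Tectonics", "rdf:type", "rdfs:Class"),
    ("theme:Tectonics", "rdfs:subClassOf", "theme:Geosciences"),
    ("theme:Seismology", "rdf:type", "rdfs:Class"),
    ("theme:Seismology", "rdfs:subClassOf", "theme:Geosciences"),
    ("theme:Geosciences", "rdf:type", "rdfs:Class"),
    ("theme:Tectonics", "theme:relatedTo", "theme:Seismology"),
    ("theme:Tectonics", "theme:studiedBy", "person:A"),
    ("theme:Botany", "rdf:type", "rdfs:Class"),
}
Q = [
    ("?x", "rdf:type", "rdfs:Class"),
    ("?x", "rdfs:subClassOf", "theme:Geosciences"),
]

R, PS = global_perspective(T, Q, "?x")
print(sorted(R))
print(sorted(PS))
```

In this toy case the perspective keeps the typing and subclass statements of the two returned themes plus the `theme:relatedTo` statement between them, and drops everything that reaches outside the result set.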
![Page 20: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/20.jpg)
![Page 21: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/21.jpg)
Perspectives (2): truncated concepts become properties
[Figure: three panels showing a graph of nodes numbered 1 to 16, with labels A and B]
![Page 22: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/22.jpg)
Perspectives (3): different facets of a concept are ‘externalised’
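[Figure: two views of the same image resource, linked by “Fold in” and “Fold out” transitions]
View 1: image, connected to nodes user A, user B, user C
Properties: Date: ddmmyyyy; Scale: 1:xxxxxxxx; Country: “………..”; Content: (m, n, o, p)
View 2: image, connected to nodes country and content (m, n, o, p)
Properties: Date: ddmmyyyy; User: (A, B, C); Scale: 1:xxxxxxxx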
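One way to read the fold operations is as a pair of graph rewrites: folding in turns connected nodes into a property of the central resource, and folding out turns a property back into connected nodes. The sketch below is an assumption about how this might work, not the authors' implementation; the function names, the dictionary layout, and the placeholder country value `"X"` are made up, and multi-valued properties are assumed to be stored as lists.

```python
def fold_in(properties, edges, relation):
    """Turn every `relation` edge into a list-valued property on its source node."""
    new_props = {node: dict(p) for node, p in properties.items()}
    kept = []
    for (s, rel, o) in edges:
        if rel == relation:
            current = new_props.setdefault(s, {}).get(relation, [])
            new_props[s][relation] = current + [o]
        else:
            kept.append((s, rel, o))
    return new_props, kept


def fold_out(properties, edges, key):
    """Turn a list-valued property `key` into edges to separate nodes."""
    new_props = {
        node: {k: v for k, v in p.items() if k != key}
        for node, p in properties.items()
    }
    new_edges = list(edges)
    for node, p in properties.items():
        for value in p.get(key, []):
            new_edges.append((node, key, value))
    return new_props, new_edges


# First view: users are nodes, country and content are properties.
properties = {
    "image": {
        "date": "ddmmyyyy",
        "scale": "1:xxxxxxxx",
        "country": ["X"],
        "content": ["m", "n", "o", "p"],
    }
}
edges = [("image", "user", "A"), ("image", "user", "B"), ("image", "user", "C")]

# Second view: fold the users in, fold country and content out.
props2, edges2 = fold_in(properties, edges, "user")
props3, edges3 = fold_out(props2, edges2, "country")
props4, edges4 = fold_out(props3, edges3, "content")
print(props4)
print(edges4)
```

Both functions return new structures and leave their inputs untouched, so the two views can be kept side by side.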
![Page 23: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/23.jpg)
Which topics are closely related?
![Page 24: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/24.jpg)
Authors folded into themes,
Themes connected together by author properties
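As a rough sketch of this idea, themes can be linked whenever they share an author, with the number of shared authors as the link weight. The function name, the input layout (author mapped to a list of themes), and the sample data are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations


def theme_links(author_themes):
    """Count, for each pair of themes, how many authors work on both."""
    links = Counter()
    for themes in author_themes.values():
        # Sort so each unordered pair always gets the same key.
        for a, b in combinations(sorted(set(themes)), 2):
            links[(a, b)] += 1
    return links


# Made-up authorship data: author -> themes they have published on.
author_themes = {
    "A": ["tectonics", "seismology"],
    "B": ["seismology", "tectonics", "hydrology"],
    "C": ["hydrology"],
}

links = theme_links(author_themes)
for pair, weight in links.most_common():
    print(pair, weight)
```

Pairs with the highest counts are candidates for "closely related" topics, at least as far as shared authorship is a useful signal.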
![Page 25: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/25.jpg)
Intersecting research interests of a science community
![Page 26: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/26.jpg)
Conclusions
• Organising resources according to different knowledge facets holds considerable promise
– We may not be able to capture what things mean directly, but we can provide some signifiers (clues)
– Usage data can provide very strong signifiers
• Intuitive navigation metaphors are needed
– Strong filters (perspectives) are needed to avoid overcrowding & confusion,
– And to better match the users’ conceptual models
• We need strong identifiers for digital resources, so we can find references to them from many systems
– This facilitates value-added services (semantic web-search, maps, use cases, provenance, text descriptions, markup tools, etc.)
![Page 27: Providing useful perspectives onto massive digital collections](https://reader035.vdocuments.net/reader035/viewer/2022062410/568162b1550346895dd338ed/html5/thumbnails/27.jpg)
END