Mending the Gap between Library’s Electronic and Print Collections in ILS and Library’s Web Site using Semantic Web - Progress Report. 2007 EndUser Annual Ex Libris User Group Meeting, April 28, 2006, Chicago, ILL. Amanda Xu, Electronic Resources Cataloging Librarian, and Andrew Sankowsi, Director of Collection and Information Management, St. John’s University Library, Jamaica, New York.


Page 1: Mending the Gap between Library's Electronic and Print Collections in ILS and Library's Web Site using Semantic Web - Progress Report

Mending the Gap between Library’s Electronic and Print Collections in ILS and Library’s Web Site using Semantic Web - Progress Report

2007 EndUser Annual Ex Libris User Group Meeting

April 28, 2006, Chicago, ILL

Amanda Xu, Electronic Resources Cataloging Librarian, and Andrew Sankowsi, Director of Collection and Information Management

St. John’s University Library, Jamaica, New York

Page 2

[Figure: triangle with the vertices Content, Technology, and Distribution]

Q: Where is a library’s value proposition, as illustrated in the triangle? Is the library an aggregator of aggregators? If so, are we ready for it? If not, what is our next social identity? How can we transfer our skills from ‘info-land’ or ‘meta-land’ to ‘digital land’, ‘semantic land’, or simply the hybrid ‘land of bits’, where print and electronic co-exist?

Where is the user’s behavior context?

Aggregated contents through technologies?

We are able to select, organize, access, guide, enhance, and distribute content to the user through technology. Still, we hear complaints like ‘Why is what you buy not what I need, and what I need not what you buy?’

Page 3

Introduction

• At this Thursday’s keynote presentation, Oren Beit-Arie, Chief Strategy Officer of Ex Libris, defined the role of the library as ‘connecting users to content’, and providing ‘unique services, to tailor the needs of their users and integrate service into users’ tasks and workflow.’

• This is exactly what we feel the game is about as well: modeling users’ information-seeking behavior in the context of their experience, and using that information to improve our collections and services.

• Last year, we proposed a sample conceptual model to:
– Identify the information needs of our faculty by following the footsteps of their teaching and research experience;
– Use aggregated information to measure how well our collections meet faculty’s teaching, learning, and research requirements, by following the footsteps of collection managers;
– The subject area that we chose was math and computer science.

Page 4

Page 5

Major Challenges and Opportunities

Challenges
1. Many of the data sources still reside in isolated data islands, in closed systems or flat file systems;
2. Integration requires enterprise level;
3. Lack of resources, time, money, and understanding of the required systems infrastructure (hardware, software, database, network messaging, data structure, etc.) throughout the development lifecycle of the application;
4. Lack of resources and required skills to build the repository server, and to handle the ETL process and messaging among systems across the enterprise;
5. Hard to get people’s buy-in:
• Priority conflict
• No access to data sources
• Turf protection
• No agreement on business model – SaaS
• No governance

Opportunities
1. Model the resource discovery process across databases;
2. Identify gaps in the existing systems infrastructure as far as content selection is concerned;
3. Re-examine the collection development and information management process;
4. Develop survey forms and interview people in charge of functional areas;
5. Plan collection analysis via inventory, gaps, usage, and cost;
6. Tie the analysis to researchers’ needs;
7. Identify new data sources to be managed by libraries, from institutional repository to ILS operation;
8. Identify a priority list - mend the gap between print and electronic collections in the ILS and the library site.

Page 6

Steps Taken

1. Environmental scan of Web content technologies by IT vendors, by library IT vendors, and by academic libraries;

2. Current Web and print resources integration effort at St. John’s University Library - Voyager, Gary Strawn’s Location Changer, Serials Solutions, OCLC WorldCat, Google Scholar, NetLibrary, NYLINK, WALDO, content aggregators, institutional portals, courseware, faculty pub, student & alumni repositories;

3. Swam through the whole iterative process of application development, from back end to front end, for the project:
• Gathered requirements
• Obtained buy-in from IT vendors and IT consulting services, and made recommendations
• Obtained support from upper and middle level managers
• Got into training at NYU SCPS, and sharpened the skill set needed for communication with technical and non-technical people
• Proposed technical infrastructure, from SOA framework to database systems, data structure, etc.
• Recommended a little touch of Semantic Web technologies for pervasive library resource management on the websites

Page 7

Credit and Exceptions

Credit for this talk goes to many giants in the IT industry and the library IT industry, especially the former Endeavor and later Ex Libris Information Systems, the EndUser 2007 Program Planning Committee, St. John’s University Library, and NYU SCPS faculty.

What Will Not Be Discussed in This Talk:
• Detailed files, data repositories, networks (physical and logistics), distributed application programming and computing services, security, jobs and events required for content aggregation and deployment on library’s websites;
• Mathematical models for auto-text processing, patterns, and business rules generated and deployed by any given Semantic Web application;
• Stacks of standards concerning key content technologies.

Page 8

Review Web Content Technologies By IT Vendors (1)

Key Content Technology Vendors Investigated:
1. Project Management, Enterprise Architecture, Modeling – IBM Rational, iRise, Telelogic, Agilense, CA Erwin, MS Visio, MS Project Server, Embarcadero, Enterprise Elements;
2. Content Capturing Vendors – Captiva Software, Adobe, FUJI, ZyLab, ABBYY FlexiCapture, Liquid Office;
3. ETL and Master Data Management – Sunopsis, DataFlux, Kalido;
4. Content Management Systems – Oracle Stellent, EMC Documentum, ArborText, POET, XEnterprise, MarkLogic;
5. Search Engine Services – Verity K2, Autonomy, Teragram, FAST, Inxight Software, Endeca, iProspect, IBM OmniFind, Siderean, Semantic Works, Google, Ask;

Page 9

Review Web Content Technologies By IT Vendors (2)

6. Portal – Vignette, Hummingbird, MS SharePoint, and IBM WebSphere;
7. BPM/SOA – Global 360, IBM FileNet, Lombardi, PegaSystems, Hyperion, Pervasive, SUN/SeeBeyond, BEA Systems, TIBCO, Sonic ESB;
8. CRM (Customer Relationship Management) Vendors – Answers, Oracle Siebel;
9. Business Intelligence and Reporting – Business Objects, Crystal Reports, Oracle, Informatica, Accenture, SAP, SAS, SPSS;
10. Service Resolution Management – Knova and Kana;
11. Semantic Technologies – IBM, Oracle, Jena2, GATE, Siderean;
12. Best Practice Sources – Delphi, KM, TDWI, AIIM, DCI, OMG, SOA/BPM Institute, Insight Shared Network, SD Time, Project 10X, NCOR, Getty, DocuLab, Ken Orr Institute, Zachman Institute, Essential Strategies, TopQuadrant, Semantic Arts, Forrester, etc.

Page 10

Check Functional Components and Enterprise Level Content Sources to Be Leveraged in the Framework of Service Oriented Architecture (SOA) (1)

1. Project Management, Enterprise Architecture & Modeling
2. Imaging and Document Capture
3. Web Content Management from 2.0 to 3.0, including content created from portal, desktop application, browser, e-form, and other web-based collaboration environments, such as Wiki, Flickr, instant messaging, Yahoo 360 & Food Site, Oracle OTN Site, MySpace, blog, RSS, social tagging, recommend, etc.
4. Document Management
5. Record and Retention Management
6. Digital Asset Management
7. ECM (Electronic Content Management) - Taxonomy, Thesauri, Topic Map, Meta-data

Page 11

Check Functional Components and Enterprise Level Content Sources to Be Leveraged in the Framework of Service Oriented Architecture (SOA) (2)

8. Enterprise Search, Directory, Digital Signature, Auto-Classification, Clustering, Categorization, Security, Risk Management
9. Compliance to License, Auditing, Federal and Legal Regulations
10. Information Reusability, Lifecycle and Retention Policy
11. Data Warehouse, Business Intelligence, Performance Management and Monitoring
12. Business Process Management (BPM)
13. Semantic Web Technologies
14. Email Management
15. Portal

Page 12

Preliminary Proposal for Required Data Architecture in SOA Framework (1)

[Figure: data architecture diagram. Recoverable labels: Source Data; ODS Systems; Data Staging Areas; Data Warehouse; Data Marts (By Service Dept, By Media Type, By Profile Data, By Discipline, By Grain, By Contract); Analytical Data Mart; Ad Hoc Query; Modeling & Mining Tools; Visualization Tools; Rule Engine; ETL/BI/Ontology; Meta-data, Unstructured Data; Meta-data, Structured Data; Vocabulary/Lexicon/Concept; Presentation (Enterprise portals: Collaborate, Discover, Select, Annotate, Enhance, Search, Navigate, Syndicate, Interchange); Ontology-assisted Transformation; Reference Master Data; RDF/OWL Data Ontology; Profile, Protocol, Model, Spec, Standard, Schema, Contract, Constraint, Media type, Linkage]

Page 13

Preliminary Proposal for Required Data Architecture in SOA Framework (2)

[Figure: SOA service model diagram. Recoverable labels: Discovery agencies; Service Providers; Service Consumers; Find; Publish; Interact; Capable; Need; Satisfaction; Requirement; has; Service Description; Protocols; Standards; Specs; Policies; Limits; Governance; Contacts; Reuse; Interoperable; Visibility; Execution context; Effect; Strategies; Patterns; Models; Profiles; Domains; Refer; Hold; Contract & policies; Service; Distribution; Content; Technology; specify; Info, Process; Action; Behavior Model; has; feedback]

Page 14

Compare What Is Offered in Content Technologies by Library IT Vendors (1)

Check Functional Components and Library-wide Content Sources Supported By Library IT Industries

• Integrated Library Systems (ILS) – Print and Electronic Resources
• Electronic Resources Management Systems (ERMS) for Subscribed Titles in Electronic Databases – Full text and A&I
• Full-text A-Z List by Directory and Subject on Library Web
• Federated Search, Google-like, etc. search on Library Web
• Link Resolver and Knowledgebase containing logical links and holdings for print and electronic materials
• Digitized Documents and Images
• Library Web Contents 1.0, 2.0, & 3.0, including streaming videos, library Wiki, eForms, instant messaging, RSS, eTutorials, eNews, eAlerts, etc.

Page 15

Compare What Is Offered in Content Technologies by Library IT Vendors (2)

• Interlibrary Loan Services (ILL)
• eReserve
• eReferences – Ask Librarian
• Auto-citation integration – RefWorks, EndNote, etc.
• Record Management for Institution and Archival Contents, e.g. EAD and TEI
• Library portals as library content and service distribution toolkit, e.g. WorldCat, Google Scholar, etc.
• Integrated support in context of service request:
– uPortal
– Learning Management Systems (LMS) and courseware, e.g. WebCT
– uSearch, uMeta-data, uTaxonomy, uEmail Management
– uRepository, e.g. DSpace, Fedora, Sakai; domain-specific repositories, e.g. PMC
– Community-based repository, e.g. EI Village, Community of Science, MySpace
• Statistics for inventory, budget, cost, user behavior, usage, etc.

Page 16

Current Status Check (1)

1. Current Web and print resources integration effort at St. John’s University Library:

• Ex Libris – Voyager as ASP solution for ILS
• Serials Solutions – SaaS solution for E-J management
– Maintain a single version of the truth of E-J holdings and subscriptions via the SS Knowledgebase and clients;
– Output – E-J A-Z list at journal level:
  • on the library website in HTML format
  • in EZ Proxy server as monthly updates in XML format
  • in Voyager as MARC title list in MARC format
  • in article linking to OCLC WorldCat, Google Scholar, NetLibrary, and content providers at article level
  • in central search at package level if connectors with content providers are readily available (in progress)
• Use Gary Strawn’s Location Changer and MarcEdit for monthly updates of the MARC title list in Voyager and for data consistency checking among the above lists of services
• Separate workflow process and platform for E-Content Packages listed as A-Z list by database name and by subject, even though the same content packages are provided by the same vendor;
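The data consistency check among the title lists described above could be sketched roughly as follows. This is only an illustration in Python, assuming each list (website HTML, EZ Proxy XML, Voyager MARC) has already been reduced to a list of ISSN strings; the list names, the sample ISSNs other than 0002-5232, and both function names are hypothetical and are not part of Location Changer or MarcEdit.

```python
def normalize_issn(raw):
    """Return an ISSN as 'NNNN-NNNC', or None if the string is malformed."""
    compact = raw.replace("-", "").strip().upper()
    if len(compact) != 8 or not compact[:7].isdigit():
        return None
    if not (compact[7].isdigit() or compact[7] == "X"):
        return None
    return compact[:4] + "-" + compact[4:]


def compare_title_lists(lists):
    """Map each list name to the ISSNs that other lists hold but it lacks."""
    normalized = {
        name: {n for n in map(normalize_issn, issns) if n is not None}
        for name, issns in lists.items()
    }
    union = set().union(*normalized.values())
    return {name: sorted(union - held) for name, held in normalized.items()}


# Hypothetical sample data standing in for the three monthly outputs.
sample = {
    "website_html": ["0002-5232", "1234-5679"],
    "ezproxy_xml": ["00025232"],
    "voyager_marc": ["0002-5232", "1234-5679", "2049-3630"],
}
missing = compare_title_lists(sample)
for name, issns in missing.items():
    print(name, "is missing", issns)
```

Comparing on a normalized ISSN rather than on title strings avoids false mismatches caused by capitalization or punctuation differences between systems, although titles without an ISSN would need a separate matching rule.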

Page 17

Current Status Check (2)

3. Still looking for:
• Electronic Resources Management Systems (ERMS), e.g. Serials Solutions, TDNet, Meridian, Verde;
• Digital Resources Management Systems (DRMS) – ContentDM, Greenstone, Encompass, etc.;
• Institutional Repository Archives, e.g. DSpace, Sakai, Fedora, etc.;
• Library Portals to uPortal Courseware, e.g. Blackboard, WebCT;

4. Implemented SaaS solutions for citation management – RefWorks; EReserve – Docutek

5. Campus IT handles institutional portals, courseware, faculty pub, and student & alumni repositories in collaboration with the Library;

Page 18

Obtain Journal Title Holdings from OPAC and Journal A-Z List to Content Providers

Page 19

Obtain the Journal Issue

Page 20

Obtain – Journal Article

Page 21

Obtain by Subject – Two Terms in one Search

Page 22

PubMed: Ear Wax Removal (1)

Page 23

PubMed: Ear Wax Removal (2)

Page 24

EBSCO: Cerumen

Page 25

Scopus: Earwax

Page 26

Gale Group: Removal of Cerumen

Page 27

Gale Group: Removal of Cerumen

Page 28

Gale Group: Removal of Cerumen

Page 29

Obtain by Subject - LCSH Search: Ear Wax

Page 30

LCSH Authority File: Earwax

Page 31

Obtain by Subject Default to Broader Term - LC Catalog: Earwax

Page 32

Obtain by Subject Default to Broader Term - LC Catalog: Earwax

Page 33

Obtain by Subject: Wikipedia: Earwax

Page 34

Obtain by Subject: Wikipedia: Earwax

Page 35

Obtain By Subject – Two Terms in Two Searches in Ask

Page 36

Obtain By Subject – Two Terms in Two Searches in Ask

Page 37

Obtain by Subject: Two Terms – Two Searches in Yahoo

Page 38

Obtain By Subject – Two Terms in Two Searches in Yahoo

Page 39

Obtain by Subject: Two Terms – Two Searches in Google

Page 40

Obtain By Subject – Two Terms in Two Searches in Google

Page 41

Computer Simulated Model to Draw This Chart – How Many Data Stores Do We Need, and How Many Interfaces Do We Need to Create for the End User?

Napoleon’s March to Moscow – The War of 1812. Edward Tufte – Poster from Envisioning Information

Page 42

Current State of Web Content Packaging using Integrated Library Systems, Electronic and Digital Resource Management Systems in Comparison with What Is Offered by Aggregators, Google, Ask_Jeeves, etc. (2)

• At the presentation layer, systems that support OpenURL allow the user to traverse from database to journal, and from journal to article, independent of the location of the services. We have a better chance of ensuring ‘find, access, and obtain’, while search engines may run into copyright and license barriers;

• At the document processing level, PubMed, LC, EBSCO, Scopus, and Gale Group use authority control for subject access, while Google, Ask_Jeeves, and Yahoo do not. We anticipate the user’s query by adding authority control for named entities and controlled vocabulary for subject access, e.g. two search terms only need to be entered once;

• At the query level, there is a lot of room for us to improve: query expansion, e.g. teaser, refinement, and optimization, etc.;

• At the end user level, we still do not know our users at the individual level;

• At the process management level and performance measure level, we are still at ground zero;

• At the content data model level: data store vs. interface. How many interfaces do we need? ERMS – eJournals, Federated Search – articles, library web site – databases, DCMS – images?
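The point about authority control, that two search terms should only need to be entered once, can be illustrated with a small query-expansion sketch in Python. The tiny authority table below is hypothetical, and which term counts as the preferred heading is an assumption that differs between vocabularies; a real system would load the headings and their variants from an authority file.

```python
# Hypothetical mini authority file: preferred heading -> variant terms.
AUTHORITY = {
    "cerumen": ["earwax", "ear wax"],
}


def build_lookup(authority):
    """Index every preferred and variant term by its lowercase form."""
    lookup = {}
    for preferred, variants in authority.items():
        for term in [preferred, *variants]:
            lookup[term.lower()] = preferred
    return lookup


def expand_query(query, authority=AUTHORITY):
    """Return all terms equivalent to the query, preferred heading first."""
    cleaned = query.strip()
    preferred = build_lookup(authority).get(cleaned.lower())
    if preferred is None:
        return [cleaned]  # unknown term: pass it through unchanged
    return [preferred, *authority[preferred]]


def to_boolean(terms):
    """Join terms into a quoted OR expression for a search box."""
    return " OR ".join(f'"{t}"' for t in terms)


print(to_boolean(expand_query("Ear Wax")))
```

With such a lookup in front of the search box, a user who types either the lay term or the clinical term gets the same expanded query, which is the behavior the authority-controlled databases on the preceding pages provide and the general search engines do not.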

Page 43

Desired Features for Managing Library Print and Electronic Content on Library Website (1)

Need Another ECMS Or Wrapper Or Data Warehouse?

Function - merge data

• Essential elements for a journal record in Serials Solutions and in the library catalog have different requirements. Yet they all need core elements for identification, discovery, and disambiguation;

• How many times do we have to create them, or export and import them into these repositories?
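One way to picture the "merge data" function is a record merge keyed on a shared identifier. The Python sketch below is a minimal illustration under stated assumptions: ISSN and title are treated as the core elements, the source names and field names are hypothetical, and the sample values are invented apart from the journal used later in this talk.

```python
CORE_FIELDS = ("issn", "title")  # assumed core identifying elements


def merge_records(*sources):
    """Merge per-source journal records into one record per ISSN.

    Each source is a (source_name, list_of_dicts) pair. Core fields are
    taken from the first source that supplies them; every other field is
    kept under a per-source key so that nothing is overwritten.
    """
    merged = {}
    for source_name, records in sources:
        for record in records:
            issn = record["issn"]
            entry = merged.setdefault(issn, {"issn": issn, "local": {}})
            for field in CORE_FIELDS:
                if field in record and field not in entry:
                    entry[field] = record[field]
            extras = {k: v for k, v in record.items() if k not in CORE_FIELDS}
            entry["local"][source_name] = extras
    return merged


# Hypothetical records from the catalog and from the e-journal knowledgebase.
catalog = [{"issn": "0002-5232", "title": "Algebra and logic",
            "location": "Main stacks"}]
knowledgebase = [{"issn": "0002-5232", "title": "Algebra and Logic",
                  "coverage": "from 05/01/2003 to 1 year ago"}]

merged = merge_records(("catalog", catalog), ("knowledgebase", knowledgebase))
print(merged["0002-5232"])
```

The core elements are created once and shared, while each system keeps only the elements it alone requires, which is one possible answer to the question of how many times the record has to be created.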

Page 44

Desired Features for Managing Library Print and Electronic Content on Library Website (2)

Library Content Packaging Process:
• Data extraction, transformation, and load (ETL) is still a manual process, e.g. loading a MARC data file into ACQ, ACQ into Meridian, LinkFinderPlus into Federated Search;
• If we want to maintain one version of the truth of our data for ILS, ERMS, DCM, Federated Search, and DSpace, shouldn’t the data be extracted, loaded, transformed (ELT), and designed in a way that is modular, reusable, and portable everywhere?

Consistent tagging standard for Web content at taxonomy level among ILS, ERMS, DCMS, Federated Search, and DSpace:
• Taxonomy for DCMS – container specific, or across ILS, ERMS, Federated Search, and uPortal?

Type of Content Unwrapped:
• Form processing: how do we capture form data on our web, or in print, Excel, or PDF format?
• Digital assets;
• Web content from Web 2.0

Content Redundancy among ILS, ERMS, DCMS, Federated Search, and institutional repository:
• If all we can get from ERMS is 1) license compliance, and 2) analytic reports from a data warehouse, shouldn’t we add the license info to Voyager ACQ, and build a data warehouse on top of all repositories – ILS, DCM, Federated Search, library website, WorldCat, uPortal, DSpace, etc.?

Page 45

Review of Desired Features for Library Electronic Content Management Systems (ECMS) (3)

Content Data Model:
• Support mission critical reports, e.g. 360 degree view of workflow process for journals?
• Collection level record for hierarchical invoice processing of a subscription package with hundreds of titles in one bundle; price history for periodicals and packages should be allowed to exist in ACQ and enable price comparison at journal title level;
• Support sufficient business rules for content validation, e.g. a validation rule against duplicated invoices, etc.

Consistent Content Retention Policy:
• ILS – MFHD has a retention policy, but it is not enforced; an item gets withdrawn at item level;
• What about content in ERMS, DCM, and the library website, and how should out-of-date, inaccurate data be systematically removed?
• Can a content retention policy be enforced so that record removal or changes of location can be set up systematically?

Content display model:
• Facet browsing and search support;
• Auto fix of broken URLs and Web content change;
• Horizontal content display model, e.g. ledger info of various fiscal years

Meet compliance requirement
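The validation rule against duplicated invoices mentioned above could look something like the following Python sketch. It assumes that an invoice is identified by vendor plus invoice number; the `Invoice` class, its fields, and the sample data are hypothetical and do not describe the Voyager acquisitions schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    vendor: str
    invoice_number: str
    amount_cents: int


def find_duplicate_invoices(invoices):
    """Return invoices whose (vendor, invoice_number) key was already seen."""
    seen = set()
    duplicates = []
    for invoice in invoices:
        # Normalize case and whitespace so trivial variations still collide.
        key = (invoice.vendor.strip().lower(), invoice.invoice_number.strip())
        if key in seen:
            duplicates.append(invoice)
        else:
            seen.add(key)
    return duplicates


# Hypothetical batch with one repeated invoice number from the same vendor.
batch = [
    Invoice("Vendor A", "INV-001", 125000),
    Invoice("Vendor B", "INV-001", 48000),
    Invoice("vendor a ", "INV-001", 125000),
]
for dup in find_duplicate_invoices(batch):
    print("Duplicate invoice:", dup.vendor.strip(), dup.invoice_number)
```

A rule like this would run before an invoice is approved, so that the duplicate is rejected at entry rather than discovered later in a ledger report.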

Page 46

Review of Desired Features for Library Electronic Content Management Systems (ECMS) (4)

Use

– Search, navigation, retrieval, and display by description, classification, and subjects, from library catalog to library website, from library website to content providers, from journal to issue, from issue to articles;
– What questions does it answer? Vertical and horizontal (views + processes + usage + ROI) from the perspective of end users, librarians and staff, process owners, administrators, and partners (contents, technologies, and services)

Page 47

Semantic Web Definitions

1. “A common framework that allows data to be shared and reused across application, enterprise, and community boundaries.” – Available: http://www.w3.org/2001/sw/

2. “An attempt to make Web resources more readily accessible to the automated processes by adding information about the resources that describe or provide Web content.” – Available: http://www.w3.org/2004/OWL

3. “Binary relationships capture the meaning of the link” – Tim Berners-Lee, Japan Prize 2002. – Available: http://www.w3.org/2002/Talks/04-sweb/

4. SW is an “extension of the current web, providing an infrastructure for the interchange and the integration of data on the Web.” – Available: http://www.w3c.org/Consortium/Offices/Presentations/RDFTutorial/

Page 48

Tim Berners-Lee, “W3C World Wide Web Consortium, Academic Discussion, Japan Prize 2002.” Available: http://www.w3.org/2002/Talks/04-sweb/slide12-0.html

Page 49

Semantic Technologies and Standards

Semantic Web Road Map by Tim Berners-Lee, Sept. 1998. Available: http://www.w3.org/designIssues/Semantic.html

1. “A web of data, in some way like a global database”
2. “Machine understandable information”
3. “Basic assertion model” – meta-data: property of a resource in RDF syntax
4. “Semantic layer”
– RDF schema, FOAF, SKOS
– OWL Lite, OWL DL, OWL Full
5. “Conversion of language” – ‘semantically link two independent databases, and allow the query of each other via conversion of the query’
6. “Logic layer” – “deduction of one type of document from a document of another type, checking of a document against a set of rules of self consistency, resolution of a query by conversion from terms unknown into the terms known”
• SWRL: a Semantic Web Rule Language combining OWL and RuleML
7. “Proof validation” – a language for proof
8. “Evolution rules language”
9. “Query language” – SPARQL query language for RDF
10. “Digital signature” – “public key cryptography”, or “adding logic of trust as icing on the cake of a reasoning systems”
11. “Index terms” – RDF search engines
12. “Engine of the future” – combine a reasoning engine with a search engine

Page 50

Promises of the Semantic Web (1)

• URIs make it possible for everything, including partial information, to be identifiable;
• If it is based on a knowledge representation framework, SW will allow global consistency of data;
• Allows aggregation of information;
• Supports inference of information;
• Extensible to multimedia data;
• Digital/electronic library collections, and institution and community collections, are Web enabled;
• Combines applications remotely for local knowledge integration – calendar, address book, airline preferences;
• Encapsulates all data stores and processes behind the scenes, and addresses users’ concerns in graphic, chart, etc. views

Page 51

Promises of SW Layered Cake: Standards (2)
A simple RDF Example in RDF/XML

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ss="http://yz3rj4vl2y.search.serialssolutions.com/serialsSolution/elements/1.0/#">
  <rdf:Description rdf:about="http://YZ3RJ4VL2Y.search.serialssolutions.com/?V=1.0&amp;L=YZ3RJ4VL2Y&amp;S=JCs&amp;C=ALGEANDLOG&amp;T=marc">
    <ss:JournalTitle>Algebra and logic</ss:JournalTitle>
    <ss:JournalISSN>0002-5232</ss:JournalISSN>
    <ss:JournalCoverageDates>from 05/01/2003 to 1 year ago</ss:JournalCoverageDates>
    <ss:Category>Algebra</ss:Category>
    <ss:eJournalHome rdf:resource="http://yz3rj4vl2y.search.serialssolutions.com"/>
    <ss:contains rdf:parseType="Literal"><h1>St. John’s Univ. Libraries e-full text Journals</h1></ss:contains>
  </rdf:Description>
</rdf:RDF>
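To see how a record like the one above breaks down into statements, the following Python sketch reads a simplified copy of it with the standard library XML parser and prints one (subject, predicate, object) tuple per property. It is only an illustration and not a conforming RDF/XML parser: it handles flat rdf:Description blocks only, the subject URI is shortened to a placeholder on example.org, and the `triples` function name is an assumption.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Simplified variant of the slide's record. The "ss" namespace is copied
# from the slide; the subject URI is a shortened placeholder (assumption).
DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ss="http://yz3rj4vl2y.search.serialssolutions.com/serialsSolution/elements/1.0/#">
  <rdf:Description rdf:about="http://example.org/journal/ALGEANDLOG">
    <ss:JournalTitle>Algebra and logic</ss:JournalTitle>
    <ss:JournalISSN>0002-5232</ss:JournalISSN>
    <ss:eJournalHome rdf:resource="http://yz3rj4vl2y.search.serialssolutions.com"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Yield (subject, predicate, object) for flat rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            obj = resource if resource is not None else (prop.text or "").strip()
            yield subject, namespace + local, obj


for s, p, o in triples(DOC):
    print(s, p, o, sep="\n  ")
```

Each property element becomes one statement about the journal, and the property name is expanded to a full URI by joining the namespace and the local name, which is what makes the vocabulary globally identifiable.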

Page 52

Promises of SW Layered Cake: Standards (3)
A simple RDF Example in RDF/XML

• A resource is anything that can have a URI: ‘http://YZ3RJ4VL2Y.search.serialssolutions.com/?V=1.0&L=YZ3RJ4VL2Y&S=JCs&C=ALGEANDLOG&T=marc’. Potentially all the elements of an RDF/XML file can be addressed as URIs, and thereby a distributed computer.

• A property is a resource that has a name and can be used as a property, e.g. <ss:JournalTitle>.

• A statement consists of a resource, a property, and a value. The three parts are known as subject (s), predicate (p), and object (o), and together as an RDF triple (s, p, o).

• An RDF graph defines methods to retrieve triples, e.g. the property and object pairs for a specific subject, which is a resource.

• Core properties of RDF: rdf:ID – defines a fragment identifier within the RDF portion, used in conjunction with xml:base; rdf:value; rdf:subject; rdf:object; rdf:rest; rdf:first; rdf:nodeID (internal identifier for a resource).

• Blank nodes with identical nodeIDs in different graphs are different.
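The idea that a graph offers methods to retrieve triples for a given subject can be sketched in a few lines of Python. This is an illustrative in-memory model only, with the graph held as a set of tuples and `None` used as a wildcard; the `match` function and the abbreviated `ss:` and `ex:` names are hypothetical and do not correspond to the API of any particular RDF toolkit.

```python
def match(graph, s=None, p=None, o=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Hypothetical graph using abbreviated names instead of full URIs.
graph = {
    ("ex:ALGEANDLOG", "ss:JournalTitle", "Algebra and logic"),
    ("ex:ALGEANDLOG", "ss:JournalISSN", "0002-5232"),
    ("ex:ALGEANDLOG", "ss:Category", "Algebra"),
    ("ex:OTHER", "ss:Category", "Logic"),
}

# All property and object pairs for one subject.
for _, prop, value in match(graph, s="ex:ALGEANDLOG"):
    print(prop, "=", value)

# All subjects filed under one category.
print([t[0] for t in match(graph, p="ss:Category", o="Algebra")])
```

Because every statement has the same three-part shape, one small matching function covers lookups by subject, by property, or by value, which is the uniformity the triple model is meant to provide.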

Page 53

Promises of SW Layered Cake: Standards (4)
A simple RDF Container Example in RDF Graph

[Figure: RDF graph of a list. Recoverable labels: nodes #eJournalHome, #JournalTitle, #JournalIssn, #JournalCoverageDates, #Category, rdf:List, rdf:nil; arcs consistsOf, rdf:first (four times), rdf:rest (four times), rdf:type]

Page 54

Promises of SW Layered Cake: Standards (5)
A simple RDF Container Example in RDF/XML

RDF class: rdf:List

<rdf:Description rdf:about="#eJournalHome">
  <axsvg:consistsOf rdf:parseType="Collection">
    <rdf:Description rdf:about="#JournalTitle"/>
    <rdf:Description rdf:about="#JournalIssn"/>
    <rdf:Description rdf:about="#JournalCoverageDates"/>
    <rdf:Description rdf:about="#Category"/>
  </axsvg:consistsOf>
</rdf:Description>
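The collection syntax above is shorthand for a linked list of rdf:first and rdf:rest statements ending in rdf:nil. The Python sketch below spells out that expansion for the four members of the example. The function name and the blank node labels (`_:b1`, `_:b2`, and so on) are illustrative assumptions, and the optional rdf:type rdf:List statements are left out.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def collection_triples(subject, predicate, members):
    """Expand an ordered collection into rdf:first / rdf:rest triples."""
    if not members:
        # An empty collection is just a link to rdf:nil.
        return [(subject, predicate, RDF + "nil")]
    nodes = [f"_:b{i}" for i in range(1, len(members) + 1)]
    triples = [(subject, predicate, nodes[0])]
    for i, member in enumerate(members):
        triples.append((nodes[i], RDF + "first", member))
        # The last node points to rdf:nil, every other node to its successor.
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF + "nil"
        triples.append((nodes[i], RDF + "rest", rest))
    return triples


members = ["#JournalTitle", "#JournalIssn", "#JournalCoverageDates", "#Category"]
expanded = collection_triples("#eJournalHome", "axsvg:consistsOf", members)
for triple in expanded:
    print(triple)
```

The list form is closed: because the chain ends in rdf:nil, a consumer can tell that there are exactly four members and no others.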


Promises of SW Layered Cake: Standards (6) A simple RDF Container Example in RDF/XML

RDF type: rdf:Seq

RDF properties: rdf:_1, rdf:_2, etc.

<rdf:Description rdf:about="#eJournalHome">
  <axsvg:consistsOf>
    <rdf:Description>
      <rdf:type rdf:resource="http:// .. rdf-syntax-ns#Seq"/>
      <rdf:_1 rdf:resource="#JournalTitle"/>
      <rdf:_2 rdf:resource="#JournalIssn"/>
      <rdf:_3 rdf:resource="#JournalCoverageDates"/>
      <rdf:_4 rdf:resource="#Category"/>
    </rdf:Description>
  </axsvg:consistsOf>
</rdf:Description>
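
For comparison with rdf:List, an rdf:Seq container is described by one rdf:type triple plus numbered membership properties. A minimal sketch follows; the blank-node label _:seq and the function name are assumptions for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def seq_to_triples(container, members):
    """Describe an rdf:Seq: one rdf:type triple plus rdf:_1, rdf:_2, ... members."""
    triples = [(container, RDF + "type", RDF + "Seq")]
    for position, member in enumerate(members, start=1):
        triples.append((container, f"{RDF}_{position}", member))
    return triples


seq = seq_to_triples(
    "_:seq",
    ["#JournalTitle", "#JournalIssn", "#JournalCoverageDates", "#Category"],
)
```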


Promises of SW Layered Cake: Standards (7) A simple example of RDF Attribute using FOAF Vocabulary in XHTML

Existing Web

… <p>If you have any question, please contact me: <a href="mailto:[email protected]">email</a> or call me 718-990-6716</p> …

Proposed Web

<html xmlns:foaf="http://xmlns.com/foaf/0.1/"><head><title>Amanda Xu's Home Page</title></head><body> …

<p>If you have any question, please contact me: <a rel="foaf:mbox" href="mailto:[email protected]">email</a> or call <span property="foaf:phone">718-990-6716</span></p></body></html>
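
To show why the added attributes matter, the sketch below pulls (predicate, value) pairs out of markup like the Proposed Web example using only Python's standard html.parser. It is a simplified illustration and not a conforming RDF/A processor; the class name is invented, and the e-mail address is a placeholder because the address in the source is obscured.

```python
from html.parser import HTMLParser


class RdfaSketch(HTMLParser):
    """Collect (predicate, value) pairs from rel/href and property/text."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property name awaiting its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # rel + href: the link target is the value of the statement.
        if "rel" in attrs and "href" in attrs:
            self.pairs.append((attrs["rel"], attrs["href"]))
        # property: the element's text content is the value.
        if "property" in attrs:
            self._pending = attrs["property"]

    def handle_data(self, data):
        if self._pending and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None

    def handle_endtag(self, tag):
        self._pending = None


# Placeholder address: the real one is obscured in the source.
SAMPLE = (
    "<p>If you have any question, please contact me: "
    '<a rel="foaf:mbox" href="mailto:someone@example.org">email</a> or call '
    '<span property="foaf:phone">718-990-6716</span></p>'
)

parser = RdfaSketch()
parser.feed(SAMPLE)
parser.close()
pairs = parser.pairs
```

The existing-web version of the same paragraph yields no pairs at all, since nothing in it tells a machine which string is a mailbox and which is a phone number.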


Source for the FOAF example: "RDF/A Primer 1.0: Embedding RDF in XHTML," W3C Working Draft, 10 March 2006. Available: http://www.w3.org/TR/2006/WD-xhtml-rdfa-primer-20060310


Promises of the Layered Cake: Standards (8) A Simple RDF Vocabulary Description Language/Schema in XML

<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:ss="http://yz3rj4vl2y.search.serialssolutions.com/serialsSolution/elements/1.0/#"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  <rdfs:Class rdf:about="http://yz3rj4vl2y.search.serialssolutions.com/serialsSolution">
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
  </rdfs:Class>
  <rdf:Property rdf:about="http://yz3rj4vl2y.search.serialssolutions.com/serialsSolution/elements/1.0/JournalTitle">
    <rdfs:domain rdf:resource="http://yz3rj4vl2y.search.serialssolutions.com/serialsSolution"/>
    <rdfs:comment>No print holdings available for the title</rdfs:comment>
    <rdfs:label xml:lang="en">JournalTitle</rdfs:label>
  </rdf:Property>
  …
</rdf:RDF>
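
A schema like the one above can be read back with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree on a shortened copy of the slide's schema and lists each property with its domain and label. The function name is an assumption, and a dedicated RDF parser would be the more robust choice for real data.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SS = "http://yz3rj4vl2y.search.serialssolutions.com/serialsSolution"

# Shortened copy of the schema shown on the slide.
SCHEMA = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdf:Property rdf:about="{SS}/elements/1.0/JournalTitle">
    <rdfs:domain rdf:resource="{SS}"/>
    <rdfs:label xml:lang="en">JournalTitle</rdfs:label>
  </rdf:Property>
</rdf:RDF>"""


def list_properties(xml_text):
    """Return one dict (uri, domain, label) per rdf:Property element."""
    root = ET.fromstring(xml_text)
    found = []
    for prop in root.findall(f"{{{RDF}}}Property"):
        domain = prop.find(f"{{{RDFS}}}domain")
        label = prop.find(f"{{{RDFS}}}label")
        found.append({
            "uri": prop.get(f"{{{RDF}}}about"),
            "domain": None if domain is None else domain.get(f"{{{RDF}}}resource"),
            "label": None if label is None else label.text,
        })
    return found


props = list_properties(SCHEMA)
```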


Promises of the Layered Cake: Standards (9) A Simple RDF Vocabulary Description Language/Schema

• Core properties of RDF Schema:
– rdfs:subClassOf,
– rdfs:seeAlso (another document containing additional information about the resources being described (t.o.c.)),
– rdfs:member, rdfs:label, rdfs:subPropertyOf, rdfs:isDefinedBy, rdfs:comment, rdfs:domain, rdfs:range, and the class rdfs:ContainerMembershipProperty
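
The practical value of rdfs:subClassOf is that it is transitive, so a resource typed with a class also belongs to every superclass. A minimal sketch of that inference follows. The class names ss:EJournal and ss:Journal are hypothetical and do not come from the slides.

```python
def superclasses(cls, sub_class_of):
    """All direct and inherited superclasses of cls (transitive rdfs:subClassOf)."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical hierarchy: class -> set of direct superclasses.
SUB_CLASS_OF = {
    "ss:EJournal": {"ss:Journal"},
    "ss:Journal": {"rdfs:Resource"},
}


def inferred_types(declared_type, sub_class_of):
    """The declared rdf:type plus every type implied by rdfs:subClassOf."""
    return {declared_type} | superclasses(declared_type, sub_class_of)
```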


Promises of SW Layered Cake: Standards (10) A Simple RDF Vocabulary Description Language/Schema Graph

[Figure: RDF Schema graph with the nodes #e-Journal Home, #JournalTitle, rdfs:Resource, and rdfs:Class, connected by rdf:type and rdfs:subClassOf arcs.]

Nodes – rdfs:Resource, rdfs:Class

Properties – rdfs:subClassOf, rdf:type


Promises of SW Layered Cake: RDF/RDFS Standards and Technologies (11)

Binding RDF to an XML file
• Use rdf:about as URI for external resources
• Add RDF to XML directly in its own namespace

Technology
• Editor – DC.DOT, OCLC, IsaViz
• Parser – ARP2, ICS-Forth
• Scraper – GRDDL – microformat extraction out of XML files
• SPARQL application –
– SQL/SPARQL bridge – relational db
– GRDDL for XML files
– RDF files
• RDFLib
• HP Bristol lab Jena – full SPARQL implementation
• RDFstore (Perl), RAP, SWI-Prolog

RDF/A extends HTML
• Extends the link and meta elements


Promises of SW Layered Cake: Standards (12) Web Ontology Language (OWL)

Dr. Leo Obrst, MITRE, 2006: “Ontologies are usually expressed in a logic-based language, enabling detailed, sound, meaningful distinctions to be made among classes, properties, & relations”; “More expressive meaning but maintain ‘computability.’”

SW expresses “ontological information about instances appearing in multiple documents,” which supports the “linking of data from diverse sources in a principled way.” – W3C OWL Web Ontology Language Guide, 10 Feb. 2004

Capabilities of OWL: expressiveness, aggregation, linking, inference.

“Ontology Spectrum and Semantic Models,” Dr. Leo Obrst, MITRE Information Semantics Group, Information Discovery & Understanding, Center for Innovative Computing & Informatics, January 12 & 19, 2006. http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2006_01_12 in http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage


Promises of SW Layered Cake: Standards (13) A Sample Web Ontology Language (OWL) in Graph

A dolphin is a mammal living in the sea or in the Amazon

From W3C Tutorial – www.w3.org/Consortium/Offices/Presentations/RDFTutorial/


Promises of SW Layered Cake: Standards (14) A Sample Web Ontology Language (OWL) in XML

From: www.w3.org/Consortium/Offices/Presentations/RDFTutorial/#118


Promises of SW Layered Cake: Standards (15) Web Ontology Language (OWL) Example of MARC 753 Serialized in RDF/OWL pt. 1

245 ##$a Decisions in economics and finance: A Journal of Applied Mathematics
753 ##$a Applied mathematics
753 ##$a Mathematical models $b Social sciences
753 ##$a Mathematical models $b Economics
753 ##$d Social sciences $b Mathematical models $s Mathematical models $t Social sciences
753 ##$d Economics $b Mathematical models $s Mathematical models $t Economics
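
Before the 753 fields can be serialized as RDF/OWL, the flat MARC text has to be split into tags, indicators, and subfields. The sketch below does that for the display format used on this slide (tag, two indicator characters, then $-prefixed subfields). It is an assumption-laden illustration: the function names are invented, and production code would read binary MARC or MARCXML with a dedicated library instead.

```python
import re

# Tag, a space, two indicator characters, then the subfield string.
FIELD = re.compile(r"^(\d{3}) (..)(.*)$")


def parse_field(line):
    """Split 'TAG II$a value $b value' into (tag, indicators, [(code, value), ...])."""
    m = FIELD.match(line.strip())
    if m is None:
        raise ValueError(f"not a MARC display line: {line!r}")
    tag, indicators, rest = m.groups()
    subfields = [
        (chunk[0], chunk[1:].strip())
        for chunk in rest.split("$")[1:]
        if chunk
    ]
    return tag, indicators, subfields


def subject_paths(lines, tag="753"):
    """Return the subfield values of every field with the given tag, in order."""
    paths = []
    for line in lines:
        field_tag, _, subfields = parse_field(line)
        if field_tag == tag:
            paths.append(tuple(value for _, value in subfields))
    return paths


RECORD = [
    "245 ##$a Decisions in economics and finance: A Journal of Applied Mathematics",
    "753 ##$a Applied mathematics",
    "753 ##$a Mathematical models $b Social sciences",
    "753 ##$a Mathematical models $b Economics",
]
```

Each tuple returned by `subject_paths(RECORD)` is one category path that can then be turned into a class or a class/subclass pair.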


Promises of SW Layered Cake: Standards (16) Web Ontology Language (OWL) Example of MARC 753 Serialized in RDF/OWL pt. 2

<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:ss="http://yz3rj4vl2y.search.serialssolutions.com/serialsSolution/elements/1.0/#"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
  xmlns:owl="http://www.w3.org/2002/07/owl#"
  xml:base="http://yz3rj4vl2y.search.serialssolutions.com/serialsSolution#">
  <owl:Ontology rdf:about=""/>
  <owl:Class rdf:ID="AppliedMathematics">
    <rdfs:subClassOf rdf:resource="#Mathematics"/>
    <rdfs:comment>An Example of OWL Ontology</rdfs:comment>
    <rdfs:label>Applied Mathematics</rdfs:label>
  </owl:Class>
  …
</rdf:RDF>

Other constructs: <owl:ObjectProperty>, <rdfs:domain>, <rdfs:range>, <owl:DatatypeProperty>, <owl:FunctionalProperty>
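
Generating such class declarations by hand does not scale to a full category list, so a script is the natural next step. The sketch below builds the owl:Class element for one subject term with Python's standard xml.etree.ElementTree. The helper names and the ID convention (capitalized words joined together) are assumptions chosen to reproduce the AppliedMathematics example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Use the conventional prefixes when serializing.
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def class_id(term):
    """Turn 'Applied mathematics' into 'AppliedMathematics'."""
    return "".join(word.capitalize() for word in term.split())


def owl_class(term, parent=None):
    """Build an owl:Class element, optionally as a subclass of parent."""
    cls = ET.Element(f"{{{OWL}}}Class", {f"{{{RDF}}}ID": class_id(term)})
    if parent is not None:
        ET.SubElement(
            cls,
            f"{{{RDFS}}}subClassOf",
            {f"{{{RDF}}}resource": "#" + class_id(parent)},
        )
    label = ET.SubElement(cls, f"{{{RDFS}}}label")
    label.text = term
    return cls


root = ET.Element(f"{{{RDF}}}RDF")
root.append(owl_class("Applied mathematics", parent="Mathematics"))
xml_text = ET.tostring(root, encoding="unicode")
```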


Promises of SW Layered Cake: Standards and Technologies (17) A Sample Web Ontology Language (OWL) in XML

OWL Web Ontology Language: Semantics and Abstract Syntax: http://www.w3.org/TR/owl-semantics/

W3C OWL Web Site: http://www.w3.org/2004/OWL/

SWRL: A Semantic Web Rule Language Combining OWL and RuleML: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage

Tools

Protégé OWL – Ontology Editor for the Semantic Web: http://protege.stanford.edu/plugins/owl/swrl/

Protégé-Frames: user interface and knowledge server to support users in constructing and storing frame-based domain ontologies, customizing data entry forms, and entering instance data: http://protege.stanford.edu/overview/protege-frames.html


Protégé 3.0 beta – family.swrl


SWRL Editor: Protégé 3.0 beta – family.swrl


Promises of SW Layered Cake: Standards (18) SKOS (Simple Knowledge Organization System)

www.w3.org/Consortium/Offices/Presentations/RDFTutorial/#146


Promises of SW Layered Cake: Standards (19) SKOS (Simple Knowledge Organization System)

www.w3.org/Consortium/Offices/Presentations/RDFTutorial/#147


Promises of SW Layered Cake: Standards (20) Topic Map from Mulberrytech


Semantic Web for Managing Library Resources on the Websites

Mark up and apply accurate metadata/subject analysis terms with manual and semi-automatic tools (JN title list);

Develop common semantic structures and data dictionaries (e.g. Master Classification Scheme – LCC);

Taxonomy work results in a machine-addressable schema that enables cross-application transactions; Web services infrastructure is needed to make content portable (e.g. uPortal, library website, etc.);

Content tagging with topic (LCSH, MeSH, AAT, etc.) and LC classification markers;

Aggregation of content through portal/data warehouse channels using Simple Knowledge Organization System (SKOS);

Add facets to a category, e.g. Location -> Type;
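
Tagging with LC classification markers can be sketched as a lookup from the class letters of a call number to a category label, after which resources of any type can be grouped under one scheme. The table below is a small excerpt of the LCC outline; the function names, the sample items, and their call numbers are hypothetical and serve only as illustration.

```python
import re

# Small excerpt of the LCC outline; a real table would cover every class.
LCC_CATEGORIES = {
    "H": "Social sciences",
    "HB": "Economic theory. Demography",
    "Q": "Science",
    "QA": "Mathematics",
}


def lcc_category(call_number):
    """Return (class_letters, label) for the longest known LCC prefix, else None."""
    m = re.match(r"[A-Z]{1,3}", call_number.strip().upper())
    if m is None:
        return None
    letters = m.group(0)
    while letters:
        if letters in LCC_CATEGORIES:
            return letters, LCC_CATEGORIES[letters]
        letters = letters[:-1]  # fall back to the broader class
    return None


def group_by_category(items):
    """Group (title, resource_type, call_number) items by LCC label."""
    groups = {}
    for title, resource_type, call_number in items:
        category = lcc_category(call_number)
        label = category[1] if category else "To be categorized"
        groups.setdefault(label, []).append((resource_type, title))
    return groups


# Hypothetical items of different resource types.
ITEMS = [
    ("Decisions in Economics and Finance", "journal", "HB1"),
    ("Calculus I", "course", "QA303"),
    ("Unclassified newsletter", "journal", ""),
]
groups = group_by_category(ITEMS)
```

Grouping by label first and resource type second mirrors the facet idea on this slide: one category, several kinds of resource beneath it.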


A Sample Snapshot of LC Classification Scheme to Encompass All Library Resources on the Website - Math


A Sample Snapshot of Categorized Course Titles


A Sample Snapshot of Categorized Faculty Specialty by LCC


A Sample Snapshot of Categorized Books Checked Out By Faculty by LCC


A Sample Snapshot of ‘To be Categorized’ JN Titles


A Case Study for St. John’s University Library with a Sample Conceptual Model; No Live Applications Built due to Time, Resource, and Tooling Constraints

1. Continue to maintain a single version of truth for e-holdings and print holdings, e.g. Serials Solutions and Voyager;
2. Add named entity for product name – MARC 730 field;
3. Add subject category browse – MARC 753 field;
4. Add facet terms from other thesauri;
5. Add authority control;
6. Output e-holdings to library website, WebVoyage, and WorldCat;
7. Develop a classification scheme for all resources on the library website in conformance with other resources at the enterprise level;
8. Develop Web service infrastructure to dynamically insert, update, and delete content residing in ECM/Portal/Data warehouse and interchange data among content partners within and outside the institution;
9. ETL and data cleansing, and automate the process as much as possible with SaaS solution providers.
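
Steps 1 and 9 can be pictured as a small ETL pass that merges e-holdings and print holdings on a shared key. The sketch below uses the ISSN as that key. This is an assumption and not the authors' stated method; the record layout, function names, and ISSN values are hypothetical placeholders, and no check-digit validation is attempted.

```python
def normalize_issn(issn):
    """Return the ISSN as 'NNNN-NNNN', or None if it does not have 8 characters."""
    digits = issn.replace("-", "").strip().upper()
    return f"{digits[:4]}-{digits[4:]}" if len(digits) == 8 else None


def merge_holdings(e_holdings, print_holdings):
    """Merge two holdings lists into one dict keyed by normalized ISSN."""
    merged = {}
    for source, records in (("electronic", e_holdings), ("print", print_holdings)):
        for record in records:
            issn = normalize_issn(record["issn"])
            if issn is None:
                continue  # left for manual data cleansing
            entry = merged.setdefault(
                issn, {"title": record["title"], "formats": set()}
            )
            entry["formats"].add(source)
    return merged


# Hypothetical records; the ISSNs are placeholders, not real identifiers.
E_HOLDINGS = [{"issn": "1234-5678", "title": "Journal A"}]
PRINT_HOLDINGS = [
    {"issn": "12345678", "title": "Journal A"},
    {"issn": "8765-4321", "title": "Journal B"},
    {"issn": "bad", "title": "Journal C"},
]
merged = merge_holdings(E_HOLDINGS, PRINT_HOLDINGS)
```

Titles that appear under both formats end up in one entry, which is the single record that steps 2 to 6 would then enrich and publish.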


The Palace Museum (Beijing), 《Qingming Shang He Tu Bu Quan Juan》. Authors: Zhang Zeduan, Luo Dongping. Website: http://www.qingmingtu.com/english/index.htm


References to Typical Set of Automatic Tools and Methodologies Supporting Semantic Web Application Development (1)

Starting Point for Processes:
1. Project Management and Enterprise Architecture
2. Content Capturing
3. Content Management Systems
4. Search Engine Services
5. Portal development
6. BPM/SOA
7. CRM (Customer Relationship Management)
8. Service Resolution Management

Starting Point for Methodologies:
1. RUP (Rational Unified Process) and Agile Software Dev.;
2. Develop project management, enterprise architecture, SW development and deployment platforms;
3. Model the data, processes, systems, and people associated with the SW applications in UML and Entity Diagrams;
4. Develop requirements, use cases, functional and technical specifications, test cases, and deployment, release, and acceptance plans;
5. Develop applications with a process-specific set of tools;
6. …


References to Typical Set of Automatic Tools and Methodologies Supporting Semantic Web Application Development (2)

Starting point for tools

1. Check out all the tools that I mentioned in presentation slides 2 and 3;
2. Go to the companies’ websites, download and test their tools;
3. Identify and develop your own stack of tools;
4. Try:
• Protégé OWL – Ontology Editor for the Semantic Web: http://protege.stanford.edu/plugins/owl/swrl/
• Protégé-Frames: user interface and knowledge server to support users in constructing and storing frame-based domain ontologies, customizing data entry forms, and entering instance data: http://protege.stanford.edu/overview/protege-frames.html
5. If you are an Oracle user, try protégé_oracle_rdf_plugin, ntriple_converter, and the Oracle RDF Batch Loader Package.