metridoc: a framework for managing and exposing library event data

Upload: jena-juarez

Post on 04-Jan-2016

TRANSCRIPT

Page 1: METRIDOC: A Framework for Managing and Exposing Library Event Data

METRIDOC: A Framework for Managing and Exposing Library Event Data

With the support of

University of Pennsylvania Libraries

Page 2: METRIDOC: A Framework for Managing and Exposing Library Event Data

METRIDOC University of Pennsylvania Libraries

Metrics start with a basic abstraction:

The Event

Page 3: METRIDOC: A Framework for Managing and Exposing Library Event Data

xxx.xx.xxx.xxx|-|zucca|[26/Jul/2007:15:41:01 -0500]| GET https://proxy.library.upenn.edu:443/login?proxySessionID=10335905&url=http://www.csa.com/htbin/dbrng.cgi?username=upenn3&access=upenn34&cat=psycinfo&adv=1 HTTP/1.1| 302|0|http://www.library.upenn.edu/cgibin/res/sr.cgi?community=59| Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/418.9.1 (KHTML, like Gecko) Safari/419.3| NGpmb6dT6JXswQH|__utmc=94565761;ezproxy=NGpmb6dT6JXswQH; hp=/; proxySessionID=10335514; __utmc=247612227; __utmz=247612227.1184251774.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none);UPennLibrary=AAAAAUaWP5oAACa4AwOOAg==; sfx_session_id=s6A37A3E0-3B8E-11DC-80E985076F88F67F

The Event as raw data: viewing an e-journal article
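
The raw line above can be split into named fields. The following is a minimal sketch, not MetriDoc's own parser: the field names are assumptions inferred from the sample line, and `SAMPLE` is an abbreviated copy of it. The cookie field at the end contains `|` characters itself, so the split is capped.

```python
from datetime import datetime

# Field names are assumptions inferred from the sample line, not a documented schema.
FIELDS = ["ip", "ident", "user", "timestamp", "request", "status",
          "bytes", "referer", "user_agent", "session", "cookies"]


def parse_proxy_line(line):
    """Split one pipe-delimited proxy log line into a dict of named fields."""
    # The trailing cookie field contains '|', so cap the number of splits.
    parts = [p.strip() for p in line.split("|", len(FIELDS) - 1)]
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    event = dict(zip(FIELDS, parts))
    # "[26/Jul/2007:15:41:01 -0500]" -> timezone-aware datetime
    event["timestamp"] = datetime.strptime(
        event["timestamp"].strip("[]"), "%d/%b/%Y:%H:%M:%S %z")
    # "GET <url> HTTP/1.1" -> method, url, protocol
    method, rest = event["request"].split(" ", 1)
    url, protocol = rest.rsplit(" ", 1)
    event.update(method=method, url=url, protocol=protocol)
    event["status"] = int(event["status"])
    event["bytes"] = int(event["bytes"])
    return event


# Abbreviated version of the slide's sample line (URL, user agent, cookies shortened).
SAMPLE = ("xxx.xx.xxx.xxx|-|zucca|[26/Jul/2007:15:41:01 -0500]| "
          "GET https://proxy.library.upenn.edu:443/login?url=http://www.csa.com/ HTTP/1.1| "
          "302|0|http://www.library.upenn.edu/cgibin/res/sr.cgi?community=59| "
          "Mozilla/5.0 (Macintosh)| NGpmb6dT6JXswQH|"
          "__utmc=94565761; __utmz=1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none)")

event = parse_proxy_line(SAMPLE)
print(event["user"], event["status"], event["method"], event["timestamp"].isoformat())
```

Capping the split keeps the cookie string intact as a single field; a real log format with other embedded delimiters would need its own handling.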

Page 4: METRIDOC: A Framework for Managing and Exposing Library Event Data

An Event Abstracted

EVENT

User & Program Parameters: College | Dept, Rank, Course, Host College, Host Dept, Instructor, Grant Sponsor

Library Parameters: Service Genre, Cognizant Staff, Organizational Unit, Budget Center

Environmental Parameters: Date | Time, Location, IP Domain, URL

Bibliographic Parameters: Title, URI, Format, Cost | Supplier
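
The four parameter groups on this slide suggest a simple record type. The sketch below is illustrative only: the class and field names are hypothetical, chosen from the slide labels, and do not describe MetriDoc's actual schema. Only a subset of the labels is modelled.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class UserProgram:
    college: Optional[str] = None
    dept: Optional[str] = None
    rank: Optional[str] = None
    course: Optional[str] = None


@dataclass
class Library:
    service_genre: Optional[str] = None
    organizational_unit: Optional[str] = None
    budget_center: Optional[str] = None


@dataclass
class Environment:
    timestamp: Optional[datetime] = None
    location: Optional[str] = None
    ip_domain: Optional[str] = None
    url: Optional[str] = None


@dataclass
class Bibliographic:
    title: Optional[str] = None
    uri: Optional[str] = None
    format: Optional[str] = None
    supplier: Optional[str] = None


@dataclass
class Event:
    """One library event, described along the four parameter groups."""
    user_program: UserProgram = field(default_factory=UserProgram)
    library: Library = field(default_factory=Library)
    environment: Environment = field(default_factory=Environment)
    bibliographic: Bibliographic = field(default_factory=Bibliographic)


# Illustrative values only; unknown parameters simply stay None.
event = Event(
    environment=Environment(timestamp=datetime(2007, 7, 26, 15, 41, 1),
                            ip_domain="upenn.edu"),
    bibliographic=Bibliographic(title="PsycINFO", format="database"),
)
print(event.bibliographic.title, event.user_program.rank)
```

Leaving every field optional reflects that a single source system rarely supplies all four groups; the missing ones would be filled in later from other systems.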

Page 5: METRIDOC: A Framework for Managing and Exposing Library Event Data

The “Event” is represented in machine-readable data, stored in a plethora of business systems.

Source targets:

• Link resolver
• Proxy server
• COUNTER
• ILS (Voyager, I3, Kuali-OLE)
• Resource sharing system
• Web server
• Social networking services
• Spreadsheets, databases
• Other targets…

Event types:

• E-resource use by service, demographic, package
• Expenditures & inventory planning / reader interest data
• Supply chain data
• Discovery systems & content use
• Research & instructional data / learning management
• Other events…

Page 6: METRIDOC: A Framework for Managing and Exposing Library Event Data

MetriDoc is a framework for:

Extracting event data from systems

Transforming those data into readable, normalized formats

Loading the transformed, normalized payload into a repository

Supporting analysis through local and collaborative dissemination channels
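
The extract, transform, load sequence described on this slide can be sketched in a few lines. This is a generic illustration, not MetriDoc code: the input lines, the `events` table, and all function names are hypothetical, and an in-memory SQLite database stands in for the repository.

```python
import sqlite3

# Hypothetical raw input: date|patron|resource|status
RAW_LINES = [
    "2007-07-26|zucca|PSYCINFO |302",
    "2007-07-26|smith|jstor|200",
]


def extract(lines):
    """Extract: split each raw line into its fields."""
    for line in lines:
        yield line.split("|")


def transform(records):
    """Transform: clean values and give them names and types."""
    for date, patron, resource, status in records:
        yield {"date": date, "patron": patron,
               "resource": resource.strip().lower(), "status": int(status)}


def load(rows, conn):
    """Load: write the normalized rows into the repository."""
    conn.execute("CREATE TABLE IF NOT EXISTS events "
                 "(date TEXT, patron TEXT, resource TEXT, status INTEGER)")
    conn.executemany(
        "INSERT INTO events VALUES (:date, :patron, :resource, :status)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_LINES)), conn)

# Analysis then runs against the repository rather than the raw logs.
counts = conn.execute(
    "SELECT resource, COUNT(*) FROM events GROUP BY resource ORDER BY resource"
).fetchall()
print(counts)
```

Each stage is a generator, so rows stream through without the whole log being held in memory.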

Page 7: METRIDOC: A Framework for Managing and Exposing Library Event Data

Improved Data Resolution Through Integration

Increased scope of sources

Synthesis of vectors, e.g. expenditure per use, resource use by communities

Contextualized data with greater statistical dimension and descriptive power

Collaborative assessment
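
"Expenditure per use" is a good example of a measure that exists only once two sources are combined. A minimal sketch with made-up figures, assuming expenditures come from an acquisitions system and use counts from proxy or COUNTER data (all names and numbers are hypothetical):

```python
# Hypothetical annual expenditure per resource, e.g. from an acquisitions system.
expenditures = {"psycinfo": 12000.00, "jstor": 8000.00}

# Hypothetical use counts per resource, e.g. from proxy logs or COUNTER reports.
uses = {"psycinfo": 4800, "jstor": 0}


def cost_per_use(expenditures, uses):
    """Join the two sources on resource name and divide cost by use count."""
    result = {}
    for resource, cost in expenditures.items():
        count = uses.get(resource, 0)
        # A resource with no recorded use has no meaningful cost per use.
        result[resource] = round(cost / count, 2) if count else None
    return result


per_use = cost_per_use(expenditures, uses)
print(per_use)
```

Neither input alone can answer the question; the join on a shared resource identifier is what integration buys.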

Page 8: METRIDOC: A Framework for Managing and Exposing Library Event Data

Our legacy system: Datafarm

[Diagram labels: Perl scripts in three groups, each with cron; Voyager; Farmer; Quaker; App Logs]

Page 9: METRIDOC: A Framework for Managing and Exposing Library Event Data

Datafarm Shortcomings

Maintainability issues
• Scripts that depend on each other are located in different places
• Perl is very productive as long as you are maintaining your own code
• Doing the same thing over again, no code reuse
• Lack of notification for success and failure

Not shareable
• No safe way to expose data for collaboration
• Generating data for a report can be a job in itself
• Schemas are not stored in a shareable format

Not reusable
• Doing the same thing over and over again without building libraries for common tasks
• No central code repository to share libraries within and outside of UPenn

Page 10: METRIDOC: A Framework for Managing and Exposing Library Event Data

What do we need, and who takes care of it?

A central scheduler: Jenkins

Notifications of job success or failure: Jenkins

Batch job / ETL scripting framework: Metridoc

Exposing data: Metridoc (Google data format)

Reporting / graphs: Google Charts / R / Tableau / other stat packages

Central code repository: Maven Central via Sonatype hosting
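
As an illustration of exposing data in a chart-ready form, the sketch below builds the JSON `DataTable` layout (`cols` and `rows`) that Google Charts accepts. It assumes that "Google data format" on this slide refers to that layout; the function name and sample figures are hypothetical, and this is not MetriDoc's implementation.

```python
import json


def to_datatable(columns, rows):
    """Build a Google Charts DataTable-style dict.

    columns: list of (id, label, type) tuples
    rows:    list of value tuples, one value per column
    """
    return {
        "cols": [{"id": cid, "label": label, "type": ctype}
                 for cid, label, ctype in columns],
        "rows": [{"c": [{"v": value} for value in row]} for row in rows],
    }


# Hypothetical query result: uses per resource.
columns = [("resource", "Resource", "string"), ("uses", "Uses", "number")]
rows = [("psycinfo", 4800), ("jstor", 3100)]

table = to_datatable(columns, rows)
payload = json.dumps(table)  # what a web endpoint would return
print(payload)
```

Because the payload is plain JSON, the same endpoint could also feed R, Tableau, or other packages that read JSON.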

Page 11: METRIDOC: A Framework for Managing and Exposing Library Event Data

Current System: Metridoc

[Diagram labels: Perl scripts in three groups, each with cron; Voyager; Farmer; Quaker; App Logs]

Page 12: METRIDOC: A Framework for Managing and Exposing Library Event Data

Metridoc Philosophy

Page 13: METRIDOC: A Framework for Managing and Exposing Library Event Data

Scripting Framework

Page 14: METRIDOC: A Framework for Managing and Exposing Library Event Data

Scripting Example

Page 15: METRIDOC: A Framework for Managing and Exposing Library Event Data

Scripting Example

Page 16: METRIDOC: A Framework for Managing and Exposing Library Event Data

Exposing data

Page 17: METRIDOC: A Framework for Managing and Exposing Library Event Data

Metrics on the cheap (Google Charts)

Page 18: METRIDOC: A Framework for Managing and Exposing Library Event Data

Thoughts on complex statistics

Page 19: METRIDOC: A Framework for Managing and Exposing Library Event Data

The future

Page 20: METRIDOC: A Framework for Managing and Exposing Library Event Data

Abstracts 4 key functions, exposes interfaces for interoperability

1. Extract: Target source (e.g. Relais, ILLiad, ILS), Ingest log, Parse, Format, Refined output

2. Transform: Resolution sources (e.g. IdM, WorldCat), Refined output, Resolve codes & IDs, Normalize

3. Load: Query service, Data repo

4. Query: User interface, Local data stores, Results document, Query document
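
The Transform stage ("Resolve codes & IDs", then "Normalize") can be pictured as two small functions. In this sketch the lookup tables are hypothetical stand-ins for resolution sources such as an identity management system or WorldCat; the record layout and all names are assumptions for illustration.

```python
from datetime import datetime

# Hypothetical stand-ins for external resolution sources.
IDENTITY_LOOKUP = {"u1001": {"dept": "HIST", "rank": "faculty"}}
TITLE_LOOKUP = {"12345": "Example Journal"}


def resolve(record, identities=IDENTITY_LOOKUP, titles=TITLE_LOOKUP):
    """Resolve opaque patron and item IDs into descriptive attributes."""
    resolved = dict(record)
    resolved.update(identities.get(record["patron_id"],
                                   {"dept": None, "rank": None}))
    resolved["title"] = titles.get(record["item_id"])
    return resolved


def normalize(record):
    """Trim string values and rewrite the date as ISO 8601."""
    out = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    out["date"] = datetime.strptime(out["date"], "%d/%b/%Y").date().isoformat()
    return out


# Hypothetical refined output of the Extract stage.
raw = {"patron_id": "u1001", "item_id": "12345",
       "date": "26/Jul/2007", "source": " Relais "}

refined = normalize(resolve(raw))
print(refined)
```

Unknown IDs resolve to `None` rather than raising, so one unmatched record does not stop a batch.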

Page 21: METRIDOC: A Framework for Managing and Exposing Library Event Data

Partners are welcome

Sponsor

More at http://code.google.com/p/metridoc/