
Upload: esther-little

Post on 19-Jan-2016


Page 1: Measure Up! Data Analysis Tools to Optimize Library Management Dr. Lesley FarmerCalifornia State University Long Beach Lesley.Farmer@csulb.edu

Measure Up! Data Analysis Tools to Optimize Library Management

Dr. Lesley Farmer

California State University Long Beach

Lesley.Farmer@csulb.edu

Page 2:

Agenda

Research data analytics to assess California school libraries and identify variables to improve their impact

Data analysis statistics

Choosing data analysis tools

Page 3:

Research Questions Based on 2007 and 2012 California School Libraries Data

What significant trends between 2007 and 2012 exist in California school library programs?

What is the profile of consistently high-effectiveness (and low-effectiveness) school library programs?

What are the predictors of high and low school library impact over time?

Page 4:

Needs

Trend analysis of California school libraries

Predictive models of impactful California school libraries, which might be generalizable

Increased use of data analytics to improve libraries

Page 5:

Method

Use California State Department of Education annual school library survey report datasets (2007-2008 and 2011-2012)

Code survey variables, e.g., meets standard or not

Compare school libraries that meet the state model school library standards baseline criteria with those that do not

Use several statistical techniques: clustering analysis, decision trees, logistic regression
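The coding step can be pictured with a small sketch. The field names (`has_online_catalog`, `books_per_student`) and the threshold are hypothetical placeholders, not the actual survey fields or the state standards criteria.

```python
# Hypothetical survey records; field names and values are illustrative only.
reports = [
    {"school": "A", "has_online_catalog": True, "books_per_student": 31.0},
    {"school": "B", "has_online_catalog": False, "books_per_student": 12.5},
    {"school": "C", "has_online_catalog": True, "books_per_student": 9.0},
]

# Assumed baseline criterion (a placeholder, not the real state standard).
MIN_BOOKS_PER_STUDENT = 20.0


def meets_standard(report: dict) -> int:
    """Code a report as 1 if it passes every assumed criterion, else 0."""
    checks = [
        report["has_online_catalog"],
        report["books_per_student"] >= MIN_BOOKS_PER_STUDENT,
    ]
    return int(all(checks))


# Binary coding, then the two comparison groups.
coded = {r["school"]: meets_standard(r) for r in reports}
met = [school for school, flag in coded.items() if flag == 1]
not_met = [school for school, flag in coded.items() if flag == 0]
```

The two resulting groups are what the later techniques compare.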

Page 6:

Sample California School Library Reports Distribution

Page 7:

64 Independent Variables

Page 8:

Page 9:

Dependent Variables

Meet Standard or Not (binary)

API (Academic Performance Index)

Socio-economic API decile

Page 10:

Clustering

K-nearest neighbor (k-NN) uses distances between observations, computed across their variables, to group similar observations together.

Observations with smaller distances between them are assumed to be similar, so a closer look at the individual clusters may reveal important characteristics.
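The distance idea can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance and two made-up variables per library (weekly open hours, books per student); the data are hypothetical, not from the survey.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two observations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(target, observations, k):
    """Return the indices of the k observations closest to target."""
    order = sorted(
        range(len(observations)),
        key=lambda i: euclidean(target, observations[i]),
    )
    return order[:k]


# Hypothetical observations: (weekly open hours, books per student)
libraries = [(30.0, 20.0), (32.0, 22.0), (10.0, 5.0), (12.0, 6.0)]
neighbors = k_nearest((31.0, 21.0), libraries, k=2)
```

In practice, variables measured on different scales are usually standardized first so that no single variable dominates the distance.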

Page 11:

Ward Method of Clustering

Measures the distance between two clusters

Observations with the least differences are clustered

Joins “close” clusters so that the resulting within-cluster variance is minimized
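A minimal sketch of Ward-style agglomeration, assuming the standard merge cost: the increase in total within-cluster sum of squares, n_a * n_b / (n_a + n_b) times the squared distance between centroids. The points are hypothetical, and this illustrates the criterion rather than the software used in the study.

```python
from itertools import combinations


def centroid(cluster):
    """Mean of the points in a cluster, one coordinate per variable."""
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n for d in range(len(cluster[0])))


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_cluster(points, n_clusters):
    """Start with singletons and repeatedly merge the cheapest pair."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical two-variable observations
points = [(1.0, 1.0), (1.5, 1.0), (8.0, 8.0), (8.5, 8.0), (9.0, 9.0)]
groups = ward_cluster(points, n_clusters=2)
```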

Page 12:

Important Ward-based Variables

Enhanced access: on weekends, summer

Book budget

Page 13:

Centroid Method of Clustering

Measure distance between the centroids (means) of each cluster

Join the 2 nearest clusters
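One agglomeration step under the centroid method might look like the sketch below. The clusters are hypothetical, and Euclidean distance between centroids is assumed.

```python
from itertools import combinations


def centroid(cluster):
    """Mean of the points in a cluster, one coordinate per variable."""
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n for d in range(len(cluster[0])))


def centroid_distance(a, b):
    """Euclidean distance between the centroids of clusters a and b."""
    ca, cb = centroid(a), centroid(b)
    return sum((x - y) ** 2 for x, y in zip(ca, cb)) ** 0.5


def merge_nearest(clusters):
    """Join the two clusters whose centroids are closest; keep the rest."""
    i, j = min(
        combinations(range(len(clusters)), 2),
        key=lambda ij: centroid_distance(clusters[ij[0]], clusters[ij[1]]),
    )
    rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
    return rest + [clusters[i] + clusters[j]]


# Hypothetical clusters of two-variable observations
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(1.0, 3.0)], [(10.0, 10.0)]]
after = merge_nearest(clusters)
```

Repeating `merge_nearest` until the desired number of clusters remains gives the full hierarchy.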

Page 14:

Centroid Cluster-based Variables

Positive:

Access during breaks

Internet access

Online productivity tools

Reference help

Negative:

No access before OR after school

No Internet access

No online library catalog

No “extra” funding

Page 15:

Decision Trees

Flowchart of decisions and possible consequences

Node = test, branch = outcome, leaf = decision

Path from root to leaf is a classification rule

Split data into training set and test set

Select “information gain” attribute to separate data

Do tree pruning for optimal selection (aim for homogeneous class)

Useful for predictions
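The "information gain" step can be sketched as follows, assuming the usual entropy-based definition: gain is the entropy of the class labels minus the weighted entropy after splitting on an attribute. The rows and attribute names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after the split."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical coded survey rows
rows = [
    {"online_catalog": "yes", "summer_access": "yes", "met_standard": "yes"},
    {"online_catalog": "yes", "summer_access": "no", "met_standard": "yes"},
    {"online_catalog": "no", "summer_access": "yes", "met_standard": "no"},
    {"online_catalog": "no", "summer_access": "no", "met_standard": "no"},
]

# The attribute with the highest gain becomes the test at the current node.
best = max(
    ["online_catalog", "summer_access"],
    key=lambda a: information_gain(rows, a, "met_standard"),
)
```

Here `online_catalog` separates the classes perfectly, so it is selected; `summer_access` leaves each branch as mixed as before and gains nothing.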

Page 16:

CART (Classification & Regression Trees): Important Independent Variables

Dependent variable: met standards or not

Online library catalog

Internet access

Online DBs

Video DBs

Budget (and funding sources)

Collection currency

Reference help

Page 17:

C4.5 Decision Tree (more than binary splits): Important Independent Variables

Dependent variable: met standards or not

Budget (and funding sources)

Collection currency

Online library catalog

Reference help

# of books

Page 18:

Logistic Regression

Probabilistic statistical classification model

Measures relationship between categorical dependent variable and independent (continuous or categorical) variables

Regression line is nonlinear

Run with combination of main effects

Aim for best fit

Predicts outcome of categorical dependent variable
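A minimal sketch of the idea, assuming one predictor and a binary outcome: the model is p(y = 1 | x) = 1 / (1 + exp(-(b0 + b1 * x))), fitted here by plain gradient ascent on the log-likelihood. The data, learning rate, and epoch count are hypothetical; a statistics package would normally do the fitting.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit p(y=1|x) = sigmoid(b0 + b1*x) by batch gradient ascent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals (observed minus predicted) give the log-likelihood gradient.
        errors = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += learning_rate * sum(errors) / n
        b1 += learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
    return b0, b1


# Hypothetical data: x = book budget in $1,000s, y = 1 if standard was met.
budgets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
met = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(budgets, met)
p_low = sigmoid(b0 + b1 * 2.0)   # predicted probability at a small budget
p_high = sigmoid(b0 + b1 * 7.0)  # predicted probability at a large budget
```

The fitted curve is S-shaped rather than a straight line, which is the sense in which the regression line is nonlinear.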

Page 19:

Main Effects: Different ways to determine the best logistic regression model

Backward Selection: start with all variables and remove insignificant ones

Forward Selection: start with 1 significant variable and add variables until the model is complete

Stepwise Selection: add or remove a variable at each step, depending on whether the change improves the model
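Forward selection can be sketched generically as below. The scoring function is a toy stand-in with made-up contributions; in a real analysis the score would come from refitting the logistic regression and using a fit statistic such as a likelihood-ratio test or AIC. Variable names are hypothetical.

```python
def forward_selection(candidates, score, min_improvement=0.01):
    """Greedily add the variable that most improves the score; stop when
    no remaining variable improves it by at least min_improvement."""
    selected = []
    remaining = list(candidates)
    best_score = score(selected)
    while remaining:
        trial_scores = {v: score(selected + [v]) for v in remaining}
        best_var = max(trial_scores, key=trial_scores.get)
        if trial_scores[best_var] - best_score < min_improvement:
            break
        selected.append(best_var)
        remaining.remove(best_var)
        best_score = trial_scores[best_var]
    return selected


# Toy stand-in for model fit: hypothetical additive contributions.
CONTRIBUTION = {
    "budget": 0.20,
    "online_catalog": 0.15,
    "staffing": 0.05,
    "school_mascot": 0.0,
}


def toy_score(variables):
    """Pretend fit quality; a real score would come from a fitted model."""
    return 0.5 + sum(CONTRIBUTION[v] for v in variables)


chosen = forward_selection(CONTRIBUTION, toy_score)
```

Backward selection runs the same loop in reverse, starting from all variables and dropping the one whose removal hurts the score least.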

Page 20:

ROC (Receiver Operating Characteristics)

Use to compare models

Distinguishes classifiers that are optimal under some class from sub-optimal classifiers

Plots true-positive rate versus false-positive rate for 2 classes
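A minimal sketch of how an ROC curve is built from model scores, assuming higher scores mean "more likely positive" and that no two scores are tied. The scores and labels are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) pairs obtained by
    lowering the decision threshold one observation at a time.
    Assumes distinct scores; tied scores would need to be grouped."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in sorted(zip(scores, labels), key=lambda p: -p[0]):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


# Hypothetical predicted probabilities and true classes (1 = met standard)
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
area = auc(curve)
```

When comparing models, a curve closer to the top-left corner (and a larger area) indicates better separation of the two classes.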

Page 21:

CART Best Model: Ultimate Important Predictive Variables

Dependent variable: API

Staffing

Online library catalog

Collection currency

Internet access

Online DBs

Budget (and fund sources)

Reference help

Page 22:

What data do you collect?

Circulation figures

Patron usage

Facilities usage

Computer usage

Internet usage

Reference consultations and fill

Library guides/bibliographies use

Instructional sessions

Website hits (including tutorials)

Database usage vs cost

ILL processing and turnaround time

Ordering, processing, cataloging, preservation, weeding workflow and time

Ebook usage vs cost

Library software usage vs cost

Staff scheduling

Equipment maintenance and repairs

Page 23:

What tools do you use to collect data?

Surveys

Web statistics

Circulation statistics

Interviews

Observation

LibQual / LibPAS

Flowfinity

Document collecting

Page 24:

What do you DO with that data?

Descriptive statistics

Analyze workflow for efficiency

Reveal trends

Benchmark efforts

Control quality

Do cost-benefit analysis

Analyze student learning

Optimize scheduling

Optimize queuing

Page 25:

AASL Longitudinal Data

Data: demographics, staff, resources, services

Use: trends over time, correlations between staff and resources/services

Demographic correlations with staffing, resources and services

AASL membership correlations with staffing, resources and services

Page 26:

Copyright Median by State

Page 27:

$/Student by Region 2009-2012

Page 28:

# of Books/Student by School Level 2009-12

Page 29:

Techniques

Correlation analysis (for relationship between continuous variables)

Multiple Regression (continuous response variable), Logistic Regression (categorical response variable)

Decision Trees

Principal Components, Factor Analysis

Hypothesis testing (paired tests, two-sample tests, ANOVA)

Chi-Square tests of independence (for relationship between categorical variables)
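Two of these techniques are small enough to sketch directly: the Pearson correlation coefficient for two continuous variables, and the chi-square statistic for independence in a contingency table. The numbers are hypothetical, and a statistics package would also report p-values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two continuous variables."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def chi_square_statistic(table):
    """Chi-square statistic: sum of (observed - expected)^2 / expected,
    where expected counts assume rows and columns are independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical continuous variables: budget vs circulation
budget = [1.0, 2.0, 3.0, 4.0]
circulation = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(budget, circulation)

# Hypothetical 2x2 table. Rows: online catalog yes/no;
# columns: met standard yes/no.
table = [[30, 10], [10, 30]]
chi2 = chi_square_statistic(table)
```

For a 2x2 table (1 degree of freedom), a statistic above about 3.84 is significant at the 0.05 level, so the hypothetical table here would indicate a relationship.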

Page 30:

Graphs

Box Plots

Stem and Leaf Plots

Histograms/Bar Graphs

Pareto Charts

Pie Charts

Time Series Plot

Outlier assessment

Page 31:

Page 32:

Page 33:

Stem-and-Leaf Plot
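A stem-and-leaf plot is simple enough to build as text. The sketch below assumes non-negative whole numbers, with tens as stems and units as leaves; the data are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return text lines with tens digits as stems and units as leaves.
    Assumes non-negative integers."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}".rstrip())
    return lines


# Hypothetical data: books per student at 10 schools
books_per_student = [12, 15, 21, 21, 24, 30, 33, 38, 47, 9]
plot = stem_and_leaf(books_per_student)
```

Printing the lines with `print("\n".join(plot))` shows the shape of the distribution while keeping every individual value readable.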


Page 34:

Page 35:

KM ANALYSIS APPROACH: DATA ANALYTIC TOOLS

Cause identification: Fishbone diagram, correlation analysis, regression analysis, ANOVA, clustering, principal components

Cost-benefit analysis / ROI: Pugh matrix, Pearson correlation

Customer satisfaction: Regression analysis, Likert techniques, chi square

Decision: Decision tree, Pugh matrix

Error and tolerance analysis: Pareto analysis, control chart

Failure analysis: Pareto analysis, control chart, clustering

Job analysis: Demerit systems, flow chart

Process capacity: Process capacity

Quality analysis: Pugh matrix, control chart

Quality control: Control chart, run chart

Quantity analysis: Histogram, run chart

Queuing: Poisson distribution

Scalability: Process capability

Time analysis: Run chart, Poisson distribution, activity network diagram

Work flow and process analysis: Fishbone diagram, activity network diagram, flow chart, run chart
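As one example from the list, the Poisson distribution for queuing can be sketched as below: the probability of exactly k arrivals in an interval is rate^k * exp(-rate) / k!. The arrival rate (4 patrons per hour at a reference desk) and the capacity of 6 are hypothetical.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k arrivals when the average is `rate`."""
    return rate ** k * math.exp(-rate) / math.factorial(k)


def prob_more_than(k, rate):
    """Probability of more than k arrivals in the interval."""
    return 1.0 - sum(poisson_pmf(i, rate) for i in range(k + 1))


# Hypothetical: on average 4 patrons arrive at the reference desk per hour.
p_none = poisson_pmf(0, 4.0)         # chance of an idle hour
p_overload = prob_more_than(6, 4.0)  # chance of more than 6 arrivals
```

A staffing question such as "how often will more patrons arrive than one librarian can serve?" then reduces to reading off `p_overload` for the assumed capacity.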

Page 36:

Next Steps

Let’s talk!

http://www.librarydataanalytics.com/