How to Use Statistics for Library Decision-Making

Diana Very
June 27, 2011


MBA for Librarians: Statistics

This program will demystify statistical concepts and skills and illustrate their library applications. The instructor will show how data can and should influence all areas of library operations. Learn about studies, tools and resources to assist you in comparing your data with that from other institutions. Create information, knowledge and stories from numerical and qualitative data to enhance decision making. The goal: Manage smarter. Led by Diana Very.


Funding provided for this presentation by IMLS through the LSTA program grant

Diana Very
Director of LSTA, Statistics, & Research

How many have never used statistics?

Has anyone received one of these?

100%

Once you had one of these, no one cared how many questions there were on the test. No one cared how hard the test was. No one cared about anything other than you got 100% on your last test!!! Or your child got 100%, or the person next to you that you were trying to cheat off of got 100%. Anyway, this is a statistic.

Statistics tell a story
What
Where
When
How
Why

Let's start with these questions to begin to learn how to tell your library story. The same way that a children's storyteller creates her story, you want to research your topic to become a library storyteller. Find the facts, find the characters, find the fun.

What is a Statistic?

A statistic is a quantity that is calculated from a sample of data. It is used to give information about unknown values in the corresponding population. For example, the average of the data in a sample is used to give information about the overall average in the population from which that sample was drawn.

It is possible to draw more than one sample from the same population, and the value of a statistic will in general vary from sample to sample. For example, the average value in a sample is a statistic. The average values in more than one sample, drawn from the same population, will not necessarily be equal.

This whole group is a population. The left side is a sample of the entire population. The right side is another sample of the entire population. Just because we take the average height of one sample does not mean that we will obtain the same average in the other sample.

Definitions

The worked example that follows treats its eight values as a complete population, so the squared differences are averaged by dividing by 8 (which is n). The result is the population standard deviation; it is equal to the square root of the variance. The formula is valid only if the eight values form the complete population. If they were instead a random sample drawn from some larger parent population, then we should use 7 (which is n - 1) instead of 8 (which is n) in the denominator, and the quantity obtained would be called the sample standard deviation.

Example of Population, Mean, Variance, Standard Deviation

Consider a population consisting of the following eight values: 2, 4, 4, 4, 5, 5, 7, 9

These eight data points have the mean (average) of 5: (2 + 4 + 4 + 4 + 5 + 5 + 7 + 9) / 8 = 5

To calculate the population standard deviation, first compute the difference of each data point from the mean, and square the result of each:

(2 - 5)^2 = (-3)^2 = 9
(4 - 5)^2 = (-1)^2 = 1
(4 - 5)^2 = (-1)^2 = 1
(4 - 5)^2 = (-1)^2 = 1
(5 - 5)^2 = 0^2 = 0
(5 - 5)^2 = 0^2 = 0
(7 - 5)^2 = 2^2 = 4
(9 - 5)^2 = 4^2 = 16

Next compute the average of these values, and take the square root:

(9 + 1 + 1 + 1 + 0 + 0 + 4 + 16) / 8 = 4 = variance
The square root of 4 is 2 = standard deviation

Example of Normal Curve

This quantity is the population standard deviation; it is equal to the square root of the variance.
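If you would rather let a computer do the arithmetic, the eight-value example can be checked with a few lines of Python. This is only a sketch: the variable names are my own, and it uses Python's standard `statistics` module to confirm the hand calculation. It also shows the sample version, which divides by n - 1 instead of n.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]

# Mean (average) of the eight values.
mean = statistics.mean(values)

# Squared difference of each data point from the mean.
squared_diffs = [(x - mean) ** 2 for x in values]

# Population variance and standard deviation: divide by n.
population_variance = sum(squared_diffs) / len(values)
population_sd = math.sqrt(population_variance)

# Sample variance and standard deviation: divide by n - 1.
sample_variance = sum(squared_diffs) / (len(values) - 1)
sample_sd = math.sqrt(sample_variance)

# Cross-check against the standard library's own functions.
assert math.isclose(population_sd, statistics.pstdev(values))
assert math.isclose(sample_sd, statistics.stdev(values))

print("mean:", mean)
print("population variance:", population_variance)
print("population standard deviation:", population_sd)
print("sample standard deviation:", round(sample_sd, 3))
```

The sample standard deviation comes out a little larger than the population figure because the same sum of squares is divided by 7 rather than 8.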

A slightly more complicated real-life example: the average height for adult men in the United States is about 70", with a standard deviation of around 3". This means that most men (about 68%, assuming a normal distribution) have a height within 3" of the mean (67" to 73"), which is one standard deviation, and almost all men (about 95%) have a height within 6" of the mean (64" to 76"), which is two standard deviations. If the standard deviation were zero, then all men would be exactly 70" tall. If the standard deviation were 20", then men would have much more variable heights, with a typical range of about 50" to 90". Three standard deviations account for 99.7% of the sample population being studied, assuming the distribution is normal (bell-shaped).
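The 68%, 95% and 99.7% figures can be reproduced from the normal curve itself. The sketch below assumes heights follow a normal distribution with the mean of 70" and standard deviation of 3" quoted above, and uses `NormalDist` from Python's standard `statistics` module; the function name `share_within` is my own.

```python
from statistics import NormalDist

# Assumed model: adult male height is normal with mean 70" and SD 3".
heights = NormalDist(mu=70, sigma=3)


def share_within(k):
    """Share of the population within k standard deviations of the mean."""
    low = heights.mean - k * heights.stdev
    high = heights.mean + k * heights.stdev
    return heights.cdf(high) - heights.cdf(low)


for k in (1, 2, 3):
    low = heights.mean - k * heights.stdev
    high = heights.mean + k * heights.stdev
    print(f'{k} SD: {low:.0f}" to {high:.0f}" covers {share_within(k):.1%}')
```

The same approach works for any measure that is roughly bell-shaped; only the mean and standard deviation change.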

Normal Curve or Bell Curve

Multivariate Analysis

Not as scary as it sounds

Involves observation and analysis of more than one statistical variable at a time. In design and analysis, the technique is used to perform trade studies across multiple dimensions while taking into account the effects of all variables on the responses of interest.

Example: During a production process, a number of different measurements such as the tensile strength, brittleness, diameter, etc. are taken on the same unit. Collectively such data are viewed as multivariate data.

Pearson Correlation

Public Library Use Determinants

Year                     Population   Total Collection   Total Circulation      Library Visits  Reference Transaction  Public Hours  Total Staff
2010                     10,069,700         18,379,134          47,155,895          39,392,010              9,513,049       882,799        2,995
2009                      9,446,298         17,856,414          47,811,748          40,852,165              8,734,545       907,316        3,105
2008                      9,319,532         17,646,302          43,663,621          36,979,778              7,994,164       905,630        3,109
2007                      9,098,140         17,056,943          40,816,175          35,703,912              8,275,923       896,848        3,018
2006                      8,789,529         16,496,624          40,735,627          31,952,301              8,547,509       887,400        3,038
2005                      8,650,046         16,041,499          41,155,342          31,557,896              8,571,452       868,892        2,796
2004                      8,510,563         16,040,938          40,269,048          31,285,987              8,076,037       871,401        2,821
Mean                      9,126,258         17,073,979          43,086,779          35,389,150              8,530,383       888,612        2,983
Median                    9,098,140         17,056,943          41,155,342          35,703,912              8,547,509       887,400        3,018
Mode                           #N/A               #N/A                #N/A                #N/A                   #N/A          #N/A         #N/A
Variance            541,635,247,329  1,581,668,337,779  19,057,194,894,718  28,421,398,103,038        485,335,050,482   441,888,716       29,827
Standard Deviation       540,046.06         922,858.37        3,203,367.99        3,912,010.96             511,208.62     15,425.31       126.73

Correlation: 0.922, 0.715, 0.490

Public Library Use Determinants
by Diana Very, 4/1/2011

This hypothesis was based on an assumption that library users only used the libraries for new books and best sellers that are provided when the budget is available to buy them. The library materials budget was used as the independent variable, assuming that the circulation was dependent on the amount available in the budget.
Using the Pearson correlation coefficient r to determine the extent to which these variables are related produced r = -0.502. A negative coefficient of this size does not support the hypothesis that a larger materials budget produces higher circulation.

The circulation statistics are generated by the library visits, which suggests that marketing to more of the library service population would increase circulation and use of library materials more than spending more money on new materials would. When making a decision about a marketing budget or a materials budget, this study may prove to be helpful.
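For anyone who wants to see how a Pearson r is actually calculated, here is a minimal Python sketch. The materials budget figures behind the -0.502 result are not part of this presentation, so the sketch instead pairs Total Circulation with Library Visits and with Public Hours from the Public Library Use Determinants table. The function name `pearson_r` and the variable names are my own.

```python
import math

# Figures for 2010 down to 2004, copied from the table in this presentation.
circulation = [47_155_895, 47_811_748, 43_663_621, 40_816_175,
               40_735_627, 41_155_342, 40_269_048]
visits = [39_392_010, 40_852_165, 36_979_778, 35_703_912,
          31_952_301, 31_557_896, 31_285_987]
public_hours = [882_799, 907_316, 905_630, 896_848,
                887_400, 868_892, 871_401]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from the two means.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)


r_visits = pearson_r(circulation, visits)
r_hours = pearson_r(circulation, public_hours)
print(f"circulation vs. library visits: r = {r_visits:.3f}")
print(f"circulation vs. public hours:   r = {r_hours:.3f}")
```

The two results land close to the 0.922 and 0.490 values listed in the table's Correlation line. With only seven years of data, any r should be read as a hint about where to look rather than as proof.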

This research brings up two points about using statistical analysis: one is the use of Multivariate Analysis, and the other is Pearson Correlation.

Where to get statistics?

Statistics are everywhere. The statistics that you want to use will depend on what decision you want to make from them.

Some questions that come up for libraries:
Who are our customers?
Can we bring in more users?

Find out the population of the library system. Find the number of registered users. What is the percentage of the population without cards?

Public Library Statistical Survey - IMLS
http://harvester.census.gov/imls/publib.asp

Public Libraries in the United States: Fiscal Year 2008
Release Date: June 2010
Revised Date: January 2011

http://harvester.census.gov/imls/pubs/pls/index.asp

Academic Library Statistical Survey

Academic Libraries: 2008 First Look
http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2010348
National Center for Education Statistics
FY2008 edition provides stats on 3,827 academic libraries:
Circulations
Public Service Hours
Gate Count
Collection Numbers & Types
Staff

Public School Library/Media Center
http://nces.ed.gov/pubsearch/getpubcats.asp?sid=041#

Several reports are available at this site, but only to 2000.

Federal Libraries and Media Centers reports are also available, but not up to date.

Digest of Education Statistics, 2010
http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2011015
Contains up-to-date stats for education from kindergarten through graduate school.

Public Library Data Service Statistical Report

The survey for 2010 data is the 23rd edition of the annual survey. This report is created from a survey sent to 9,272 valid U.S. and Canadian libraries through web contacts. 1,105 responded to the questionnaire.

Census Data

Home page of data sets and instructions
http://www.census.gov/acs/www/

American Community Survey
Provides demographics such as population number, races, housing, education, etc., for states, counties, and municipalities
http://www.census.gov/acs/www/

Guidance for data users provides instruction on using the data and finding the correct data set
http://www.census.gov/acs/www/guidance_for_data_users/guidance_main/

Other Examples of Library Stats

2011 State of America's Libraries - ALA Releases Annual Report: http://ala.org/ala/newspresscenter/mediapresscenter/americaslibraries2011/index.cfm
Library Research Service - Colorado Library Stats: http://www.lrs.org/pub_stats.php
Current Look at Georgia Public Libraries FY 2010: http://www.georgialibraries.org/lib/publiclibinfo/

Where to find comparative statistics

This depends on what type of comparisons you want to make.
In Georgia, the library directors want to compare their system with others in Georgia.
In DeKalb County, Georgia, the library branch managers want to compare their branches to other libraries within the county system.
The Public Library Survey from IMLS provides data for states to compare their state data with other states.

Peer-to-peer comparisons: "I'm OK just so I'm better than you." Oh, my!

I don't think that's what we are doing. We want to grow, we want to be the best, to learn from each other and to provide services for library users. Our funders want to know that we are accomplishing our goals and that the services are worth their money. There is always a need in another department to use our funding. We need to justify and quantify our endeavors.

Compare this year with last year

Use a trend analysis (compares different years of same statistic) for staff motivation, accountability reports, marketing and promotional activities.

Stats to use:
Circulation
Visits
Program attendance
Genre circulation
Library cards

Try per capita calculations:
Library cards per capita
Program attendance per capita
Identify the % not participating.
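A trend analysis is just the percent change from one year to the next. The sketch below applies it to the Total Circulation figures from the statewide table in this presentation. The "% not participating" part uses a made-up service area (50,000 residents, 32,000 cardholders) purely for illustration, because registered-user counts are not given here. Function and variable names are my own.

```python
# Total circulation by year, from the statewide table in this presentation.
circulation = {
    2004: 40_269_048, 2005: 41_155_342, 2006: 40_735_627, 2007: 40_816_175,
    2008: 43_663_621, 2009: 47_811_748, 2010: 47_155_895,
}


def percent_change(series):
    """Year-over-year percent change, keyed by the later year."""
    years = sorted(series)
    return {
        year: (series[year] - series[prev]) / series[prev] * 100
        for prev, year in zip(years, years[1:])
    }


def percent_not_participating(population, cardholders):
    """Share of the service population that does not hold a library card."""
    return (population - cardholders) / population * 100


trend = percent_change(circulation)
for year, change in trend.items():
    print(f"{year}: {change:+.1f}% circulation vs. prior year")

# HYPOTHETICAL numbers, not taken from the presentation.
example_gap = percent_not_participating(population=50_000, cardholders=32_000)
print(f"Not participating: {example_gap:.1f}%")
```

Swap in your own circulation, visits, or program attendance counts to get the same year-to-year picture for your library.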

Finding out how you compare to other libraries helps when asking funding agencies for more money. For instance: "If we had as much money as the other library, we could accomplish more."

A table of usable statistics

Year  Population  Total Collection  Total Circulation  Library Visits  Reference Transaction  Public Hours  Total Staff
2010  10,069,700        18,379,134         47,155,895      39,392,010              9,513,049       882,799        2,995
2009   9,446,298        17,856,414         47,811,748      40,852,165              8,734,545       907,316        3,105
2008   9,319,532        17,646,302         43,663,621      36,979,778              7,994,164       905,630        3,109
2007   9,098,140        17,056,943         40,816,175      35,703,912              8,275,923       896,848        3,018
2006   8,789,529        16,496,624         40,735,627      31,952,301              8,547,509       887,400        3,038
2005   8,650,046        16,041,499         41,155,342      31,557,896              8,571,452       868,892        2,796
2004   8,510,563        16,040,938         40,269,048      31,285,987              8,076,037       871,401        2,821

How to make the decision: Step 1

1. What's the situation? 4% budget cut

How to make the decision: Step 2

2. Decision tree
Reduce staff: already skeleton staff
Furlough staff: not fair to staff
Reduce hours: possibility
Reduce library collection budget: cut last year to nearly nothing
Reduce outreach services: agreements already in place

How to make the decision: Step 3

3. Justify how to reduce hours
Check into patterns of library use
Check into staff efficiencies
Check into circulation and reference use
Check into website hits and WIFI traffic

Group Work

Name a statistic or set of statistics that will answer:
Patterns of library use
Staff efficiencies
Circulation and reference use
Website and WIFI traffic

If you want to know about patterns of library use, you would collect what type of data?

Reference contacts by hour

Patterns of library use
Visits by hour
Program attendance by class year
Circulation by material type
Interlibrary loan transactions
Reference contacts by hour
Contacts per public service desk
In-library use by type of material
In-library use by part of library
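Visits by hour is the statistic that speaks most directly to a "reduce hours" decision. The sketch below shows one way to rank opening hours by traffic. The door counts are entirely hypothetical, invented for illustration; replace them with your own gate counter or tally sheet numbers. Variable names are my own.

```python
# HYPOTHETICAL weekly door counts by opening hour (24-hour clock).
# These numbers are invented for illustration only.
visits_by_hour = {
    9: 40, 10: 85, 11: 120, 12: 150, 13: 140, 14: 130,
    15: 160, 16: 175, 17: 110, 18: 60, 19: 35, 20: 20,
}

total_visits = sum(visits_by_hour.values())

# Each hour's share of all visits, as a percentage.
share_by_hour = {
    hour: count / total_visits * 100 for hour, count in visits_by_hour.items()
}

# Busiest single hour and the two quietest hours.
busiest_hour = max(visits_by_hour, key=visits_by_hour.get)
quietest_hours = sorted(visits_by_hour, key=visits_by_hour.get)[:2]

print(f"Busiest hour starts at {busiest_hour}:00")
for hour in quietest_hours:
    print(f"{hour}:00 accounts for {share_by_hour[hour]:.1f}% of visits")
```

Hours that carry only a small share of visits are the first candidates to examine, alongside reference contacts by hour for the same time slots.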

If you want to know about staff efficiencies, you would collect what type of data?

Circulations per FTE

Efficiency of staff
Reference turnaround time
Interlibrary loan turnaround time
Work load per staff (such as patrons assisted, circulations, items shelved, materials processed, etc.)
MLSs as percentage of total staff
Circulations per FTE
Staff-to-use ratios per desk by hour
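As a rough illustration of circulations per FTE, the sketch below divides Total Circulation by Total Staff for the first and last years in the statewide table from this presentation. It assumes the Total Staff column can stand in for FTE, which the table does not state. Variable names are my own.

```python
# (total circulation, total staff) from the statewide table.
# ASSUMPTION: "Total Staff" is treated as FTE for this illustration.
yearly_figures = {
    2004: (40_269_048, 2_821),
    2010: (47_155_895, 2_995),
}

circulations_per_staff = {
    year: circulation / staff
    for year, (circulation, staff) in yearly_figures.items()
}

for year, workload in sorted(circulations_per_staff.items()):
    print(f"{year}: about {workload:,.0f} circulations per staff member")

# Percent change in workload per staff member between the two years.
workload_change = (
    (circulations_per_staff[2010] - circulations_per_staff[2004])
    / circulations_per_staff[2004] * 100
)
print(f"Change from 2004 to 2010: {workload_change:+.1f}%")
```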

If you want to know about circulation and reference use, you would collect what type of data?

Collection turnover rate (circulation/collection)
Hint hint

Circulation and reference use
Collection turnover rate (circulation/collection)
Circulation by material type
Interlibrary loan transactions
Reference contacts by hour
Contacts per public service desk
In-library use by type of material
In-library use by part of library
Circulation per capita
Number of circulations per visit
Reference transactions per capita
In-house use per capita
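Several of these measures are simple ratios of numbers you already report. The sketch below works them out for 2010 using the statewide table from this presentation; the dictionary keys and variable names are my own.

```python
# 2010 row of the statewide table in this presentation.
stats_2010 = {
    "population": 10_069_700,
    "total_collection": 18_379_134,
    "total_circulation": 47_155_895,
    "library_visits": 39_392_010,
    "reference_transactions": 9_513_049,
}

# Collection turnover rate: circulation divided by collection size.
collection_turnover = (
    stats_2010["total_circulation"] / stats_2010["total_collection"]
)

# Per capita measures divide by the service population.
circulation_per_capita = (
    stats_2010["total_circulation"] / stats_2010["population"]
)
reference_per_capita = (
    stats_2010["reference_transactions"] / stats_2010["population"]
)

# Circulations per visit.
circulations_per_visit = (
    stats_2010["total_circulation"] / stats_2010["library_visits"]
)

print(f"Collection turnover rate:          {collection_turnover:.2f}")
print(f"Circulation per capita:            {circulation_per_capita:.2f}")
print(f"Circulations per visit:            {circulations_per_visit:.2f}")
print(f"Reference transactions per capita: {reference_per_capita:.2f}")
```

Ratios like these make it possible to compare libraries of very different sizes on the same footing.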

If you want to know about website hits and WIFI traffic, you would collect what type of data?

Website hits and WIFI traffic
Number of on-line sessions
Number of files accessed
Number of electronic document transactions
Number of uses of locally mounted databases
Number of remote log-ins to library services
Availability of on-line catalogs
Number of on-line holds
Response times

Create a Logic Model

Logic Model Template

Project Title:
Grant Period:
Total Cost:
Project Description:

Resources: In order to accomplish our set of activities, we will need the following:
  Name of resources
  State, Federal or other funding source
Activities/Methods: In order to address our problem we will conduct the following activities:
  Name of activities
Outputs: We expect that these activities will produce the following evidence of service delivery:
  Number of items
Outcomes: We expect changes in attitudes, behaviors, knowledge, skills resulting from this project:
  Increased number
  Percentage increase
Impacts: Organizational, community or procedural level changes resulting from this project:
  Increased number
  Percentage increase

Other Results
Anecdotal Information
Exemplary Reason

Use this template to record anticipated benchmarks and actual results as they come about.

Example Using the Logic Model

Logic Model

Project Title: Ourtown Summer Library Program
Grant Period: 4/1/11 - 9/15/11
Total Cost: $1,000 per library system
Project Description: Ourtown public library and school library will work together to bring library activity day every Wednesday afternoon to the local mall's center staging area. Teenagers will be hired as mentors and activities will involve reading and writing stories about animals.

Resources: In order to accomplish our set of activities, we will need the following:
  Grant - LSTA Funding
  Staff
  Volunteers
Activities/Methods: In order to address our problem we will conduct the following activities:
  Publicity
  Craft activities
  Reading stories
  Computer training
Outputs: We expect that once completed or underway these activities will produce the following evidence of service delivery:
  Number of patrons served
  Number of computer classes
  Number of activities
  Number of days of activities
  Attendance
Outcomes: We expect changes in attitudes, behaviors, knowledge, skills resulting from this project:
  New patrons
  Patrons comfortable with computer use
  Library skills increased
  Library behavioral problems decreased because of project
Impacts: Organizational, community or procedural level changes resulting from this project:
  Increased attendance
  Increased family services
  Improved library services

Other Results
Anecdotal Information
Exemplary Reason

Pictures often say more than words

Tell the 2011 Clifford Presentation Story

1,518 participants
15 programs
@ a cost of 62 cents per participant
http://animoto.com/play/jSIexmvTn8wimFaCnajcbw
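The cost-per-participant figure is total cost divided by participants. The sketch below uses the three expenses from the notes for this slide ($60 Clifford costume rental, $661 travel, $224 car rental) and the 1,518 participants; the variable names are my own.

```python
# Expenses in dollars, as listed in the notes for this slide.
expenses = {
    "clifford_costume_rental": 60,
    "travel": 661,
    "car_rental": 224,
}
participants = 1_518
programs = 15

total_cost = sum(expenses.values())
cost_per_participant = total_cost / participants
cost_per_program = total_cost / programs

print(f"Total cost: ${total_cost:,}")
print(f"Cost per participant: ${cost_per_participant:.2f}")
print(f"Cost per program: ${cost_per_program:.2f}")
```

A single number like this is easy for funders to remember and repeat.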

($60 Clifford costume rental + $661 travel expenses + $224 car rental) / 1,518 participants
How many programs can you do for $0.62 per person?
http://animoto.com/play/jSIexmvTn8wimFaCnajcbw

Use Statistics to Make Informed Decisions

When is the busiest time at the library? Do I need more staff to keep up?
Do we really need a new building or can we rearrange the current facility?
Should we arrange the fiction by genre or alphabetically by author?
Why is our teen collection not being used? Old collection? Hidden in the middle of the picture books? Teens don't know about our books?
Would the community support the program if they knew the benefits?

Thank you!!

Contact information:
Diana Very
Georgia Public Library Service
1800 Century Place, Ste. 150
Atlanta, GA 30345

[email protected]

References

Standard Deviation, from Wikipedia, retrieved on 5/12/2011 from http://en.wikipedia.org/wiki/Standard_deviation
PLA - Public Library Data Service Statistical Report. 2010. Presented by the PLA/ALA, ordering information found at http://pla.org/ala/mgrps/divs/pla/plapublications/pldsstatreport/index.cfm
IMLS - Public Libraries in the United States: Fiscal Year 2008. Only available on-line at http://harvester.census.gov/imls/pubs/pls/pub_detail.asp?id=130

References, cont.

Smith, Mark. 1996. Collecting and using public library statistics: A how-to-do-it manual for librarians, Number 56. Neal-Schuman Publishers, Inc.
2011 State of America's Libraries - ALA Releases Annual Report: http://ala.org/ala/newspresscenter/mediapresscenter/americaslibraries2011/index.cfm
Library Research Service - Colorado Statistics: http://www.lrs.org/pub_stats.php
Multivariate Analysis Concepts, retrieved from http://support.sas.com/publishing/pubcat/chaps/56903.pdf

References, cont. 2

Multivariate analysis, retrieved from Wikipedia, http://en.wikipedia.org/wiki/Multivariate_analysis
Very, Diana. 2011. Public Library Use Determinants, p 13, e-mail for copy [email protected].
