
Maximizing Library Investments in Digital Collections Through Better Data Gathering and Analysis (MaxData)

Carol Tenopir ([email protected]) and Donald W. King ([email protected])

Description of MaxData

• Funded by IMLS

• University of Tennessee, OhioLink and Ohio Libraries, University College London CIBER (Centre for Information Behaviour and the Evaluation of Research)

• Comparing and building a cost model for data collection methods (transaction logs and surveys)

Study Objectives

• To compare what different methods of data collection can tell you

• To develop a model that compares costs and effort to the library of collecting and analyzing data from various methods with the benefits of the information you obtain

Study Methods

• CIBER deep log analysis
  – OhioLink
  – 5 Ohio Universities
  – Elsewhere

• User Surveys
  – 5 Ohio Universities
  – University of Tennessee
  – Elsewhere

• Other log data from vendors (COUNTER-compliant, proxy servers, link resolvers, etc.)

• Costs and effort of methods

Why do libraries gather usage data?

1. To make decisions and rethink old ones

2. To demonstrate the value of the library’s collections and services

3. To improve services and collections

4. To find their place in comparative rankings

5. Because they are required to

Internally collected or added data can be used to show:

• Comparative amounts of use for databases

• Relative uses per size of user population within subject areas

• Cost per use

• Where users access the digital library
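Two of the measures above, cost per use and uses relative to the size of the user population, are simple ratios. The following is a minimal sketch of how they might be computed from locally held data. The `DatabaseUsage` record, its field names, and the sample figures are hypothetical and are not taken from the MaxData study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatabaseUsage:
    """Hypothetical record combining cost data with logged use."""
    name: str
    annual_cost: float     # subscription cost for the period
    uses: int              # e.g. full-text downloads counted in logs
    user_population: int   # potential users in the subject area


def cost_per_use(db: DatabaseUsage) -> Optional[float]:
    """Annual cost divided by logged uses; None when nothing was used."""
    if db.uses == 0:
        return None
    return db.annual_cost / db.uses


def uses_per_capita(db: DatabaseUsage) -> float:
    """Logged uses divided by the size of the user population."""
    return db.uses / db.user_population


# Illustrative figures only.
example = DatabaseUsage("Example DB", annual_cost=12000.0, uses=4000,
                        user_population=800)
print(cost_per_use(example), uses_per_capita(example))
```

Because requests or downloads may not equal use or satisfaction, ratios like these are best read as indicators rather than as measures of value.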

CIBER Highlights

• 50% of journals account for 93% of use, but 99% of titles were used at least once in a 7-month test period

• Health sciences titles are used more than any others

• OhioLink users choose to download a full article 3 times more often than view just an abstract

• Half of all sessions viewed an article; average was 2 articles per session

• Users who searched (rather than using the alpha or subject lists) tended to view more articles and viewed more older articles
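The first highlight combines two statistics: the share of total use that falls on the most-used half of titles, and the share of titles used at least once. Here is a minimal sketch of both calculations, assuming per-title use counts have already been extracted from the logs. The function names and sample counts are hypothetical, and this is not CIBER's own method.

```python
import math


def use_concentration(uses_by_title, top_fraction=0.5):
    """Share of total use accounted for by the most-used titles."""
    counts = sorted(uses_by_title.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    top_n = math.ceil(len(counts) * top_fraction)
    return sum(counts[:top_n]) / total


def share_of_titles_used(uses_by_title):
    """Fraction of titles with at least one recorded use."""
    if not uses_by_title:
        return 0.0
    used = sum(1 for count in uses_by_title.values() if count > 0)
    return used / len(uses_by_title)


# Illustrative counts only, not OhioLink data.
sample = {"J1": 60, "J2": 25, "J3": 10, "J4": 5, "J5": 0, "J6": 0}
print(use_concentration(sample), share_of_titles_used(sample))
```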

Usage logs give much useful data, but…

• Logs don’t show why or outcomes

• Requests or downloads may not equal use or satisfaction

• Log sessions may be difficult to differentiate or compare across systems

• For privacy or other reasons, logs do not show behavior by demographic groups

• Logs show only a fraction of total use

Our surveys:

• Have been used since the 1970s

• Include over 30,000 responses

• Provide trends since 1977

• University surveys include:
  – 2 national surveys of scientists (1977, 1984)
  – Astronomers and pediatricians who belong to their main professional societies (2003, 2004)
  – 3 University of Tennessee surveys (1993, 2001, 2003)
  – Drexel University (2002)
  – University of Pittsburgh (2003)
  – 2 Australian universities (2004-2005)

Our Surveys are Designed to:

• Provide a complete picture of information seeking and reading patterns and how libraries contribute to overall information needs

• Distinguish:
  – Sources of articles read
  – How articles are identified/found
  – Time and depth of reading
  – Age of articles read
  – Format of articles read
  – Outcomes from reading
  – Value of reading from library and elsewhere

Our Surveys Establish Factors that Affect Reader Choices:

• Ease of use

• Time required to use

• Awareness of alternative sources

• Attributes of alternative sources

• Purposes of use

“Last Reading” is a variation of the Critical Incident Technique that:

• Permits observation of any combination of:
  – Sources of articles read
  – Means of identification
  – Time spent reading
  – Age of articles read
  – Format of articles read
  – Outcomes and value of reading

• Provides comparison:
  – Over time
  – Among disciplines
  – By age, sex/gender
  – Other types of users

Examples of Observations Over Time (1977 to 2004)

• Medical faculty read the most articles (3 times more than humanities faculty or engineers)

• Personal subscriptions and readings from them continue to go down

• Total amount of reading continues to go up

• Readings from libraries continue to go up, are more valuable to the reader's purpose, and are more often for research

• Both print and electronic sources are used

Average Reading per Faculty Member

[Figure: bar chart. Axis labels: Number of Readings; Years of Observation]

• U of Tennessee: 186

• U of Pittsburgh: 215

• Drexel: 197

• All Scientists: 216

• All Non-Scientists: 175

• All US: 206

• UNSW: 219

Average Articles Read per Year per University Scientist

[Figure: bar chart. x-axis: Year of Studies; y-axis: Average number of articles read per scientist]

• 1977: 150

• 1984: 172

• 1993: 188

• 2000-03: 216

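Source of Additional Readings

[Figure: bar chart by year of study and source]

• 1977: Library Collections 37, Other Sources 113

• 1984: Library Collections 52, Other Sources 120

• 1993: Library Collections 92, Other Sources 96

• 2000-03: Library Collections 101, Other Sources 115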

Academic Library Collections: Source for Increased Readings

• Total readings per scientist increased by 66 (150 in 1977 to 216 in 2000-03); 64 of these came from library collections (37 to 101)

• When identified from searches, citations, etc., articles must be located and obtained

• Libraries are the logical choice for faculty and students
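The 66 and 64 figures follow from subtracting the 1977 chart values from the 2000-03 values. The sketch below spells out that arithmetic. The pairing of each chart value with "Library Collections" or "Other Sources" is an assumption, although the pairs do sum to the yearly totals of 150 and 216; the variable names are illustrative.

```python
# Readings per scientist per year as (library collections, other sources).
# Assumption: values are paired with sources in the order they appear
# in the chart.
readings = {
    "1977": (37, 113),
    "2000-03": (101, 115),
}

library_start, other_start = readings["1977"]
library_end, other_end = readings["2000-03"]

# Change in total readings, then the change from each source.
total_change = (library_end + other_end) - (library_start + other_start)
library_change = library_end - library_start
other_change = other_end - other_start

# Fraction of the overall increase that came from library collections.
library_share_of_increase = library_change / total_change

print(total_change, library_change, other_change)
print(round(library_share_of_increase, 2))
```

On these numbers, nearly all of the growth in reading came from library collections, while reading from other sources stayed almost flat.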

Factors Leading to this Phenomenon:

• Number of personal subscriptions decreased (on average from nearly 6 to under 2; university scientists in U.S. from 4.2 to 3.5)

• Number of articles identified by searching increased (3 to about 50 articles per scientist)

• Breadth of journal reading increased, due in part to e-journal collections (on average, from 13 to 23 journals from which at least one article is read annually)

Critical (Last) Incident Method Can Show Usefulness and Value of Academic Library Collections

• Saves faculty time (15 min/reading)

• Library reading rated higher in importance (5.5 vs. 4.7 on a 1-7 scale)

• Readers take more time reading library articles (39 vs. 33 minutes)

• Achievers read more and use library collections more than non-achievers

• Articles from libraries yield more favorable outcomes

• Articles from libraries help achieve greater productivity

Older articles are judged more valuable & are more likely to come from libraries

[Figure: three pie charts of article source (Library, Personal, Separate) by age of article. Slice values: 1st Year 33.5%, 10.3%, 56.3%; 2-5 Years 28.8%, 18.1%, 53.2%; Over 5 Years 9.2%, 17.5%, 73.3%]

What is Expected From You:

• Obtain any necessary Human Subjects Permission (or waivers) from your institution

• Review the questionnaire and make suggestions to fit your specific situation

• Decide whether your survey should be web or paper

• Send an email or cover letter to your faculty and students describing the survey

• Post a link to the survey on your website if you wish

• Publicize to help response rates

• Help us identify your IP addresses (broadly) for usage logs if you can

What Is Expected From Us

• Obtain Human Subjects permission at UT (done)

• Design and test the questionnaire

• Receive responses at the UT secure server

• Analyze results

• Present survey results to each library

• Compare survey results with deep log analysis of OhioLink logs (with your name removed) in IMLS reports

• Show how the various user methods can be used together
