research library statistics 1907-08 through 1987-88 · 2012. 2. 24. · research library...

RESEARCH LIBRARY STATISTICS 1907-08 THROUGH 1987-88 A Guide to the Machine-Readable Version of the Gerould and ARL Statistics Kendon L. Stubbs and Robert E. Molyneux Association of Research Libraries 1990


RESEARCH LIBRARY

STATISTICS

1907-08 THROUGH 1987-88

A Guide to the Machine-Readable Version of

the Gerould and ARL Statistics

Kendon L. Stubbs

and Robert E. Molyneux

Association of Research Libraries 1990


Research Library Statistics, 1907-08 through 1987-88 includes 3 MS-DOS 5.25" 360K diskettes, 2 MS-DOS 3.5" 720K diskettes, and 1 Apple Macintosh 3.5" 800K diskette. Information on the data files, on programs for reading the files, and on other ways of using the data is included in this guide.

Research Library Statistics is published by the Association of Research Libraries, 1527 New Hampshire Avenue, NW, Washington, DC 20036 (202) 232-2466 FAX: (202) 462-7849.

Apple and Macintosh are registered trademarks of Apple Computer, Inc.

dBASE and dBASE III + are registered trademarks and dBASE IV is a trademark of Ashton-Tate Company.

Lotus and 1-2-3 are registered trademarks of Lotus Development Corporation.

Microsoft and MS-DOS are registered trademarks of Microsoft Corporation.

SAS is a registered trademark of SAS Institute, Inc.

© 1990 Association of Research Libraries

The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences--Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.


CONTENTS

I. Research Library Statistics, 1907-08--1987-88
   1. Introduction
   2. The Gerould Statistics
   3. The ARL Statistics
   4. Acknowledgements

II. The Diskettes
   1. Files Included on the Diskettes
      a. DOS Version
      b. Apple Macintosh Version
   2. Formats of the Files
      a. DOS Version
      b. Apple Macintosh Version

III. Contents of the Files
   1. Libraries
   2. Variables

IV. Missing Data, Errata, and Emendations
   1. Missing Data
   2. Errata
   3. Emendations

V. Using the Files
   1. DOS Version
   2. Apple Macintosh Version
   3. ACRL Statistics

VI. Appendix: Emendations to the Printed Data

VII. Notes

TABLES

Table 1: DOS Diskettes
Table 2: Apple Macintosh Diskette
Table 3: The ARL 1907-08--1961-62 ASCII Format
Table 4: The ARL 1962-63--1987-88 ASCII Format
Table 5: Libraries in the ARL Data Files, 1907-08--1987-88
Table 6: Years Each Library is Represented in the ARL Data Files, 1907-08--1987-88
Table 7: Number of Libraries per Year in the ARL Data Files, 1907-08--1987-88


I. Research Library Statistics, 1907-08 through 1987-88

1. Introduction

The data files on the diskettes enclosed with this guide contain statistics for all current and past members of the Association of Research Libraries. ARL is heir to a statistical series begun by James T. Gerould at the University of Minnesota in 1907-08 and continued until the early 1960's, when ARL began its own annual statistics publication. Except for the fiscal years 1908-09 and 1910-11, data are available for every year since 1907-08. The Gerould-ARL series is one of the oldest continuing series of annual library statistics in the world.

The Gerould-ARL series included on the diskettes presents data for the 107 U.S. and Canadian university libraries that are currently members of ARL; for three university libraries that are former members or predecessors of current members; for 12 current non-university library members; and for a former non-university member. Altogether, 123 of the largest libraries in the U.S. and Canada are represented in these data.

The following sections discuss the Gerould and ARL series in more detail. For simplicity's sake hereafter, references throughout this guide to single years will indicate a fiscal year. Thus, "1908" will mean fiscal year 1907-08.

2. The Gerould Statistics

While at Minnesota, James Gerould first collected library data for fiscal year 1908. When he moved to Princeton in 1920, he continued the annual data collection up to his retirement in 1938. Other Princeton staff members kept up the annual collection through fiscal year 1962. Each year the data collected were typed and distributed to reporting libraries. They were not otherwise published in their entirety until 1986, when ARL issued Robert Molyneux's compilation and analysis, The Gerould Statistics.1 Users of this guide and of the machine-readable statistics are urged to consult The Gerould Statistics for full information on this data series.

Gerould began his series with a group of 12 mainly midwestern university libraries (with several Pacific region libraries). By the 1920's the file was expanded to include Ivy League schools, several South Atlantic universities, and others like Colorado. In addition to university libraries, the compilation included colleges such as Bryn Mawr, Oberlin, and Smith. Throughout the 55-year history of the Gerould statistics, 60 universities and colleges reported data in at least some of the years. Of these, 51 became university members of ARL and are represented in the machine-readable data.

The relations between the Gerould libraries and the ARL membership are of interest. ARL was founded in 1932 with 35 university library members (and 4 non-universities). The inaugural meeting of ARL on December 29, 1932, was called to order by Gerould. The Gerould statistics for 1932 included 35 university libraries, of which 31 were founding members of ARL. It may not be fanciful to think of ARL as an association to which a statistical compilation gave birth.

Throughout most of its 55-year history the Gerould series included only half a dozen categories of data--on volumes held and added, total staff, budget and expenditures for materials and binding, and salaries. It was only in the 1950's that two new categories were added for total expenditures and (curiously) student assistant wages.

The present machine-readable version of the Gerould statistics reproduces the data for 1908-1962 for the 51 university libraries that became members of ARL.

3. The ARL Statistics

ARL began collecting its own statistics of its university members in 1961-62. There is thus an overlap of one year with the "Gerould" statistics compiled at Princeton. By 1962-63 the ARL statistics had firmly supplanted the Gerould/Princeton series. Molyneux points out (p. 8) that ARL followed the same definitions as Gerould/Princeton, so that the two series are compatible with one another.


The ARL statistics, under various titles, have been published every year since 1962 in increasingly elaborate formats (in 1988, a 99-page booklet).2 A cumulation of the statistics for 1963-1979 was published in 1981.3

As in the case of the Gerould series, the numbers of libraries represented in the ARL statistics have increased greatly since the beginning, as the ARL membership has grown. In 1963, 63 university members are represented. By 1988 this number had grown to 107. In 1975 ARL began to report the data of its non-university members, and has continued including the non-universities in the annual statistics.

Unlike the Gerould data, which maintained a fairly consistent set of data categories throughout its history, the categories in the ARL series have more than doubled since 1963. As recently as 1986 nine new categories were added to the data.


The present machine-readable version of the ARL data includes the data from 1963 through 1988 for both university and non-university members. We begin with 1963 rather than the first ARL issue for 1962 because the 1962 ARL data are flawed by omissions and inconsistencies that present considerable problems for data analysis, and also the cleaner Gerould data are available for 1962. As described in Section IV and the Appendix below, the machine-readable version of the ARL data includes errata and emendations to the printed statistics.

4. Acknowledgements

A debt of thanks is owed to Nicola Daval of ARL, who offered advice and support, and oversaw inputting of the non-university data. This guide and machine-readable version of the 1908-1988 data would not have been possible without the assistance of the following members of the University of Virginia Library staff: Richard B. Martin, who prepared the Apple Macintosh version of the data files and the documentation for the files; Cristina Sharretts, who helped with dBASE and other data problems; Scott Crittenden, who devised the dBASE programs on the DOS diskettes; and Marie Carter, who typed this guide. We are grateful to William Bowen, President of the Andrew W. Mellon Foundation, for encouraging ARL to produce a machine-readable version of the 1908-1988 research library data.


II. The Diskettes

1. Files Included on the Diskettes

The 1908-1988 research library data are made available with this publication in three floppy disk formats:

3 MS-DOS 5.25" 360K disks
2 MS-DOS 3.5" 720K disks
1 Apple Macintosh 3.5" 800K disk

Each set of disks contains four data files corresponding to the printed statistics as follows:

1908-1962 Gerould statistics, university libraries: ARL0862
1963-1978 ARL annual statistics, university libraries: ARL6378
1979-1988 ARL annual statistics, university libraries: ARL7988
1975-1988 ARL annual statistics, non-university libraries: NONU7588

On the DOS disks these four files have the extension ASC (e.g., ARL0862.ASC). On the Macintosh disk the four files are stored in an archive file called ARL text.sit, where each file has the extension text (e.g., ARL0862 text). In addition to the four data files, each set of disks includes some programs for reading the data files, as described below.

a. DOS Version

The distribution of programs and files on the MS-DOS disks is displayed in Table 1.

The data files are in standard ASCII format without field delimiters and with a carriage return at the end of each record and a right-arrow and carriage return at the end of each file. The program files comprise a set of programs for copying the files as ASCII files to a hard disk or for loading the data into dBASE and saving the files as DBF (dBASE) files. For information on these programs and on other ways of using the data, see Section V below.
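As a minimal sketch of how such a file might be read with current tools, the Python function below splits the raw bytes of a DOS data file into records. It assumes that the "right-arrow" is the DOS end-of-file character (byte 0x1A, which the IBM PC character set displays as a right-arrow) and that a record may end in either a carriage return alone or a carriage return plus line feed. The function name and the sample bytes are illustrative only and are not part of the distributed programs.

```python
def split_dos_records(raw: bytes) -> list[str]:
    """Split the bytes of a DOS ASCII data file into record strings."""
    # Assumption: byte 0x1A (Ctrl-Z) marks end of file; drop it and anything after it.
    eof = raw.find(b"\x1a")
    if eof != -1:
        raw = raw[:eof]
    # Decode as ASCII, replacing any unexpected byte rather than failing.
    text = raw.decode("ascii", errors="replace")
    # splitlines() accepts both "\r" and "\r\n" record terminators
    # and keeps the padding spaces inside each record.
    return [line for line in text.splitlines() if line]


# Hypothetical two-record sample, not real data.
sample = b"08 100AAA\r\n08 200BBB\r\n\x1a\r\n"
records = split_dos_records(sample)
```

Because the records carry no field delimiters, each string returned here still has to be cut into fields by position, using the widths given in Tables 3 and 4.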

b. Apple Macintosh Version

The programs and files on the 3.5" 800K Apple Macintosh disk are displayed in Table 2.


Table 1

DOS Diskettes

Disk  Filename       Type     No. of Records    Bytes

5.25" 360K Disks

1     SETUP.BAT      Program                    2,327
1     IMPORTAS.BAT   Program                    3,402
1     IMPORTDB.BAT   Program                      800
1     IMPORTDB.PRG   Program                    3,706
1     ARL0862.DBF    Program                      514
1     ARL6378.DBF    Program                    1,346
1     ARL7988.DBF    Program                    1,346
1     NONU7588.DBF   Program                    1,346
1     ARL0862.ASC    Data              1,817  214,407
2     ARL6378.ASC    Data              1,236  343,609
3     ARL7988.ASC    Data              1,033  287,175
3     NONU7588.ASC   Data                165   45,871
3     ERASER.BAT     Program                      250

3.5" 720K Disks

1     SETUP.BAT      Program                    2,327
1     IMPORTAS.BAT   Program                    3,402
1     IMPORTDB.BAT   Program                      800
1     IMPORTDB.PRG   Program                    3,706
1     ARL0862.DBF    Program                      514
1     ARL6378.DBF    Program                    1,346
1     ARL7988.DBF    Program                    1,346
1     NONU7588.DBF   Program                    1,346
1     ARL0862.ASC    Data              1,817  214,407
1     ARL6378.ASC    Data              1,236  343,609
2     ARL7988.ASC    Data              1,033  287,175
2     NONU7588.ASC   Data                165   45,871
2     ERASER.BAT     Program                      250


Table 2

Apple Macintosh Diskette

Filename        Type     No. of Records    Bytes

READ ME FIRST!  Program                    2,196
ARL text.sit    Data              4,251  297,973
UnStuffIt 1.5   Program                   42,429
ARL0862         Program                   26,284
ARL 63-         Program                   27,861

The data files are in text format, stored in a StuffIt archive file. The freeware program UnStuffIt is also included on the disk. The Macintosh data files were created by using the Apple File Transfer program in its text mode to translate the DOS ASCII data to the Macintosh file system. Because the DOS ASCII files have no field delimiters, each of the four files was called up in Microsoft Word 4.0, and a macro from AutoMacIII was used to add tab field delimiters. Each record ends with a standard carriage return, and the end-of-file mark is two carriage returns.

For further information on using the Macintosh disk, see Section V below.

2. Formats of the Files

a. DOS Version

Each record in the data files contains one observation or case--the data for one institution for one year.

Two record formats are used in the four data files. In the 1908-1962 data (ARL0862.ASC) there are 15 variables, and so 15 fields per record. In the 1963-1988 data (ARL6378.ASC, ARL7988.ASC, and NONU7588.ASC) there are 41 variables, and 41 fields per record. The 1908-1962 record has a width of 116 spaces; the 1963-88 record a width of 276.

Tables 3 and 4 display the formats of the 1908-1962 file and the 1963-88 files. The column entitled "First Year in File" indicates the first year in which libraries reported data for a given variable. Before that year all values of the variable are reported as missing values. For example, the variable APPROP was not reported in 1908. Thus, in 1908 each library displays a missing value for APPROP.


As noted above, the DOS record formats do not have field delimiters.

b. Apple Macintosh Version

The Macintosh data files follow the formats described above and shown in Tables 3 and 4, with two modifications: (1) The first record in each of the four files is the header information with the field names (variable names) in upper case. (2) The fields are delimited by tabs.
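The two modifications make the Macintosh files straightforward to read with a general-purpose delimited-text parser. The Python sketch below is one way to do so; it assumes the file has already been extracted from the archive and read into a string, and the function name and sample text are hypothetical.

```python
import csv
import io


def read_mac_text(text: str) -> list[dict[str, str]]:
    """Parse a tab-delimited data file whose first record holds the field names."""
    # Classic Macintosh text files end each line with a bare carriage return;
    # normalise every line ending to "\n" before parsing.
    normalised = text.replace("\r\n", "\n").replace("\r", "\n")
    reader = csv.DictReader(
        io.StringIO(normalised),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters in names as plain text
    )
    # DictReader skips fully empty rows, so the blank line produced by the
    # two carriage returns at the end of the file is ignored.
    return list(reader)


# Hypothetical sample with three of the field names from Table 4, not real data.
sample = "YEAR\tINSTNO\tINAM\r63\t0500\tBOSTON\r\r"
rows = read_mac_text(sample)
```

Each row comes back as a dictionary keyed by the upper-case field names in the header record, with all values still held as strings.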


Table 3

The ARL 1907-08--1961-62 ASCII Format (File ARL0862.ASC)

Variable  Width of  Special    First Year
Name      Field     Format     in File

YEAR             2             1908
INSTNO           4             1908
INAM            27  Character  1908
TYPE             1  Character  1908
REGION           2             1908
MEMBYR           2             1908
EXCH             6  6.4        [1908]
VOLS             8             1908
VOLSADG          8             1908
TOTSTF           6  6.2        1908
EXPLMB          10  10.2       1908
FTESAL          10  10.2       1908
SALSTUD         10             1950
STOTEXP         10             1955
APPROP          10  10.2       1910-1954
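Since the DOS records have no delimiters, the widths in Table 3 are what locate each field. The following Python sketch shows one way to slice a 1908-1962 record by cumulative width. The field names are as read from Table 3, the helper name is hypothetical, and the sample record is synthetic (each field simply holds its own position number). Values are returned as strings; how the "Special Format" entries such as 6.4 should be converted to numbers is not assumed here.

```python
from itertools import accumulate

# (name, width) pairs in Table 3 order; names are as read from the table.
GEROULD_LAYOUT = [
    ("YEAR", 2), ("INSTNO", 4), ("INAM", 27), ("TYPE", 1), ("REGION", 2),
    ("MEMBYR", 2), ("EXCH", 6), ("VOLS", 8), ("VOLSADG", 8), ("TOTSTF", 6),
    ("EXPLMB", 10), ("FTESAL", 10), ("SALSTUD", 10), ("STOTEXP", 10),
    ("APPROP", 10),
]


def parse_fixed(record: str, layout: list[tuple[str, int]]) -> dict[str, str]:
    """Slice one undelimited record into named fields using cumulative widths."""
    ends = list(accumulate(width for _, width in layout))
    if len(record) != ends[-1]:
        raise ValueError(f"expected {ends[-1]} characters, got {len(record)}")
    starts = [0] + ends[:-1]
    # Values stay as stripped strings; numeric conversion is left to the caller.
    return {
        name: record[start:end].strip()
        for (name, _), start, end in zip(layout, starts, ends)
    }


# Synthetic record: each field holds its own position number, right-justified.
sample = "".join(
    str(i).rjust(width) for i, (_, width) in enumerate(GEROULD_LAYOUT, 1)
)
fields = parse_fixed(sample, GEROULD_LAYOUT)
```

The widths add up to the 116-character record length stated in the text, which the length check in `parse_fixed` relies on.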


Table 4

The ARL 1962-63--1987-88 ASCII Format (Files ARL6378.ASC, ARL7988.ASC, NONU7588.ASC)

Variable  Width of  Special    First Year
Name      Field     Format     in File

YEAR             2             1963
INSTNO           4             1963
INAM            27  Character  1963
TYPE             1  Character  1963
REGION           2             1963
MEMBYR           2             1963
LAW              1  Character  1976
MED              1  Character  1976
EXCH             6  6.4        1963
VOLS             8             1963
VOLSADG          8             1963
VOLSADN          8             1963
MONO             6             1986
SERPUR           6             1986
SERNPUR          6             1986
CURRSER          8             1972
MICROF           8             1968
ILLTOT           6             1974
ILBTOT           6             1974
PRFSTF           7  7.2        1963
NPRFSTF          7  7.2        1963
STUDAST          7             1963
TOTSTF           7  7.2        1963
TOTSTFX          7  7.2        1968
EXPMONO          8             1986
EXPSER           8             1976
EXPOTH           8             1986
EXPMISC          8             1986
EXPLM            8             1963
EXPBND           8             1963
SALPRF           8             1986
SALNPRF          8             1986
SALSTUD          8             1986
TOTSAL           9             1963
OPEXP            9             1963
TOTEXP           9             1963
TOTSTU           6             1967*
GRADSTU          5             1968*
PHDAWD           5             1963*
PHDFLD           5             1972
FAC              5             1986

*Canadian reports: TOTSTU, 1976; GRADSTU, 1976; PHDAWD, 1972.
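The record widths and the byte counts in Table 1 can be checked against one another. The Python sketch below adds up the Table 4 field widths and then tests a simple size model: number of records times (record width plus terminator) plus one end-of-file byte. The two-byte terminator (carriage return plus line feed) and the single end-of-file byte are assumptions inferred from the arithmetic, not statements made in this guide; the helper name is hypothetical.

```python
# Field widths in Table 4 order (YEAR through FAC).
ARL_WIDTHS = [
    2, 4, 27, 1, 2, 2, 1, 1, 6, 8, 8, 8, 6, 6, 6, 8, 8, 6, 6,
    7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 6, 5, 5, 5, 5,
]

# file name -> (records, record width, bytes), taken from Table 1.
FILES = {
    "ARL0862.ASC": (1817, 116, 214407),
    "ARL6378.ASC": (1236, 276, 343609),
    "ARL7988.ASC": (1033, 276, 287175),
    "NONU7588.ASC": (165, 276, 45871),
}


def expected_bytes(records: int, width: int, terminator: int = 2, eof: int = 1) -> int:
    """File size under the assumed model: fixed-width records plus terminators plus EOF."""
    return records * (width + terminator) + eof


# True for each file whose listed size matches the assumed model.
checks = {
    name: expected_bytes(records, width) == size
    for name, (records, width, size) in FILES.items()
}
```

Under these assumptions the model reproduces all four byte counts, which suggests that a reader can rely on every record being exactly 116 or 276 characters long.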


III. Contents of the Files

Each of the files of the 1908-1988 statistics -- ARL0862.ASC, ARL6378.ASC, ARL7988.ASC, and NONU7588.ASC, and the Apple Macintosh versions of these files -- is arranged by years and within each year by library numbers (or equivalently, by library names). Each record, or observation, contains the data for one library for one year. The total numbers of observations (and records) are

ARL0862.ASC: 1,817
ARL6378.ASC: 1,236
ARL7988.ASC: 1,033
NONU7588.ASC: 165

The four files thus contain 4,251 observations.

1. Libraries

The files include data for 123 libraries--110 university libraries and 13 non-university libraries. The numbers of libraries represented in each file are

ARL0862.ASC: 51 university libraries
ARL6378.ASC: 95 university libraries
ARL7988.ASC: 108 university libraries
NONU7588.ASC: 13 non-university libraries

107 of the university libraries and 12 of the non-universities are currently members of ARL. Of the remaining three university libraries and one non-university, St. Louis was a member from 1963 through 1973; Joint University was a member from 1946 through 1979 but was succeeded by Vanderbilt in 1980; Western Reserve was not a member but its successor, Case Western Reserve, became a member in 1969; and John Crerar Library was a member from 1932 through 1983, after which it was absorbed by Chicago.

Tables 5-7 present information about the libraries in the data files. Table 5 lists the 123 libraries by number and name, together with information on the type of library, region of the U.S. or Canada in which it is located, and the year in which it joined ARL. (For further information on these variables, see the section below.) Table 6 indicates, for each of the 123 libraries, the years in which it is represented in the data. Table 7 displays the number of libraries (equal to the number of observations or records) covered in each year of the data.


Table 5

Libraries in the ARL Data Files, 1907-08--1987-88

Number  Library                    Type  Region  ARL Membership Year

University Libraries

   100  ALABAMA                    S          6  1967
   200  ALBERTA                    C         10  1969
   300  ARIZONA                    S          8  1967
   400  ARIZONA STATE              S          8  1973
   500  BOSTON                     P          1  1962
   600  BRIGHAM YOUNG              P          8  1974
   700  BRITISH COLUMBIA           C         10  1967
   800  BROWN                      P          1  1932
   900  CALIFORNIA, BERKELEY       S          9  1932
  1000  CALIFORNIA, DAVIS          S          9  1969
  1050  CALIFORNIA, IRVINE         S          9  1981
  1100  CALIFORNIA, LOS ANGELES    S          9  1937
  1200  CALIFORNIA, RIVERSIDE      S          9  1979
  1300  CALIFORNIA, SAN DIEGO      S          9  1973
  1400  CALIFORNIA, SANTA BARBARA  S          9  1973
  1500  CASE WESTERN RESERVE       P          3  1969
  1600  CHICAGO                    P          3  1932
  1700  CINCINNATI                 S          3  1932
  1800  COLORADO                   S          8  1964
  1900  COLORADO STATE             S          8  1975
  2000  COLUMBIA                   P          2  1932
  2100  CONNECTICUT                S          1  1962
  2200  CORNELL                    P          2  1932
  2300  DARTMOUTH                  P          1  1932
  2350  DELAWARE                   S          5  1983
  2400  DUKE                       P          5  1932
  2500  EMORY                      P          5  1975
  2600  FLORIDA                    S          5  1956
  2700  FLORIDA STATE              S          5  1962
  2800  GEORGETOWN                 P          5  1962
  2900  GEORGIA                    S          5  1967
  2950  GEORGIA TECH               S          5  1983
  3000  GUELPH                     C         10  1979
  3100  HARVARD                    P          1  1932
  3200  HAWAII                     S          9  1976
  3300  HOUSTON                    S          7  1975
  3400  HOWARD                     P          5  1971
  3490  ILLINOIS, CHICAGO          S          3  1988
  3500  ILLINOIS, URBANA           S          3  1932
  3600  INDIANA                    S          3  1932
  3700  IOWA                       S          4  1932
  3800  IOWA STATE                 S          4  1932
  3900  JOHNS HOPKINS              P          5  1932
  4000  JOINT UNIVERSITY           P          6  1946
  4100  KANSAS                     S          4  1932
  4200  KENT STATE                 S          3  1974
  4300  KENTUCKY                   S          6  1952
  4350  LAVAL                      C         10  1985
  4400  LOUISIANA STATE            S          7  1938
  4500  MCGILL                     C         10  1932
  4600  MCMASTER                   C         10  1976
  4650  MANITOBA                   C         10  1981
  4700  MARYLAND                   S          5  1962
  4800  MASSACHUSETTS              S          1  1968
  4900  MIT                        P          1  1932
  5000  MIAMI                      P          5  1976
  5100  MICHIGAN                   S          3  1932
  5200  MICHIGAN STATE             S          3  1956
  5300  MINNESOTA                  S          4  1932
  5400  MISSOURI                   S          4  1932
  5500  NEBRASKA                   S          4  1932
  5600  NEW MEXICO                 S          8  1979
  5700  NEW YORK                   P          2  1936
  5800  NORTH CAROLINA             S          5  1932
  5850  NORTH CAROLINA STATE       S          5  1983
  5900  NORTHWESTERN               P          3  1932
  6000  NOTRE DAME                 P          3  1962
  6100  OHIO STATE                 S          3  1932
  6200  OKLAHOMA                   S          7  1962
  6300  OKLAHOMA STATE             S          7  1962
  6400  OREGON                     S          9  1962
  6500  PENNSYLVANIA               P          2  1932
  6600  PENNSYLVANIA STATE         S          2  1962
  6700  PITTSBURGH                 S          2  1962
  6800  PRINCETON                  P          2  1932
  6900  PURDUE                     S          3  1956
  7000  QUEEN'S                    C         10  1976
  7100  RICE                       P          7  1971
  7200  ROCHESTER                  P          2  1932
  7300  RUTGERS                    S          2  1956
  7350  ST. LOUIS                  P          4  1963
  7370  SASKATCHEWAN               C         10  1980
  7400  SOUTH CAROLINA             S          5  1975
  7500  SOUTHERN CALIFORNIA        P          9  1962
  7600  SOUTHERN ILLINOIS          S          3  1967
  7700  STANFORD                   P          9  1932
  7800  SUNY-ALBANY                S          2  1975
  7900  SUNY-BUFFALO               S          2  1967
  8000  SUNY-STONY BROOK           S          2  1975
  8100  SYRACUSE                   P          2  1962
  8200  TEMPLE                     S          2  1962
  8300  TENNESSEE                  S          6  1962
  8400  TEXAS                      S          7  1932
  8500  TEXAS A&M                  S          7  1962
  8600  TORONTO                    C         10  1932
  8700  TULANE                     P          7  1967
  8800  UTAH                       S          8  1962
  8850  VANDERBILT                 P          6  1946
  8900  VIRGINIA                   S          5  1932
  9000  VPI & SU                   S          5  1976
  9100  WASHINGTON                 S          9  1932
  9200  WASHINGTON STATE           S          9  1962
  9300  WASHINGTON U.-ST. LOUIS    P          4  1932
  9350  WATERLOO                   C         10  1984
  9400  WAYNE STATE                S          3  1962
  9500  WESTERN ONTARIO            C         10  1976
  9520  WESTERN RESERVE            P          3
  9600  WISCONSIN                  S          3  1932
  9700  YALE                       P          1  1932
  9800  YORK                       C         10  1979

Non-University Libraries

  9850  BOSTON PUBLIC LIBRARY      N          1  1933
  9860  CANADA INST. FOR SCITECH.  X         10  1983
  9870  CENTER FOR RESEARCH LIBS.  N          3  1962
  9880  JOHN CRERAR LIBRARY        N          3  1932
  9890  LIBRARY OF CONGRESS        N          5  1932
  9900  LINDA HALL LIBRARY         N          4  1964
  9910  NATL. AGRICULTURAL LIB.    N          5  1948
  9920  NATL. LIBRARY OF CANADA    X         10  1971
  9930  NATL. LIBRARY OF MEDICINE  N          5  1948
  9940  NEWBERRY LIBRARY           N          3  1932
  9950  NEW YORK PUBLIC LIBRARY    N          2  1932
  9960  NEW YORK STATE LIBRARY     N          2  1968
  9980  SMITHSONIAN INSTITUTION    N          5  1971


Table 6

Years Each Library is Represented in the ARL Data Files, 1907-08--1987-88

Number  Library                    First Year in File  Missing Years

University Libraries

   100  ALABAMA                    1967
   200  ALBERTA                    1969
   300  ARIZONA                    1967
   400  ARIZONA STATE              1973
   500  BOSTON                     1963
   600  BRIGHAM YOUNG              1974
   700  BRITISH COLUMBIA           1967
   800  BROWN                      1912
   900  CALIFORNIA, BERKELEY       1908                1909, 1911
  1000  CALIFORNIA, DAVIS          1969
  1050  CALIFORNIA, IRVINE         1981
  1100  CALIFORNIA, LOS ANGELES    1930
  1200  CALIFORNIA, RIVERSIDE      1979
  1300  CALIFORNIA, SAN DIEGO      1973
  1400  CALIFORNIA, SANTA BARBARA  1973
  1500  CASE WESTERN RESERVE       1969
  1600  CHICAGO                    1912
  1700  CINCINNATI                 1941
  1800  COLORADO                   1918                1923, 1924, 1931
  1900  COLORADO STATE             1975
  2000  COLUMBIA                   1912
  2100  CONNECTICUT                1963
  2200  CORNELL                    1912
  2300  DARTMOUTH                  1914                1915-20, 1952-68
  2350  DELAWARE                   1983
  2400  DUKE                       1929
  2500  EMORY                      1975
  2600  FLORIDA                    1957
  2700  FLORIDA STATE              1963
  2800  GEORGETOWN                 1963
  2900  GEORGIA                    1967
  2950  GEORGIA TECH               1983
  3000  GUELPH                     1979
  3100  HARVARD                    1914                1915, 1918, 1919
  3200  HAWAII                     1976
  3300  HOUSTON                    1975
  3400  HOWARD                     1971
  3490  ILLINOIS, CHICAGO          1988
  3500  ILLINOIS, URBANA           1908                1909, 1911
  3600  INDIANA                    1908                1909, 1911
  3700  IOWA                       1908                1909, 1911
  3800  IOWA STATE                 1928
  3900  JOHNS HOPKINS              1912
  4000  JOINT UNIVERSITY           1937                1980-1988
  4100  KANSAS                     1908                1909, 1911
  4200  KENT STATE                 1974
  4300  KENTUCKY                   1952
  4350  LAVAL                      1985
  4400  LOUISIANA STATE            1939
  4500  MCGILL                     1950                1952-1963
  4600  MCMASTER                   1976
  4650  MANITOBA                   1981
  4700  MARYLAND                   1963
  4800  MASSACHUSETTS              1969
  4900  MIT                        1939
  5000  MIAMI                      1976
  5100  MICHIGAN                   1908                1909, 1911
  5200  MICHIGAN STATE             1958
  5300  MINNESOTA                  1908                1909, 1911
  5400  MISSOURI                   1908                1909, 1911
  5500  NEBRASKA                   1908                1909, 1911, 1944
  5600  NEW MEXICO                 1979
  5700  NEW YORK                   1936                1966
  5800  NORTH CAROLINA             1922
  5850  NORTH CAROLINA STATE       1983
  5900  NORTHWESTERN               1913                1923
  6000  NOTRE DAME                 1963
  6100  OHIO STATE                 1908                1909, 1911
  6200  OKLAHOMA                   1963
  6300  OKLAHOMA STATE             1963
  6400  OREGON                     1922                1952-1962
  6500  PENNSYLVANIA               1912
  6600  PENNSYLVANIA STATE         1963
  6700  PITTSBURGH                 1943                1952-1962
  6800  PRINCETON                  1912
  6900  PURDUE                     1957
  7000  QUEEN'S                    1976
  7100  RICE                       1971
  7200  ROCHESTER                  1926
  7300  RUTGERS                    1941                1952-1956
  7350  ST. LOUIS                  1943                1952-62, 1974-88
  7370  SASKATCHEWAN               1980
  7400  SOUTH CAROLINA             1975
  7500  SOUTHERN CALIFORNIA        1963
  7600  SOUTHERN ILLINOIS          1967
  7700  STANFORD                   1912
  7800  SUNY-ALBANY                1975
  7900  SUNY-BUFFALO               1967
  8000  SUNY-STONY BROOK           1975
  8100  SYRACUSE                   1963
  8200  TEMPLE                     1941                1952-1962
  8300  TENNESSEE                  1963
  8400  TEXAS                      1914                1923, 1925, 1930
  8500  TEXAS A&M                  1963
  8600  TORONTO                    1950                1952-1962
  8700  TULANE                     1967
  8800  UTAH                       1963
  8850  VANDERBILT                 1980
  8900  VIRGINIA                   1922
  9000  VPI & SU                   1976
  9100  WASHINGTON                 1908                1909, 1911
  9200  WASHINGTON STATE           1943                1952-1962
  9300  WASHINGTON U.-ST. LOUIS    1923
  9350  WATERLOO                   1984
  9400  WAYNE STATE                1963
  9500  WESTERN ONTARIO            1976
  9520  WESTERN RESERVE            1943                1952-1988
  9600  WISCONSIN                  1908                1909, 1911, 1929-30
  9700  YALE                       1912
  9800  YORK                       1979

Non-University Libraries

  9850  BOSTON PUBLIC LIBRARY      1975
  9860  CANADA INST. FOR SCITECH.  1983
  9870  CENTER FOR RESEARCH LIBS.  1975
  9880  JOHN CRERAR LIBRARY        1975                1984-1988
  9890  LIBRARY OF CONGRESS        1975
  9900  LINDA HALL LIBRARY         1975
  9910  NATL. AGRICULTURAL LIB.    1975
  9920  NATL. LIBRARY OF CANADA    1975
  9930  NATL. LIBRARY OF MEDICINE  1975
  9940  NEWBERRY LIBRARY           1979
  9950  NEW YORK PUBLIC LIBRARY    1975
  9960  NEW YORK STATE LIBRARY     1975
  9980  SMITHSONIAN INSTITUTION    1975


Table 7

Number of Libraries per Year in the ARL Data Files, 1907-08--1987-88

University Libraries

Year  Libraries    Year  Libraries    Year  Libraries

1908         12    1935         34    1962         42
1909               1936         35    1963         63
1910         12    1937         36    1964         64
1911               1938         36    1965         64
1912         21    1939         38    1966         63
1913         22    1940         38    1967         70
1914         25    1941         41    1968         71
1915         23    1942         41    1969         76
1916         24    1943         45    1970         76
1917         24    1944         44    1971         78
1918         24    1945         45    1972         78
1919         24    1946         45    1973         81
1920         25    1947         45    1974         82
1921         26    1948         45    1975         88
1922         29    1949         45    1976         94
1923         27    1950         47    1977         94
1924         29    1951         47    1978         94
1925         29    1952         38    1979         98
1926         31    1953         38    1980         99
1927         31    1954         38    1981        101
1928         32    1955         38    1982        101
1929         32    1956         38    1983        104
1930         32    1957         41    1984        105
1931         33    1958         42    1985        106
1932         34    1959         42    1986        106
1933         34    1960         42    1987        106
1934         34    1961         42    1988        107

Non-University Libraries

Year  Libraries    Year  Libraries    Year  Libraries

1975         11    1980         12    1985         12
1976         11    1981         12    1986         12
1977         11    1982         12    1987         12
1978         11    1983         13    1988         12
1979         12    1984         12
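Because each record holds one library for one year, the yearly counts in Table 7 should add up to the record counts given for the four files. The Python sketch below carries out that cross-check; the counts are transcribed from Table 7, and the variable and function names are illustrative.

```python
# Yearly counts of university libraries from Table 7 (no data for 1909 and 1911).
UNIVERSITY = dict(zip(
    [1908, 1910, *range(1912, 1989)],
    [12, 12,
     21, 22, 25, 23, 24, 24, 24, 24, 25, 26, 29, 27, 29, 29, 31, 31, 32, 32,
     32, 33, 34, 34, 34, 34, 35, 36, 36, 38, 38, 41, 41, 45, 44, 45, 45, 45,
     45, 45, 47, 47, 38, 38, 38, 38, 38, 41, 42, 42, 42, 42, 42,
     63, 64, 64, 63, 70, 71, 76, 76, 78, 78, 81, 82, 88, 94, 94, 94,
     98, 99, 101, 101, 104, 105, 106, 106, 106, 107],
))

# Yearly counts of non-university libraries, 1975 through 1988.
NON_UNIVERSITY = dict(zip(
    range(1975, 1989),
    [11, 11, 11, 11, 12, 12, 12, 12, 13, 12, 12, 12, 12, 12],
))


def total(counts: dict[int, int], first: int, last: int) -> int:
    """Sum the yearly counts for the years first..last inclusive."""
    return sum(n for year, n in counts.items() if first <= year <= last)


# Observations implied by Table 7 for the year span of each data file.
file_totals = {
    "ARL0862.ASC": total(UNIVERSITY, 1908, 1962),
    "ARL6378.ASC": total(UNIVERSITY, 1963, 1978),
    "ARL7988.ASC": total(UNIVERSITY, 1979, 1988),
    "NONU7588.ASC": total(NON_UNIVERSITY, 1975, 1988),
}
```

The four totals can then be compared with the record counts of 1,817, 1,236, 1,033, and 165 listed for the files, and with the overall figure of 4,251 observations.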


2. Variables

As Tables 3 and 4 show, the 1908-1962 data (ARL0862.ASC) contain 15 variables, or fields; and the 1963-1988 data (ARL6378.ASC, ARL7988.ASC, and NONU7588.ASC) contain 41 variables. In the files taken together, 45 distinct variables are represented.

Research library data collection has evolved from a simple set of data categories by disaggregation of those categories into their constituent parts, and by the addition of new categories. In 1908 libraries reported only volumes held, added volumes, total professional plus non-professional staff, expenditures for materials and binding, and salaries of professional plus non-professional staff: two categories concerning collections, one concerning staff, and two concerning expenditures. These five library variables, in this machine-readable version, grow to 13 by 1963, to 18 by 1976, and to 27, with the addition of 9, in 1986. By 1988, for example, the original category of expenditures for materials and binding had been disaggregated into the five categories of expenditures for monographs, serials, other materials, miscellaneous, and binding.

The present machine-readable version excludes 15 variables that since 1963 have been reported over the years in the annual ARL statistics or in the Cumulated ARL University Library Statistics, 1962-63 through 1978-79:

1. Reels of microfilm, number of microcards, number of microprint sheets, and number of microfiches (separate counts). Reported in 1969-1982. Aggregated in the machine-readable version as units of microforms (MICROF).

2. Interlibrary originals loaned, photocopies loaned, originals borrowed, and photocopies borrowed. 1974-1985. Aggregated as interlibrary lending and borrowing (ILLTOT and ILBTOT).

3. Expenditures for materials and binding. 1963-1985. Disaggregated as expenditures for materials (EXPLM) and expenditures for binding (EXPBND).

4. Beginning and median professional salary. Reported in Cumulated ARL University Library Statistics for 1963-1979 (beginning salary) and 1969-1979 (median salary).

5. Federal support of universities. Reported in Cumulated for 1971-1979.

6. University expenditures. Reported in Cumulated for 1971-1979.


7. Total part-time students and graduate part-time students. 1976-1988.

By 1988 the ARL data included categories for institutional characteristics, collections, interlibrary loans, personnel, expenditures, and university data. Nevertheless, given that university libraries are academic support units, it is curious that the data over 81 years have concentrated on input measures instead of outputs or measures of service and performance. Except for interlibrary loans, the library variables still concern the inputs of on-site collections, staff, and expenditures. The data are useful for describing the traditional characteristics of research libraries, but not for assessing emerging uses of technology for access to information.

The following notes discuss each of the 45 variables in the 1908-1988 data files. For the years in which each variable has values other than missing values, see Tables 3 and 4.

Institutional Characteristics

1. Year (YEAR): A two-digit number indicating the year of the data: 08, 10, 12, 13, 14 ... 88. "08"= 1907-08, etc.

2. Library number (INSTNO): The code numbers ranging from 0100 to 9800 for the 110 university libraries and from 9850 to 9980 for the 13 non-university libraries. See Table 5.

3. Library name (INAM): The names of the 123 libraries in the data files.

4. Type (TYPE): Five codes are used for TYPE:

C = Canadian university
P = private U.S. university
S = state controlled or public U.S. university
N = U.S. non-university
X = Canadian non-university

The designation as public or private is taken from the latest U.S. National Center for Education Statistics Directory of Postsecondary Education. In the data files the same code is used for a given institution for all years of data reported, even though there may have been, in several cases, a changing understanding between 1908 and 1988 of whether an institution was private or public.

5. Region (REGION): A number from 01 to 10 denotes the region of North America in which an institution is located. These codes (with the addition of Canada) are those used by the U.S. Census Bureau, for example in its Statistical Abstract. The regions are

01 = New England
02 = Middle Atlantic
03 = East North Central
04 = West North Central
05 = South Atlantic
06 = East South Central
07 = West South Central
08 = Mountain
09 = Pacific
10 = Canada

6. Year joined ARL (MEMBYR): As noted above, 119 of the libraries in the data file are current ARL members. Three others (Joint University, St. Louis, and John Crerar Library) were members in the past; and Western Reserve, while not a member, was succeeded by a current member, Case Western Reserve. For the years since 1963 MEMBYR is useful for selecting those libraries for which data are present in any given years. For example, to consider only those libraries that reported data for each of the 10 years 1979 through 1988, select the libraries for which MEMBYR is 79 or less.

In general, from 1963 new members began reporting data in the year in which they joined ARL, and reported in each year thereafter. There are, however, five exceptions to this guideline: (1) Dartmouth joined in 1932 but did not begin reporting until 1969. (2) McGill joined in 1932 but first reported in 1964 (after reports in 1950 and 1951). (3) Massachusetts joined in 1968, but first reported in 1969. (4) New York joined in 1936 and reported for all years but 1966. (5) St. Louis joined in 1963 and reported for each year of 1963 through 1973, when its membership ended.

For the years 1908-1962 MEMBYR is less useful in selecting reporting libraries, because the 1908-1962 data include non-members and exclude members in some years after the formation of ARL in 1932.
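The MEMBYR selection rule can be sketched in a few lines of Python. This is only an illustration: the record layout, the library names, and the function name are assumptions, not part of the data files or of the programs distributed on the diskettes.

```python
# Hypothetical records: (library name, MEMBYR as a two-digit year).
# The names and values below are illustrative, not taken from the data files.
records = [
    ("Library A", 32),
    ("Library B", 75),
    ("Library C", 83),
]


def reported_throughout(records, first_year):
    """Return libraries whose MEMBYR is first_year or earlier.

    Under the general guideline for 1963 and later, a library that
    joined ARL in or before first_year should have data for every year
    from first_year on. The exceptions to the guideline still need to
    be checked by hand.
    """
    return [name for name, membyr in records if membyr <= first_year]


print(reported_throughout(records, 79))  # ['Library A', 'Library B']
```

Because MEMBYR values run from 32 to 88, a plain numeric comparison of the two-digit years is sufficient here.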

7. Law library included (LAW): "Y" or "N" indicates whether a law library is included in the reported data during the years 1976-1988. Most libraries uniformly included or excluded a law library for each year of the 1976-1988 period. As a general rule, if a law library is reported in 1988, it is also reported for every other year; and if not reported in 1988, it is not reported in any other year. Some exceptions to this rule of consistency are explained in footnotes to the printed data. For example, Tennessee reported law data in 1976-1979, did not report law in 1980-1987, and in 1988 resumed including its law library in its reports.

There are four unexplained inconsistencies. Hawaii claims a law library in 1976 and 1977 but not in any following year. Harvard includes its law library in all years except 1977 and 1986. The National Agricultural Library includes a law library only in 1980 and 1981. The Library of Congress excludes its law library in 1976-1979, includes it in 1980-1985, excludes it in 1986, and includes it in 1987 and 1988. For discussion of these inconsistencies, see Section IV below on emendations, as well as the Appendix.

8. Medical library included (MED): "Y" or "N". The comments on consistency in LAW also apply to MED. New Mexico, for example, explains in footnotes how it excludes its medical library through 1984, includes it in 1985 and 1986, and excludes it again in 1987 and 1988.

MED has seven unexplained inconsistencies: (1) Alabama -- N, 1976; Y, 1977; N, 1978-1981; Y, 1982-1988. (2) Brown -- N, 1976-1977; Y, 1978-1979; N, 1980-1988. (3) Harvard -- Y, 1976; N, 1977; Y, 1978-1985; N, 1986; Y, 1987-1988. (4) Louisiana State -- N, 1976; Y, 1977-1979; N, 1980-1984; Y, 1985-1986; N, 1987-1988. (5) Rutgers -- Y, 1976-1985; N, 1986-1988. (6) Utah -- N, 1976-1978; Y, 1979; N, 1980-1988. (7) New York State Library -- Y, 1976-1977; N, 1978-1979; Y, 1980-1988. Again, see Section IV on emendations and the Appendix.

9. Canadian exchange rate (EXCH): In the printed statistics through 1979 the expenditures of Canadian libraries are reported in Canadian dollars. From 1980 through 1988 the printed Canadian expenditures are converted to U.S. dollars for comparability with expenditures in U.S. institutions. In the present machine-readable files all Canadian expenditures for all years are converted to U.S. dollars. The exchange rate used for each year is the annual average of the average monthly noon exchange rates published by the Bank of Canada. For recent years these exchange rates are taken from the Bank of Canada Review; for earlier years from the Canada Year Book.

Canadian libraries appear in the data in 1950 and 1951 (when McGill and Toronto are the only Canadian libraries represented), and in 1963-1988. For each of these years EXCH for each Canadian library is the exchange rate of Canadian dollars per U.S. dollar. For example, in 1988 EXCH for Canadian libraries is 1.2826. For convenience' sake, EXCH is set equal to 1.0000 for U.S. libraries. Those who want to convert Canadian expenditures from U.S. equivalents back to Canadian dollars need only multiply expenditures (e.g., EXPLM, TOTEXP) by EXCH. U.S. expenditures are unaffected by this multiplication, while Canadian expenditures are converted to Canadian dollars.
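As a minimal sketch of this conversion in Python (the function name and the expenditure figure are hypothetical; only the 1988 rate of 1.2826 and the U.S. value of 1.0000 come from the text):

```python
def to_canadian_dollars(us_equivalent, exch):
    """Convert a U.S.-dollar expenditure back to Canadian dollars.

    exch is Canadian dollars per U.S. dollar (EXCH). For U.S. libraries
    EXCH is 1.0000, so the value is returned unchanged.
    """
    return us_equivalent * exch


# Illustrative expenditure of 1,000,000 U.S. dollars; 1.2826 is the 1988 EXCH.
print(round(to_canadian_dollars(1_000_000, 1.2826)))  # 1282600
print(round(to_canadian_dollars(1_000_000, 1.0)))     # 1000000
```

Applying the same multiplication to every record, U.S. and Canadian alike, avoids having to test TYPE or REGION first.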

Collections

10. Volumes held (VOLS): With one exception (St. Louis in 1966), VOLS is reported by every library in every year of these data, from 1908 through 1988. VOLS in fact would probably win a popularity contest as the library statistic traditionally thought to be most significant and most worthy of reporting. It is the more curious that there continues to be little agreement on what VOLS means. In the first place, it is at least rodomontade for a library to report, say, 3,003,066 volumes. This is to claim accuracy to within better than one part in every million. Even an accuracy of one in ten thousand is perhaps unlikely in research libraries. More importantly, there have been changing conceptions of what to include in VOLS. In the early sixties various libraries included volume-equivalents for microforms. When it became generally understood that microforms were to be excluded, some libraries showed decreases in volumes held in the annual statistics. Even up to the present, some libraries catalog documents or technical reports and count them as volumes; others do not. In some libraries fascicles and Lieferungen may be counted as volumes; in others only fascicles when bound together as volumes may be counted. It is peculiar to find so much diversity in what is still taken to be the preeminent library statistic. Nevertheless, VOLS has a long, distinguished history as a predictor of library size, measured in collections, staffing, or expenditures.

11. Volumes added, gross (VOLSADG): VOLSADG is intended to denote volumes cataloged (however individual libraries may interpret "cataloged"). VOLSADG does not necessarily measure volumes acquired. Thus, increases and decreases in VOLSADG are changes in numbers of volumes cataloged, not of volumes acquired.

One value of VOLSADG deserving special attention is Chicago's 523,485 gross volumes added in 1985. In that year Chicago absorbed the John Crerar Library; so 523,485 is a legitimate value. Nevertheless, including this value in calculations of averages and other statistics produces misleading results.

12. Volumes added, net (VOLSADN): For a given institution VOLSADN should equal one year's volumes held minus the preceding year's volumes held. This equation does not hold true in some cases, especially in the early years of the data.

In six instances in the printed statistics VOLSADN is a negative number (that is, typically a library withdrew more volumes than it cataloged). In order to avoid possible problems in the use of the computer data file, these six instances have been coded here as missing values. For a list of the six negative values, see the Appendix.
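The expected relation between VOLSADN and VOLS can be checked mechanically. The following Python sketch assumes that one library's values are already held in dictionaries keyed by four-digit year, that the years are consecutive (which does not hold for the biennial data of the earliest years), and that None stands for a missing value; the function name and the figures are hypothetical.

```python
def net_added_discrepancies(vols_by_year, volsadn_by_year):
    """Compare reported VOLSADN with the year-to-year change in VOLS.

    Both arguments map a data year to a value for one library; None
    stands for a missing value. Returns {year: reported - computed} for
    the years where both figures are available and differ.
    """
    discrepancies = {}
    for year, reported in volsadn_by_year.items():
        current = vols_by_year.get(year)
        previous = vols_by_year.get(year - 1)
        if reported is None or current is None or previous is None:
            continue  # cannot check without all three values
        computed = current - previous
        if reported != computed:
            discrepancies[year] = reported - computed
    return discrepancies


# Illustrative figures only, not taken from the data files.
vols = {1980: 1_000_000, 1981: 1_030_000, 1982: 1_055_000}
volsadn = {1981: 30_000, 1982: 26_000}
print(net_added_discrepancies(vols, volsadn))  # {1982: 1000}
```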

13. Monographs purchased (MONO): This variable is intended to measure monographic volumes (not titles), including monographs in series, purchased during the year. EXPMONO divided by MONO for any library should be the average price per monographic volume.

14. Current serials purchased (SERPUR): SERPUR is subscriptions to current serials (including periodicals but excluding monographic series) paid for by the library. Note that SERPUR refers to copies, not titles. Three paid subscriptions to Science are counted as three SERPUR, though this is one serial title. EXPSER divided by SERPUR should be the average price per current serial.

15. Current serials not purchased (SERNPUR): These are current serials for which the library does not pay a subscription price. SERNPUR includes government documents, exchange items, etc.

16. Total current serials (CURRSER): Nowadays CURRSER = SERPUR + SERNPUR. For 1972 through 1974 when this variable first appeared in the annual statistics, CURRSER was called "Current Periodicals"; from 1975 to the present it has been called "Current Serials". But even when this category was designated as periodicals, the instructions in the ARL statistics questionnaire used the same definition of periodicals that is now used of serials. As a result, there has undoubtedly been considerable uncertainty as to what is to be counted in this category. We find, for example, that one library reports 32,000 CURRSER one year, 19,000 the next; another drops from 19,000 to 13,000 the following year; another increases from 7,000 to 13,000 after one year. Curiously, for various libraries CURRSER decreased from 1974 to 1975, although one might have supposed that the change in the ARL category from periodicals to serials would have led to increases. One would be well-advised to take this category, before about 1975, with a grain or two of salt.

There are also differences among libraries in respect to whether materials like government documents and monographic serials are counted in CURRSER. SERNPUR in recent years offers some help in sorting out government documents included in counts of serials.

17. Microforms (MICROF): MICROF should be the total count of reels of microfilm, number of microcards, number of microprint sheets, and number of microfiche sheets. Title counts of microforms are not reported in the ARL statistics.

Interlibrary Loans

18. Total lending (ILLTOT)

19. Total borrowing (ILBTOT)

ILLTOT and ILBTOT represent interlibrary transactions, not items or volumes. If a library lends a three-volume set, this is counted as one ILLTOT. Note also that these are filled requests, rather than the larger categories of received or sent requests.

Personnel

20. Professional Staff (PRFSTF): For 1963 through 1967 fractional parts of FTE professionals were permitted by ARL instructions. Consequently, in the early years there are reports of 47.5 or 21.7 or 159.74, etc., professionals. From 1968 the numbers of FTE professionals are rounded to the nearest whole number.

PRFSTF has been susceptible to differing interpretations over the years. First, the criteria for determining professional status even now vary among libraries; and so it has been left to each library to decide who is to be counted as a professional. Second, there has been some uncertainty whether to count only filled positions or filled plus budgeted but temporarily vacant positions. Third, the definition of FTE has been left to each library, on the basis of its own work week.

21. Non-professional staff (NPRFSTF): Most of the comments on PRFSTF above also apply to NPRFSTF. NPRFSTF is less pejoratively termed support staff in other data collections.

22. Student assistants (STUDAST): For 1963 through 1967 the data are reported in hours; thereafter in full-time equivalents, rounded to the nearest whole number.

23. Total professional and non-professional staff (TOTSTF): TOTSTF = PRFSTF + NPRFSTF. Together with VOLS and VOLSADG, this is one of only three variables represented in all years of all the machine-readable files. In the printed statistics TOTSTF is reported in all years of the 1908-1962 cumulation, but only in the years 1963-1974 of the annual ARL statistics. In 1975 TOTSTF was replaced in the printed statistics with TOTSTFX (see below). Both TOTSTF up to 1974 and TOTSTFX from 1975 have been called "Total Staff" in the printed statistics.

For the machine-readable files for 1963-1988, TOTSTF has been newly calculated from the data. The only instances of missing values for TOTSTF are Harvard, 1914-1929; Iowa, 1957; and Yale, 1929 and 1964.

24. Total professional, non-professional, and student assistant staff (TOTSTFX): TOTSTFX = PRFSTF + NPRFSTF + STUDAST. In the annual printed statistics TOTSTFX appears (under the heading "Total Staff") for the years 1975-1988. For the machine-readable files for 1968-1988 TOTSTFX was newly calculated from the data. Before 1968 STUDAST was reported in hours, so that the addition of PRFSTF + NPRFSTF + STUDAST was not feasible; and total staff was represented by TOTSTF (PRFSTF + NPRFSTF).

There are five cases of missing values for TOTSTFX among university libraries and a number of cases among non-university libraries. For details see Section IV and the Appendix.

Expenditures

25. Expenditures for monographs (EXPMONO): Expenditures for the monographic volumes reported as MONO. EXPMONO divided by MONO should be the average price per monographic volume in a given library.

26. Expenditures for current serials (EXPSER): For 1976 through 1978 EXPSER was called "Current Periodicals" in the annual ARL statistics; from 1979 it has been called "Current Serials." But in the ARL statistics questionnaires for 1976 and 1977 the same definition of periodicals was used that was then used of serials in 1978 and later. See CURRSER above for other comments on the periodicals-serials ambiguities in the ARL data.

Just as libraries seemed uncertain in the 1970's what to count in the category CURRSER, similarly they wavered in their interpretations of EXPSER. One library reported $500,000 one year, $800,000 the next, but an increase of only 300 CURRSER in that period; another library reported a decrease of $100,000 in EXPSER but an increase of 1,700 CURRSER; and so on. Like CURRSER, EXPSER needs to be handled with much caution, at least in the data from the 1970's.

In recent years EXPSER divided by SERPUR should be the average price per serial in a given library.

27. Expenditures for other materials (EXPOTH): EXPOTH is intended to include expenditures for on-site materials other than monographs and serials: in particular, microforms, audiovisual materials, maps, manuscripts, and similar materials.

28. Expenditures from the materials fund for items other than materials (EXPMISC): Some libraries spend part of their materials budget on items such as bibliographic utilities, literature searching, memberships, etc. These expenditures are represented in EXPMISC.

29. Total expenditures for materials (EXPLM): Nowadays EXPLM = EXPMONO + EXPSER + EXPOTH + EXPMISC. EXPLM is commonly (but inaccurately) referred to as book fund expenditures.

It is sometimes assumed that for a given library the result of EXPLM divided by VOLSADG is a price per volume. This is at best only approximately true. There is more in EXPLM than expenditures for volumes; and there is more, or less, in VOLSADG than volumes acquired. For prices per volume or per serial subscription, since 1986 one can turn to EXPMONO/MONO and EXPSER/SERPUR.
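The two ratios can be computed with a small helper. This Python sketch is illustrative only: the function name is hypothetical, None is assumed to stand for a missing value, and the figures are invented.

```python
def unit_price(expenditure, count):
    """Return expenditure / count, or None when the ratio is undefined.

    None stands for a missing value; a count of 0 is also treated as
    undefined rather than raising ZeroDivisionError.
    """
    if expenditure is None or count is None or count == 0:
        return None
    return expenditure / count


# Illustrative figures only, not taken from the data files.
print(unit_price(1_200_000, 40_000))  # EXPMONO / MONO -> 30.0
print(unit_price(2_500_000, 20_000))  # EXPSER / SERPUR -> 125.0
print(unit_price(None, 20_000))       # None
```

Returning None for a missing or zero denominator keeps such libraries out of any later average instead of distorting it.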

30. Expenditures for binding (EXPBND): This is intended to be expenditures for contract binding, but not for other kinds of preservation.

31. Expenditures for materials and binding (EXPLMB): In the printed statistics this is reported for 1908-1985. In the machine-readable version it is retained only in the 1908-1962 file (which does not include EXPLM or EXPBND separately). For 1963-1988 EXPLMB can be calculated as EXPLM + EXPBND.

32. Professional salaries (SALPRF): SALPRF, like the other variables for salary and wage expenditures below, is not intended to include fringe benefits, but in some cases these costs may have been included.

33. Non-professional salaries (SALNPRF): See SALPRF.

34. Student assistant wages (SALSTUD): See SALPRF.

35. Total salaries and wages (TOTSAL): Nowadays TOTSAL = SALPRF + SALNPRF + SALSTUD.

36. Professional and non-professional salary budget (FTESAL): FTESAL, reported for 1908-1962, appears to be a budget rather than expenditure variable for part of this period. FTESAL as reported in the 1909-10 statistics, for example, is the salary budget for fiscal year 1910-11. By 1963 the statistics are reporting TOTSAL, an expenditure figure, for the year covered by the report; that is, TOTSAL for 1962-63 is the salary expenditure for 1962-63. By the 1950's FTESAL had evolved from a budget to an expenditure variable (see STOTEXP below). After 1962 the closest analogue to FTESAL would be SALPRF + SALNPRF.

37. Other operating expenditures (OPEXP): OPEXP is a catch-all category for any expenditures from the library budget other than expenditures for materials (EXPLM), binding (EXPBND), and personnel (TOTSAL). In recent years in some institutions the costs of library automation are included in OPEXP; in others they are assumed instead by the universities. OPEXP generally does not include expenditures for buildings, maintenance, and fringe benefits.

38. Total expenditures (TOTEXP): TOTEXP = EXPLM + EXPBND + TOTSAL + OPEXP. For the machine-readable files for 1963-1988 TOTEXP was newly calculated from the data. There are 14 cases of missing values for TOTEXP among university libraries and a number of cases among non-university libraries, due almost always to a missing value for EXPBND. For details see Section IV and the Appendix.
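A recalculation of this kind, in which a missing component makes the total missing, might look like the following in Python. This is a sketch under stated assumptions, not the procedure actually used to build the files: the function name is hypothetical, None stands for a missing value, and the figures are invented.

```python
def total_or_missing(*components):
    """Sum the components, returning None if any component is missing.

    None stands for a missing value. A missing component makes the
    total missing as well, as happens to TOTEXP when EXPBND is missing.
    """
    if any(value is None for value in components):
        return None
    return sum(components)


# Illustrative figures only: EXPLM, EXPBND, TOTSAL, OPEXP.
print(total_or_missing(900_000, 100_000, 1_500_000, 300_000))  # 2800000
print(total_or_missing(900_000, None, 1_500_000, 300_000))     # None
```

The same helper would serve for TOTSTF (PRFSTF + NPRFSTF) and TOTSTFX (PRFSTF + NPRFSTF + STUDAST).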

39. Total expenditures for materials, binding, salaries, and wages (STOTEXP): This variable is reported only for 1955-1962. For those years STOTEXP in most cases equals EXPLMB + FTESAL + SALSTUD. This equality implies that by 1955 FTESAL was being interpreted as an expenditure, rather than a budget (see FTESAL above). For the years after 1962 an equivalent to STOTEXP would be EXPLM + EXPBND + TOTSAL.

40. Budget for materials (and binding) (APPROP): APPROP, reported for 1910-1954, is a budget rather than expenditure variable. As reported in the 1909-10 statistics, for example, APPROP is the budget for fiscal year 1910-11. APPROP certainly refers to the materials budget; whether it also includes binding is not clear. There is no equivalent variable to APPROP after 1954.

University Data

41. Total full-time student enrollment (TOTSTU): For U.S. institutions TOTSTU is reported for 1967-1988; for Canadian institutions, 1976-1988. The sources of these data are: U.S. institutions, 1967-1979, from computer tapes prepared by the U.S. National Center for Education Statistics, from its HEGIS Data Files OPE, Opening Fall Enrollment; U.S. institutions, 1980-1988, from annual ARL statistics; Canadian institutions, 1976-1988, from annual ARL statistics. These data are for total full-time students. Note that TOTSTU is the headcount of full-time undergraduates, first-professional students, graduate students, and unclassified students; part-time students are not included here. Note also that, for example, TOTSTU for 1988 (= fiscal year 1987-88) is the opening fall enrollment for fall, 1987.

Some institutions have variously included or excluded branch campuses in reporting ARL data throughout the years. For instance, Pennsylvania State reported data for its Hershey Medical Center through 1971, then excluded it thereafter, although other branches continue to be recorded. In some cases, especially in the 1960's and 1970's, it is not possible to determine exactly which branches some institutions were including in the ARL statistics.

The eccentricities that characterize some of the library variables are at least as prevalent in TOTSTU. For example, one institution claims 15,000 students one year, 10,000 for the next two years, and 16,000 thereafter. A user should probably be even more careful in drawing conclusions from TOTSTU and the following university variables than from the library data.

42. Total graduate student enrollment (GRADSTU): For U.S. institutions GRADSTU is reported for 1968-1988; for Canadian institutions, 1976-1988. The sources of the data are: U.S. institutions, 1968-1979, as above for TOTSTU; U.S. institutions, 1980-1988, and Canadian institutions, 1976-1988, from annual ARL statistics.

In addition to peculiar values of GRADSTU for some individual institutions, there are two discontinuities in the data. (1) For a number of U.S. institutions there are noticeable decreases in GRADSTU from 1969 to 1970, perhaps due to faulty government data in those years. (2) In 1986-87 the U.S. Department of Education changed its way of reporting graduate students. Up through 1985-86 (= ARL data year 1986), first-professional students were excluded from GRADSTU. Beginning with 1986-87 (= ARL data year 1987), first-professional students are included in GRADSTU. This change led to sharp increases from 1986 to 1987. The discontinuity is so pronounced, in fact, that it is difficult to use GRADSTU in valid time series analysis for the period before and after 1986/1987.

43. Ph.D.'s awarded (PHDAWD): For U.S. institutions PHDAWD is reported for 1963-1988; for Canadian institutions, 1972-1988. The data sources are: U.S. institutions, 1963-1977, from a computer tape prepared by the National Research Council, Commission on Human Resources, from its Doctorate Records File; U.S. institutions, 1978-1988, and Canadian institutions, 1972-1988, from annual ARL statistics. The NRC data include Ph.D.'s, Sc.D.'s, and D.M.A.'s for 1963-1972; Ph.D.'s and Sc.D.'s for 1973-1977. The data from the annual ARL statistics are intended to include only Ph.D.'s, but in some institutional reports may include other doctoral fields.

44. Ph.D. fields (PHDFLD): PHDFLD has been recorded in the annual ARL statistics since 1972. It is intended to indicate the number of subject fields in which Ph.D. degrees may be awarded in an institution. Over the years there has been some variation in whether institutions are counting broad fields (e.g., English) or narrower sub-disciplines (e.g., English literature, American literature, creative writing, etc.).

45. Instructional faculty (FAC): FAC has been recorded in the annual ARL statistics since 1986. FAC is intended to include only full-time (not part-time) faculty whose major regular assignment is instruction. Clinical faculty, for example, are excluded.

IV. Missing Data, Errata, and Emendations

1. Missing Data

In the printed editions of the research library statistics, missing data values are represented in seven different ways. In The Gerould Statistics for 1908-1962 missing values are uniformly represented by asterisks. In the successive annual issues of the ARL statistics for 1963-1985 "N/A" or occasionally blanks or one, two, or three hyphens indicate missing values. For the years 1986-1988 the printed statistics use "U/A" (unavailable) and "N/A" (not applicable). In the Cumulated ARL University Library Statistics, 1962-63 through 1978-79 blanks represent missing values.

Two additional ways of indicating missing values appear in machine-readable data sets. A computer tape of the cumulated ARL statistics for 1963-1979 represented missing values by zeros.4 Similar to the ARL statistics are the ACRL academic library statistics for 1979-1988. A machine-readable version of those data uses "." (a period) for missing values.5

Thus, missing values are represented variously by *, [blank], -, --, ---, N/A, U/A, ., and 0. Note that before 1986 for university libraries there is no variable in the ARL data for which "0" or "not applicable" is a valid response (except for the theoretical case, which has not occurred in these data, in which a library withdrew exactly as many volumes as its gross additions, so that net volumes added (VOLSADN) would be 0). Except for VOLSADN, all pre-1986 variables admitted only positive values greater than 0. Consequently, up to 1986 all missing values, however represented, implied unavailable data. In 1986 ARL introduced several variables for which 0 or not applicable may be a valid response. For example, EXPMISC--expenditures from the materials budget for bibliographic utilities, electronic services, etc.--sometimes has reported values of 0 or N/A (or has missing values, U/A). In fact, 0 does not appear in the printed university library statistics before 1986. In the 1986-1988 university library statistics 0 appears 55 times--52 times under EXPMISC, and 3 times under SERNPUR. N/A appears 10 times under EXPMISC, EXPOTH, SERNPUR, MONO, and FAC.

In non-university libraries, on the other hand, 0 or not applicable could be an appropriate response for variables such as STUDAST (student assistants) or ILBTOT (interlibrary borrowing). Even before 1986 zeros can be found as legitimate values in non-university statistics.

In order to treat missing values consistently, therefore, in this machine-readable version of the 1908-1988 data the following indicators are employed:

                             Printed               DOS        Apple
                             Version               Version    Version

University libraries,        *, [blank], -, N/A    [blank]    .
1908-1985:

University libraries,        U/A                   [blank]    .
1986-1988:                   N/A                   0          0
                             0                     0          0

Non-university libraries,    N/A, [blank]          [blank]    .
1975-1985:                   0                     0          0

Non-university libraries,    U/A                   [blank]    .
1986-88:                     N/A                   0          0
                             0                     0          0

In summary, in the machine-readable files missing values are uniformly represented by blanks (i.e., fields with spaces) in the DOS files and periods in the Macintosh files. 0 or not applicable is represented by 0 in both DOS and Macintosh.
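A program reading the files needs to keep missing values distinct from legitimate zeros. The Python sketch below shows one way to do so, assuming that each numeric field has already been cut out of its record as a string; the function name, its parameter, and the choice of None for missing are assumptions made for illustration.

```python
def parse_field(raw, missing_marker=""):
    """Parse one numeric field from a data file.

    Returns None for a missing value and a float otherwise. After
    stripping spaces, a DOS missing value is an empty string; pass
    missing_marker="." for the Macintosh files. As an assumption, an
    all-blank field is treated as missing in either case. A 0 is a
    valid value (zero or not applicable) and is returned as 0.0.
    """
    text = raw.strip()
    if text == missing_marker or text == "":
        return None
    return float(text)


print(parse_field("      "))                   # None (DOS)
print(parse_field("  .", missing_marker="."))  # None (Macintosh)
print(parse_field("     0"))                   # 0.0
print(parse_field("1.2826"))                   # 1.2826
```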

2. Errata

From 1963 libraries have reported to the ARL office errata to the annual printed statistics. In recent years these errata have usually, but not always, been reported in the ARL Newsletter. These errata are incorporated in the present machine-readable version of the statistics.

3. Emendations

Up to this point the machine-readable files faithfully reproduce the printed statistics, except for different indicators of missing values and the inclusion of errata. One could send the files out into the world without further changes. But as Housman pointed out, in a famous comment on editing manuscripts that applies equally to the ARL data, "Chance and the common course of nature will not bring it to pass that the readings of a MS are right wherever they are possible and impossible wherever they are wrong."6 Kruskal adds that "A reasonably perceptive person, with some common sense and a head for figures, can sit down with almost any structured and substantial data set or statistical compilation and find strange-looking numbers in less than an hour."7 Certainly there are strange-looking numbers in the 1908-1988 data: a library that reports 57 staff in 1948, 109 in 1949, and 65.25 in 1950; another library that has 90 professionals in 1983, 68 in 1984, and 97 in 1985; a university that claims 15,000 students one year, 10,000 the next two years, and 16,000 thereafter; and so on. It is very tempting to go through the statistics emending these instances of what Kruskal calls lack of smoothness, transgressions of general knowledge, and ambiguous classifications.

The present machine-readable version of the 1908-1988 statistics, however, adopts a conservative stance towards emendation. Of the 127,049 data values in the 1908-1988 files, 105 are emended according to the following principles:

a. TOTSTF, TOTSTFX, and TOTEXP have been newly calculated from the data for this machine-readable version. The computed values differ from the printed values in 59 cases. For example, Yale reports $2,126,067 in TOTEXP in 1963, but because Yale's EXPBND is missing, the computed TOTEXP is also missing; and the computed value is substituted for the printed value. As another example, the Smithsonian reports $2,981,328 in TOTEXP in 1980, but the components of TOTEXP actually add up to $2,980,828, which value is used in the machine-readable data.

b. In six instances VOLSADN is negative in the printed statistics. These values are changed to missing to avoid possible confusion at the presence of six negative numbers among over 100,000 positive values.

c. Some values are unquestionably errors in the printed data. For example, "N/A" (not applicable) for FAC at Cornell in 1986 is impossible, since Cornell does have teaching faculty; so N/A is emended to missing (a blank field). Brown has no medical library, and so "Y" under MED in 1978 is emended to "N". The National Library of Medicine by convention is coded "Y" throughout, though the printed statistics display "Y" for only 1986-1988. There are 13 emendations of this kind.

d. Finally, there are 27 values that are possible but so unlikely that they cry out for emendation. Fifteen are in the LAW and MED variables. For example, Harvard reports "Y" for both LAW and MED in all years but 1977 and 1986. Nothing in the other data indicates that Harvard failed to include its law and medical libraries in those two years. It is reasonable to emend "N" to "Y" for 1977 and 1986. The other 12 emended values involve problems with 0 or N/A. Georgia reports N/A for MONO in 1987 but U/A in 1986 and 1988. It is unlikely that in 1987 expenditures for monographs were not applicable (=0). N/A is therefore changed to missing. The Boston Public Library reports expenditures for binding in 1985 and 1988 but 0 in 1986 and 1987. It is again unlikely that the Boston Public bound books and serials in 1985 and 1988 but not in the two intervening years. With one missing year Newberry records binding expenditures in every year but 1988, when it reports 0. All of these cases seem to be examples of the misuse of 0 or N/A where a missing value is intended.

The 105 emendations described here are listed in the Appendix, which displays the printed value and the value in the machine-readable data.
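Principles (a) and (b) come down to two small rules: a total is missing whenever one of its components is missing, and a negative count is replaced by a missing value. The following Python is a minimal sketch of those two rules, assuming missing values are held as None. The function names and the component figures in the usage lines are illustrative assumptions and are not taken from the data files; only the -1,527 figure appears in the Appendix.

```python
from typing import Optional, Sequence


def computed_total(components: Sequence[Optional[int]]) -> Optional[int]:
    """Sum the components of a total; if any component is missing, so is the total."""
    if any(value is None for value in components):
        return None
    return sum(components)


def emend_negative(value: Optional[int]) -> Optional[int]:
    """Replace a negative count (principle b) with a missing value."""
    if value is not None and value < 0:
        return None
    return value


# Hypothetical component figures, for illustration only.
print(computed_total([1_200_000, 300_000, None]))    # one component missing
print(computed_total([1_200_000, 300_000, 50_000]))  # all components present
print(emend_negative(-1_527))                        # a negative VOLSADN value
```

Returning None rather than a partial sum mirrors the Yale example above, where a missing EXPBND makes the computed TOTEXP missing as well.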


V. Using the Files (see also the note to spreadsheet users at the end of this guide)

1. DOS Version

In addition to the four ASCII data files, the DOS diskettes also contain programs to copy the files to a hard disk in ASCII or dBASE format. For the dBASE format you must have dBASE III + or IV in a directory C:\DBASE. To use the programs, insert disk 1 of the 360K or the 720K floppies in your disk drive. At the C:\ prompt type "A:SETUP" (if the 720K diskette is in a B drive, first issue the DOS statement "ASSIGN A=B"). Then follow the instructions in the menus. These programs will copy all four files -- ARL0862, ARL6378, ARL7988, and NONU7588 -- to your hard drive.

To copy only selected files, use DOS commands to make a directory or change to the directory into which you want to copy the files. Then use the DOS Copy command to copy, say, ARL7988.ASC from your A drive to the target directory. For dBASE the file structure of the four data files is contained on disk 1 in ARL0862.DBF, ARL6378.DBF, ARL7988.DBF, and NONU7588.DBF. To open file ARL7988, for example, at the dot-prompt type ".use A:ARL7988.DBF". Then type ".append from A:ARL7988.ASC type SDF". These commands will transport the ASCII data in file ARL7988.ASC into a dBASE file structure, resulting in a dBASE-compatible file named ARL7988.DBF; and you can then also save the file as type SDF (ASCII), DIF, etc.

It is important to realize that the four data files represent a relatively large amount of data. Some programs can handle this amount of data without problems. PC SAS, for example, can create a SAS dataset from either the ASCII files or from a DBF (or DIF) format. To create a SAS dataset from ARL7988.ASC, in your program code use a filename statement such as "filename ARL 'A:ARL7988.ASC';"; an infile statement such as "infile ARL lrecl=276;"; and an input statement using the record format shown in Table 4. For DBF or DIF formats, use Proc DBF or Proc DIF.
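The same fixed-width read can be sketched outside SAS. The Python below is a minimal sketch and is not one of the programs supplied on the diskettes. The column positions in FIELDS are placeholders invented for illustration; the real positions must be taken from the record format in Table 4. Blank fields are returned as None so that they are not mistaken for zeros.

```python
from typing import Dict, Iterator, Optional, Tuple

RECORD_LENGTH = 276  # record width of the 1963-1988 files, per this guide

# HYPOTHETICAL column positions (start, end), zero-based, end exclusive.
# Replace these with the actual positions given in Table 4.
FIELDS: Dict[str, Tuple[int, int]] = {
    "YEAR": (0, 2),
    "INSTNO": (2, 6),
    "VOLS": (6, 16),
}


def parse_record(line: str) -> Dict[str, Optional[int]]:
    """Slice one fixed-width record into integer fields; blank means missing."""
    # Drop line endings and a trailing Ctrl-Z, which some DOS tools append.
    record = line.rstrip("\r\n\x1a").ljust(RECORD_LENGTH)
    parsed: Dict[str, Optional[int]] = {}
    for name, (start, end) in FIELDS.items():
        text = record[start:end].strip()
        parsed[name] = int(text) if text else None
    return parsed


def read_file(path: str) -> Iterator[Dict[str, Optional[int]]]:
    """Yield one parsed record per non-empty line of an ASCII data file."""
    with open(path, "r", encoding="ascii") as handle:
        for line in handle:
            if line.strip(" \r\n\x1a"):
                yield parse_record(line)


# A made-up record: year 88, library 8900, VOLS left blank (missing).
sample = "88" + "8900" + " " * 10
print(parse_record(sample))
```

Only numeric fields are shown here; character fields such as LAW or MED would need to be kept as text rather than passed to int().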

As noted above, the data files can be accommodated fairly easily in dBASE, once the file structure is in place. Note, however, that dBASE interprets missing values, represented as blank fields, as zeros. If you include missing values in the calculation of averages, for example, the results will be incorrect.
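The effect of reading blanks as zeros can be seen in a small example. The Python below is an illustrative sketch with made-up figures; the function names are hypothetical. It computes the same average twice, once excluding missing values and once treating them as zeros.

```python
from typing import Optional, Sequence


def mean_ignoring_missing(values: Sequence[Optional[float]]) -> Optional[float]:
    """Average only the values that are present; None if nothing is present."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    return sum(present) / len(present)


def mean_missing_as_zero(values: Sequence[Optional[float]]) -> float:
    """Average after turning missing values into zeros (the pitfall)."""
    filled = [0 if v is None else v for v in values]
    return sum(filled) / len(filled)


# Made-up figures: three libraries, one of which did not report.
reported = [100.0, None, 200.0]
print(mean_ignoring_missing(reported))  # 150.0
print(mean_missing_as_zero(reported))   # 100.0
```

The second figure is pulled down by the library that did not report, which is the error to guard against when averaging in a program that stores blanks as zeros.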

Lotus requires some special adaptations of the data. One can use the Data Parse command to translate an ASCII file into a 1-2-3 worksheet. But this method is rather cumbersome with a file with as many fields as ARL7988. A preferable method of accessing the data files in Lotus is first to convert the files to DBF or DIF format, and then use Lotus's Translate utility to convert the DBF or DIF files to 1-2-3 worksheets. dBASE, for example, will save a file in DBF or DIF format. A caution: A DBF or DIF version of ARL7988, say, is fairly large. Although the Translate utility can convert it to a 1-2-3 worksheet, you are unlikely to be able to load the whole worksheet on a machine with 640K RAM. Another caution: If you use dBASE to create a DIF file and then use Lotus Translate to create a worksheet, Translate replaces missing values (spaces in the ASCII files) with the values from the preceding fields in the record. DBF files created by dBASE pass the missing values through Translate to the worksheet.

Those who want to use Lotus for manipulating the 1908-1988 data would therefore be well-advised to select years and observations in order to make a manageable worksheet that avoids memory problems. To select certain cases, use a database management program such as dBASE. Suppose, for example, that you have ARL7988 in dBASE and want to examine in 1-2-3 the last five years of data of, say, North Carolina (library number 5800) and Virginia (8900). Then at the prompt type ".copy to C:\DBDATA\NCVA for (INSTNO=5800 .or. INSTNO=8900) .and. (YEAR>=84)". This statement will save the 1984-1988 data for North Carolina and Virginia in a file called NCVA in subdirectory DBDATA of the root directory (that is, if you have created a DBDATA subdirectory). Note that the 1908-1988 data files offer help in selecting similar kinds of libraries: by YEAR; by TYPE (public, private, Canadian); by REGION (all South Atlantic libraries, e.g.); by MEMBYR (see Section III on uses of MEMBYR). One can also select by values of variables such as VOLS. A dBASE statement such as ".copy to C:\DBDATA\PEERS for (VOLS>=2000000 .and. VOLS<=3000000) .and. (YEAR=88)" creates a file PEERS comprising the 1988 data for only those libraries with 2-3 million volumes.
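For readers working outside dBASE, the two selections above can be sketched in Python. This is an illustration only: it assumes each record has already been read into a dictionary keyed by variable name, with two-digit years and None for missing values. The function names, the sample records, and their VOLS figures are made up.

```python
from typing import Dict, Iterable, List, Optional, Set

Record = Dict[str, Optional[int]]


def select_libraries(records: Iterable[Record], instnos: Set[int], first_year: int) -> List[Record]:
    """Keep records for the given libraries from first_year onward."""
    return [
        r for r in records
        if r["INSTNO"] in instnos and r["YEAR"] is not None and r["YEAR"] >= first_year
    ]


def select_by_volumes(records: Iterable[Record], year: int, low: int, high: int) -> List[Record]:
    """Keep one year's records whose VOLS lies between low and high inclusive."""
    return [
        r for r in records
        if r["YEAR"] == year and r["VOLS"] is not None and low <= r["VOLS"] <= high
    ]


# Made-up records for illustration; the VOLS figures are not real data.
records = [
    {"INSTNO": 5800, "YEAR": 83, "VOLS": 2_900_000},
    {"INSTNO": 5800, "YEAR": 88, "VOLS": 3_400_000},
    {"INSTNO": 8900, "YEAR": 88, "VOLS": 2_800_000},
    {"INSTNO": 1000, "YEAR": 88, "VOLS": None},
]
print(select_libraries(records, {5800, 8900}, 84))
print(select_by_volumes(records, 88, 2_000_000, 3_000_000))
```

Records with a missing VOLS are excluded from the volume-range selection rather than being compared as if they were zero.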

As one check on whether your program has read the data files correctly, compute averages (means) of selected variables and compare them with the following values for beginning and ending years of each of the four data files:

Year        Variable   No. of Cases   No. Missing   Average

08          VOLS       12             0             107,425.42
62          STOTEXP    42             0             1,303,596.71
63          VOLS       63             0             1,406,632.94
78          PHDFLD     94             0             50.09
79          VOLS       98             0             2,227,202.67
88          FAC        107            0             1,309.23
75 (NONU)   VOLS       11             0             3,291,317.18
88 (NONU)   TOTEXP     12             2             39,055,281.80

2. Apple Macintosh Version

The data files on the Apple Macintosh disk are stored in a StuffIt archive file. The freeware program UnStuffIt is also included on the disk. Double clicking on the UnStuffIt icon will run that program. Open Archive (under File) will open the file ARL text.sit which contains the four data files. Select the data file you want from the list and click Extract. A standard file dialog will appear so that you can select the destination for the file. Save will put the file where you want. These steps can be repeated until you have the data desired.

The data files are tab-delimited and have been tested on Microsoft Excel and File. Both programs accept the data cleanly. There should be no problem with using other spreadsheets, database managers, or word processors so long as they have the capacity to handle files of this size and with tab delimiters.
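As an illustration of what handling the tab delimiters involves, the following Python sketch reads tab-delimited rows and converts the period used for missing values in the Apple version (see the Appendix) to None. It assumes the file has no header row, so the caller supplies the field names; the function name, the two-field layout, and the sample values are hypothetical.

```python
import csv
import io
from typing import Dict, Iterable, List, Optional, Sequence


def read_tab_delimited(handle: Iterable[str], field_names: Sequence[str]) -> List[Dict[str, Optional[str]]]:
    """Read tab-delimited rows, mapping '.' or an empty field to None."""
    rows = []
    for raw in csv.reader(handle, delimiter="\t"):
        if not raw:
            continue  # skip empty lines
        cleaned = [None if cell.strip() in ("", ".") else cell.strip() for cell in raw]
        rows.append(dict(zip(field_names, cleaned)))
    return rows


# Made-up two-field sample; the real field order is given by the record format.
sample = io.StringIO("8900\t.\n5800\t2500000\n")
print(read_tab_delimited(sample, ["INSTNO", "VOLS"]))
```

When reading a real file, open it with newline="" so that the csv module can deal with the line endings itself.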

Also included on the Macintosh disk are the forms for Microsoft File. The user will probably want to customize them, changing field formats, appearance, etc. It is necessary to duplicate these files and then rename them before loading any data into them. This not only preserves them as "templates," but also avoids the problems engendered by File's lack of a "save as ... " command. The File forms are also included in the StuffIt archive.

3. ACRL Statistics

The Association of College and Research Libraries has collected and published data for the U.S. and Canadian university libraries that are not members of ARL. The data are for the years 1978-79, 1981-82, 1983-84, 1985-86, and 1987-88, and have recently been issued in machine-readable form.8 Because ACRL has followed the ARL statistics questionnaire for most years of its data collection, the ACRL data are compatible with the ARL data represented here.

There are a few minor differences between the ACRL and ARL machine-readable files: (1) Illinois, Chicago is INSTNO 3490 in the ARL data and 3510 in the ACRL data. (2) TOTSTF in ARL = PRFSTF + NPRFSTF. In ACRL TOTSTF = PRFSTF + NPRFSTF + STUDAST (which is the formula for ARL's TOTSTFX, a variable that does not appear in ACRL). (3) ACRL uses three differing record formats (one for 1979-1984, one for 1986, and one for 1988). ARL uses a different format for 1963-1988.

Otherwise, the ARL and ACRL data are fully compatible. ACRL institutions are numbered according to a scheme consistent with ARL's. If you append, say, the ACRL 1988 data to the ARL 1988 data, and sort by INSTNO, the resulting list of libraries will be in alphabetical order. Except for TOTSTF, all variables common to ARL and ACRL have the same meanings. (Note, however, that ACRL includes a large number of variables not in ARL.) This consistency means that ARL and ACRL files together offer coverage of the last ten years of statistics of U.S. and Canadian university libraries.
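The adjustments needed before appending ACRL records to ARL records can be sketched as follows. This Python is a minimal illustration under stated assumptions: records are dictionaries keyed by variable name with None for missing values, and renumbering Illinois, Chicago to its ARL number is one possible choice rather than a step this guide prescribes. The function names and sample figures are hypothetical.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[int]]

# Illinois, Chicago: 3510 in the ACRL data, 3490 in the ARL data.
ACRL_TO_ARL_INSTNO = {3510: 3490}


def harmonize_acrl(record: Record) -> Record:
    """Return a copy of an ACRL record using ARL's INSTNO and TOTSTF conventions."""
    out = dict(record)
    out["INSTNO"] = ACRL_TO_ARL_INSTNO.get(record["INSTNO"], record["INSTNO"])
    # ACRL's TOTSTF includes student assistants, so it matches ARL's TOTSTFX.
    out["TOTSTFX"] = record.get("TOTSTF")
    prf, nprf = record.get("PRFSTF"), record.get("NPRFSTF")
    # ARL's TOTSTF = PRFSTF + NPRFSTF; missing if either component is missing.
    out["TOTSTF"] = None if prf is None or nprf is None else prf + nprf
    return out


def combine(arl: List[Record], acrl: List[Record]) -> List[Record]:
    """Append harmonized ACRL records to ARL records and sort by INSTNO."""
    merged = list(arl) + [harmonize_acrl(r) for r in acrl]
    return sorted(merged, key=lambda r: r["INSTNO"])


# Made-up staff figures for illustration.
acrl_sample = {"INSTNO": 3510, "PRFSTF": 10, "NPRFSTF": 20, "STUDAST": 5, "TOTSTF": 35}
print(harmonize_acrl(acrl_sample))
```

Because the three ACRL record formats differ, each would need its own reading step before records reach this stage.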


VI. Appendix: Emendations to the Printed Data

The machine-readable data for 1908-1988 are intended to reproduce the printed data faithfully, except for the inclusion of errata reported to ARL and the representation of missing data values by blank fields (or by periods in the Apple version). These kinds of silent emendations are discussed in section IV above. All other emendations to the printed data are listed below. In this list a blank in the "Printed Version" or "Machine-Readable Version" columns indicates a missing value.

1. Emendations to LAW, MED

Years     Library                     Variable  Printed Version  Machine-Readable Version

1976-79   LIBRARY OF CONGRESS         LAW                        Y
1976-85   NATL. LIBRARY OF MEDICINE   MED                        Y
1977      HARVARD                     LAW                        Y
1977      HARVARD                     MED                        Y
1978      BROWN                       MED       Y                N
1978-79   NEW YORK STATE LIBRARY      LAW                        Y
1978-79   NEW YORK STATE LIBRARY      MED                        Y
1979      BROWN                       MED       Y                N
1980-81   NATL. AGRICULTURAL LIB.     LAW       Y                N
1986      HARVARD                     LAW                        Y
1986      HARVARD                     MED                        Y
1986      LIBRARY OF CONGRESS         LAW                        Y

2. Emendations to VOLSADN

Years     Library                     Printed Version  Machine-Readable Version

1968      TULANE                      -1,527
1973      SYRACUSE                    -101,166
1975      NATL. LIBRARY OF MEDICINE   -14,042
1976      COLUMBIA                    -38,569
1977      NEW YORK                    -26,462
1985      CHICAGO                     -133,421

3. Emendations to TOTSTFX

Years     Library                     Printed Version  Machine-Readable Version

1975      LIBRARY OF CONGRESS         1,649
1978      LIBRARY OF CONGRESS         5,230
1978      NATL. AGRICULTURAL LIB.     165
1978      NATL. LIBRARY OF CANADA     469
1978      SMITHSONIAN INSTITUTION     104
1979      NATL. AGRICULTURAL LIB.     200
1980      LIBRARY OF CONGRESS         5,155            5,095
1984      LIBRARY OF CONGRESS         5,136
1984      NATL. LIBRARY OF CANADA     541
1984      NEWBERRY LIBRARY            113
1984      NEW YORK STATE LIBRARY      200
1984      SMITHSONIAN INSTITUTION     107
1985      LIBRARY OF CONGRESS         5,099
1985      NEWBERRY LIBRARY            122
1986      NATL. LIBRARY OF CANADA     549
1987      NATL. LIBRARY OF CANADA     530
1987      NEWBERRY LIBRARY            104
1988      NATL. LIBRARY OF CANADA     521

4. Emendations to TOTEXP

Years     Library                     Printed Version  Machine-Readable Version

1963      ILLINOIS, URBANA            2,748,662
1963      NEW YORK                    1,052,309
1963      NORTHWESTERN                1,202,816
1963      YALE                        2,126,067
1964      CORNELL                     3,339,734
1964      NORTHWESTERN                1,390,199
1964      RUTGERS                     1,209,131
1964      YALE                        2,555,731
1965      CORNELL                     3,287,058
1965      NORTHWESTERN                1,438,691
1967      JOHNS HOPKINS               1,282,984
1971      INDIANA                     5,589,260
1972      INDIANA                     5,870,200
1975      BOSTON PUBLIC LIBRARY       10,813,438
1975      CENTER FOR RESEARCH LIBS.   958,284
1975      HOWARD                      2,596,678
1976      BOSTON PUBLIC LIBRARY       10,956,499
1976      CENTER FOR RESEARCH LIBS.   1,576,332
1977      BOSTON PUBLIC LIBRARY       10,654,532
1977      CENTER FOR RESEARCH LIBS.   1,249,898
1978      BOSTON PUBLIC LIBRARY       10,216,357
1978      CENTER FOR RESEARCH LIBS.   1,447,474
1978      NATL. LIBRARY OF MEDICINE   8,102,000        7,731,000
1979      BOSTON PUBLIC LIBRARY       10,290,889
1979      NEWBERRY LIBRARY            2,548,534        2,548,533
1980      BOSTON PUBLIC LIBRARY       11,721,583
1980      SMITHSONIAN INSTITUTION     2,981,328        2,980,828
1981      BOSTON PUBLIC LIBRARY       11,954,331
1982      BOSTON PUBLIC LIBRARY       11,507,139
1983      BOSTON PUBLIC LIBRARY       11,247,441
1983      NEW YORK PUBLIC LIBRARY     16,833,557
1984      BOSTON PUBLIC LIBRARY       12,449,904
1984      NEWBERRY LIBRARY            3,571,615
1984      NEW YORK PUBLIC LIBRARY     18,116,580       18,584,580
1985      CANADA INST. FOR SCI/TECH.  15,519,121       15,492,121
1986      BOSTON PUBLIC LIBRARY       8,454,137
1986      LIBRARY OF CONGRESS         247,797,724      241,456,120
1987      BOSTON PUBLIC LIBRARY       24,191,531
1988      CANADA INST. FOR SCI/TECH.  19,164,198
1988      NEWBERRY LIBRARY            5,192,725
1988      NEW YORK PUBLIC LIBRARY     31,565,800       31,334,877

5. Emendations to Other Variables

Years     Library                     Variable  Printed Version  Machine-Readable Version

1979      NATL. AGRICULTURAL LIB.     STUDAST   0
1980      LIBRARY OF CONGRESS         STUDAST   60               0
1981      BOSTON PUBLIC LIBRARY       EXPBND    0
1982      BOSTON PUBLIC LIBRARY       EXPBND    0
1986      BOSTON PUBLIC LIBRARY       EXPBND    0
1986      CORNELL                     FAC       N/A
1986      LINDA HALL LIBRARY          SALSTUD   U/A              0
1986      NATL. AGRICULTURAL LIB.     EXPSER    0
1986      NATL. AGRICULTURAL LIB.     EXPOTH    0
1986      NATL. LIBRARY OF MEDICINE   SALSTUD   N/A
1987      BOSTON PUBLIC LIBRARY       EXPBND    0
1987      GEORGIA                     MONO      N/A
1988      NEWBERRY LIBRARY            EXPBND    N/A


VII. Notes

1. Robert E. Molyneux, The Gerould Statistics, 1907/08-1961/62 (Washington, D.C.: Association of Research Libraries, 1986). A version of the Gerould statistics with revisions was issued by Princeton as College and University Library Statistics for 1920-1944 (available from University Microfilms as Books on Demand AG1-0P30634); and the Gerould Statistics for 1945-1962 were issued by University Microfilms as Special Film No. S-338, with the title Princeton University Library Statistics.

2. The successive titles of the series are:

1962-1963: Association of research libraries statistics
1964-1974: Academic library statistics
1975- : ARL statistics

3. Kendon Stubbs and David Buxton, Cumulated ARL University Library Statistics, 1962-63 through 1978-79 (Washington, D.C.: Association of Research Libraries, 1981).

4. See Stubbs and Buxton, pp. xiv-xvii.

5. Robert Molyneux, ACRL Academic Library Statistics, 1978/79-1987/88 (Chicago: Association of College and Research Libraries, 1989).

6. Quoted in W. W. Greg, "The Rationale of Copy-Text," Studies in Bibliography, III (1950-51), 20.

7. Quoted in Kendon Stubbs, "Lies, Damned Lies, ... and ARL Statistics?" Minutes of the 108th Meeting, Association of Research Libraries (1986), 83.

8. Molyneux, ACRL Academic Library Statistics.


NOTE TO USERS OF LOTUS, QUATTRO, EXCEL, AND OTHER SPREADSHEETS

The data files on the enclosed DOS diskettes cannot be directly imported into spreadsheets such as Lotus or Quattro. These are large ASCII (or "flat") files; and for the 1963-1988 data each record is 276 spaces wide. Lotus and Quattro have Import functions for ASCII files, but Lotus does not allow a width greater than 240, and Quattro Pro has a limit of 254. Because of the size of the files, moreover, PCs without extended or expanded memory may not be able to accommodate a complete file such as ARL7988 in Lotus or Quattro format. (A Lotus .WK1 version of ARL7988, for example, is about 540,000 bytes in size.)

In order to work with these files effectively in spreadsheets, users should first import them into a database management program such as dBASE, Clipper, FoxBASE, or Paradox; and from these programs output selected years and libraries in Lotus, Quattro, or other spreadsheet formats. See Section V.1 (DOS Version) for further information on using database management programs to create spreadsheet files. The ASCII files can also be imported into and manipulated with a wide range of statistical programs, such as SAS, SPSS/PC, SYSTAT, and others. Users with sufficiently powerful hardware/software configurations to accommodate large spreadsheet files may want to employ a file transfer utility such as DBMS/COPY.
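Where no database management program is at hand, a selected subset can also be written out as comma-separated text, which is far narrower than the 276-character records. The Python below is a sketch only: it assumes the records have already been read into dictionaries with None for missing values and that your spreadsheet can import comma-separated files. The function name and the sample record are hypothetical.

```python
import csv
import io
from typing import Dict, Iterable, Optional, Sequence, TextIO


def write_subset(records: Iterable[Dict[str, Optional[int]]], field_names: Sequence[str], handle: TextIO) -> None:
    """Write the chosen fields as comma-separated text, leaving missing values empty."""
    writer = csv.DictWriter(handle, fieldnames=list(field_names))
    writer.writeheader()
    for record in records:
        writer.writerow(
            {name: "" if record.get(name) is None else record[name] for name in field_names}
        )


# Made-up record: VOLS is missing and is written as an empty cell, not as 0.
buffer = io.StringIO()
write_subset([{"YEAR": 88, "INSTNO": 8900, "VOLS": None}], ["YEAR", "INSTNO", "VOLS"], buffer)
print(buffer.getvalue())
```

Writing missing values as empty cells keeps them distinct from genuine zeros when the file is opened in a spreadsheet.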

The files on the Apple Macintosh diskettes can be imported directly into Excel. Because of the size of the files, depending on the type of Macintosh a user has, importing the files may require half an hour or more. Unless the resulting spreadsheet is then resaved in "normal" (as opposed to "text") format, it will take the same amount of time every time the file is called up.
