aiSelections: Computational Techniques for Matching Faculty Research Profiles to Library Acquisitions
Peter M. Broadwell – CLIR Postdoctoral Fellow, UCLA Library
Timothy R. Tangherlini – Professor, Scandinavian Section and Department of Asian Languages, UCLA
Faculty research interests
Publication records are also very useful, if parsed well
Advanced faculty profiles: The Opus project at UCLA (envisioned)
Faculty CV analysis in RapidMiner
TF-IDF vectors of faculty interests
study,0.833333333
history,0.48984375
work,0.259709821
center,0.142113095
culture,0.218452381
artwork,0.322321429
southeast,0.012946429
architecture,0.12797619
Asian,0.066145833
bronze,0.0375
research,0.196875
translation,0.055133929
south,0.071428571
design,0.071130952
image,0.078683036
museum,0.145982143
place,0.071316964
ancient,0.03813244
Chinese,0.036830357
national,0.094791667
light,0.028869048
landscape,0.033854167
project,0.065848214
field,0.058928571
state,0.052380952
material,0.050892857
review,0.069568452
east,0.060639881
Indian,0.023214286
make,0.080729167
life,0.034375
discipline,0.113095238
change,0.050558036
Asia,0.02641369
society,0.047209821
form,0.04077381
cultural,0.070833333
salt,0.010044643
institute,0.07421875
house,0.009821429
china,0.023809524
survey,0.085044643
period,0.01875
report,0.083035714
space,0.032291667
serve,0.040848214
learn,0.081026786
world,0.044270833
gallery,0.025446429
analyze,0.084077381
build,0.016666667
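The slides credit RapidMiner for the CV analysis; the sketch below is not that workflow. It is a minimal pure-Python illustration of how term weights like those listed above could be computed, assuming whitespace tokenization and the common tf × log(N/df) weighting. Both are assumptions, since the slides do not state the exact weighting or preprocessing used. The function name tfidf_vectors and the sample texts are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document.

    Assumed weighting: (term count / document length) * log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical stand-ins for text extracted from faculty CVs.
sample_profiles = tfidf_vectors([
    "history of chinese bronze artwork",
    "history of french painting",
])
```

A term that occurs in every document (here "history" and "of") receives a weight of zero under this weighting, which is why distinctive terms dominate each profile.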
Monograph records from WorldCat
$worldCatQuery = "(srw.lc+all+N1*+or+srw.lc+all+N2*+or+srw.lc+all+N3*+or+srw.lc+all+N4*+or+srw.lc+all+N5*+or+srw.lc+all+N6*+or+srw.lc+all+N7*+or+srw.lc+all+N8*+or+srw.lc+all+N9*+or+srw.lc+all+NB*+or+srw.lc+all+NC*+or+srw.lc+all+ND*+or+srw.lc+all+NE*+or+srw.lc+all+NK*+or+srw.lc+all+NX*+or+srw.lc+all+TR*)+and+srw.yr>2005+and+srw.yr<2014+and+srw.mt+any+bks+and+(srw.la+all+eng+or+srw.la+all+fre+or+srw.la+all+ger+or+srw.la+all+ita+or+srw.la+all+spa+or+srw.la+all+dut+or+srw.la+all+por)+not+srw.mt+any+juvenile+not+srw.mt+all+ebk+not+srw.mt+all+elc";
Process all ~160,000 results…
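The query string above is long enough that assembling it from its parts may be easier to maintain. The following is a sketch in Python (the original appears to be a PHP-style assignment) that only builds the string. It does not contact WorldCat, and the helper name build_query and its parameters are hypothetical. The index names (srw.lc, srw.yr, srw.mt, srw.la) and all values are copied from the query as shown.

```python
def build_query(lc_prefixes, languages, year_after, year_before):
    """Assemble the '+'-joined query string from its component lists."""
    lc_clause = "+or+".join(f"srw.lc+all+{prefix}*" for prefix in lc_prefixes)
    la_clause = "+or+".join(f"srw.la+all+{code}" for code in languages)
    return (
        f"({lc_clause})"
        f"+and+srw.yr>{year_after}+and+srw.yr<{year_before}"
        "+and+srw.mt+any+bks"
        f"+and+({la_clause})"
        "+not+srw.mt+any+juvenile+not+srw.mt+all+ebk+not+srw.mt+all+elc"
    )


# Call-number prefixes and language codes exactly as they appear in the query.
lc_prefixes = [f"N{digit}" for digit in range(1, 10)] + [
    "NB", "NC", "ND", "NE", "NK", "NX", "TR",
]
languages = ["eng", "fre", "ger", "ita", "spa", "dut", "por"]

world_cat_query = build_query(lc_prefixes, languages, 2005, 2014)
```

Changing the subject area or date range then means editing a list or a number rather than a single long string.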
Cosine similarity between faculty profiles and book records
[Figure: diagram labeled French Impressionism, Early Qing painting, UCLA faculty, Book A, Book B]
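As a rough illustration of the comparison named in this slide, the sketch below computes cosine similarity between sparse term-weight vectors stored as Python dicts. The two-term vectors and their weights are invented for the example and are not taken from the presentation; the function name cosine_similarity is likewise an assumption.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An empty vector matches nothing.
    return dot / (norm_a * norm_b)


# Hypothetical weights on the two topics named in the slide.
faculty = {"early qing painting": 0.9, "french impressionism": 0.2}
book_a = {"french impressionism": 1.0}
book_b = {"early qing painting": 0.8, "french impressionism": 0.1}

similarity_a = cosine_similarity(faculty, book_a)
similarity_b = cosine_similarity(faculty, book_b)
```

With these made-up weights, book_b points in nearly the same direction as the faculty vector and so scores higher than book_a. Because cosine similarity depends only on direction, a long record and a short one on the same topic score alike.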
Evaluation data sets
Actual selections, Jan 2007 – Feb 2013
◦ 10,471 books in targeted subject areas, published after 2005 (subset of WorldCat data set, described below)
◦ 3,573 firm orders, 6,989 approval plan orders
Circulation records, Jan 2008 – Feb 2013
◦ 4,118 new, unique titles acquired after Jan 2007 circulated between Jan 2008 and Feb 2013
◦ This is 39.3% of acquisitions since 2007
◦ Firm orders were 10% more likely to circulate
◦ 606 books published since 2006 were borrowed via interlibrary loan, many at no cost (intra-UC)
All potential selections, published 2006-2012
◦ 130,042 unique titles (duplicates resolved) published from Jan 2006, as returned by WorldCat query
Circulation of actual selections vs. simulated algorithmic selections
Faculty profile matching: Applications and considerations
Append a “faculty match” score to vendor approval list entries
◦ Helps to target selections for the short and medium term
◦ Not as useful for long-term, large-scale collection development
Refine subscriptions to online periodicals and other resources
◦ Requires that online subscriptions can be done a la carte, rather than via bulk packages
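One way the “faculty match” score might be attached to approval list entries is sketched below. This is an assumption-laden illustration, not the presenters' implementation: it takes each entry's score to be its highest cosine similarity to any single faculty profile, and the field names (title, terms, faculty_match, best_match), the function score_approval_list, and all sample data are hypothetical.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_approval_list(entries, faculty_profiles):
    """Add a faculty_match score to each entry and sort by it, best first."""
    scored = []
    for entry in entries:
        best_name, best_score = None, 0.0
        for name, profile in faculty_profiles.items():
            score = cosine_similarity(entry["terms"], profile)
            if score > best_score:
                best_name, best_score = name, score
        scored.append({**entry, "faculty_match": best_score, "best_match": best_name})
    return sorted(scored, key=lambda e: e["faculty_match"], reverse=True)


# Hypothetical profiles and vendor list entries.
faculty_profiles = {
    "prof_a": {"qing": 0.8, "painting": 0.6},
    "prof_b": {"impressionism": 0.9, "french": 0.4},
}
approval_list = [
    {"title": "Book A", "terms": {"impressionism": 1.0}},
    {"title": "Book B", "terms": {"qing": 1.0, "painting": 1.0}},
    {"title": "Book C", "terms": {"bronze": 1.0}},
]

ranked = score_approval_list(approval_list, faculty_profiles)
```

Taking the maximum over profiles favors titles that strongly match one researcher. Summing or averaging across profiles would instead favor titles of broad departmental interest, so the choice depends on the selection policy.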
Faculty profile matching: Future directions
Enhance faculty profiles
◦ Promising, due to growth in publication bibliometrics, faculty network analysis tools like Vivo and Profiles
Enhance resource profiles by obtaining more data
◦ For pre-publication monographs: unlikely
◦ Might be possible with online publications
Incorporate graduate student, undergraduate research interests
Combine circulation-based selection recommendations with faculty interest data