Post on 17-Dec-2015

A Peek Inside the Carolina Digital Repository

Michael Daines
Digital Repository Analyst

UNC – Chapel Hill

Goals

What’s in the repository?

• 41158 images
• 18671 texts (PDF, Microsoft Word, text files)
• 11856 audio files
• 1438 datasets
• 54 video files

(As of July 17, 2013)

What’s in the repository?

• Research Laboratories of Archaeology: 35502 images (photographs and scans)

• Electronic Theses and Dissertations: 4035 PDFs

• BioMed Central: 1777 PDFs (articles)

(As of July 17, 2013)

How to show what we have?

“Peek”

https://github.com/UNC-Libraries/peek

How do we find interesting images?

Cover pages?

Random pages?
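
The two candidate strategies above (cover pages and random pages) can be combined in a few lines. The following is only an illustrative sketch, not the code used in Peek: the function name `candidate_pages` and its parameters are hypothetical, and it assumes pages are addressed by zero-based index with the cover at index 0.

```python
import random


def candidate_pages(page_count, sample_size, seed=None):
    """Return the cover page (index 0) plus a random sample of other pages.

    Hypothetical helper: assumes zero-based page indexes and that the
    first page of a document is its cover.
    """
    if page_count <= 0:
        return []
    rng = random.Random(seed)  # seedable so a run can be repeated
    others = range(1, page_count)
    k = min(sample_size, len(others))  # never ask for more pages than exist
    return [0] + sorted(rng.sample(others, k))


if __name__ == "__main__":
    # A 120-page document: the cover plus five randomly chosen pages.
    print(candidate_pages(120, 5, seed=1))
```

Passing a seed keeps the selection reproducible, which may help when the same documents are processed more than once.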

How do we find interesting images?

Query → Download → Split → Resize → Choose

Query, Download

Solr query
Download public datastreams
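
As a rough illustration of the query step, the sketch below builds request URLs for Solr's standard `select` handler using its common `q`, `fl`, `rows`, `start`, and `wt` parameters. The base URL, the query string, and the field names are assumptions made for the example. They are not taken from the repository's actual Solr schema or from Peek's code.

```python
from urllib.parse import urlencode


def build_solr_select_url(base_url, query, fields, rows=100, start=0):
    """Build a URL for Solr's select handler that returns JSON.

    The caller supplies the core URL, query, and field list; the values
    used in the demo below are hypothetical.
    """
    params = {
        "q": query,               # main query string
        "fl": ",".join(fields),   # fields to return
        "rows": rows,             # page size
        "start": start,           # offset of the first result
        "wt": "json",             # response format
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def page_starts(total, rows):
    """Offsets needed to page through `total` results `rows` at a time."""
    return list(range(0, total, rows))


if __name__ == "__main__":
    # Hypothetical core URL and field names, for illustration only.
    base = "http://localhost:8983/solr/repository"
    for start in page_starts(2000, 500):
        print(build_solr_select_url(base, "contentType:image",
                                    ["id", "title"], rows=500, start=start))
```

Each resulting URL could then be fetched with any HTTP client, and the identifiers in the response used to download the public datastreams.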

Split, Resize

CoreGraphics
ImageMagick
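
To make the split and resize step concrete, here is a minimal sketch that prepares an ImageMagick command for one page of a document. It relies on ImageMagick's `file[index]` page selector and on the `>` geometry flag, which shrinks an image only if it is larger than the target box. The file names, the 400-pixel limit, and both function names are assumptions for illustration; the slides do not say how Peek invokes ImageMagick or divides the work between it and CoreGraphics.

```python
def fit_within(width, height, max_side):
    """Scale (width, height) to fit a max_side square, never enlarging."""
    scale = min(max_side / width, max_side / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def resize_command(source, page, max_side, target):
    """Build an argument list for ImageMagick's `convert` tool.

    `source[page]` selects a single page or frame, and the trailing `>`
    in the geometry means "only shrink larger images". On ImageMagick 7
    the executable is usually `magick` rather than `convert`.
    """
    geometry = f"{max_side}x{max_side}>"
    return ["convert", f"{source}[{page}]", "-resize", geometry, target]


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run to execute it.
    print(resize_command("thesis.pdf", 0, 400, "thesis-page-0.png"))
    print(fit_within(4000, 3000, 400))
```

`fit_within` mirrors the arithmetic of the `>` flag, so the output size can be estimated without running ImageMagick.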

Choose

Initial set

2000 objects
35855 images split

425 images for homepage

Further work

• Larger sample?
• Automation?
• Integration with repository?
• Collaborative filtering?
• Image classification?
• No processing step?
• A/V objects?
• Bias?

Try it!

https://cdr.lib.unc.edu/

https://github.com/UNC-Libraries/peek

https://github.com/UNC-Libraries/peek-data
