Batch metadata assignment to archival photograph collections using facial recognition software
Kyle Banerjee [email protected]
Why should anyone care?
Current methods for assigning metadata are:
•Slow
•Difficult
•Error Prone
•Incomplete
Filing code stencil cards at the W. Atlee Burpee Company (Library of Congress Prints and Photographs Division)
A few challenges
• Libraries and archives use external systems to maintain metadata
• Archival images are huge and clunky to work with
• Metadata standards for image files are implemented inconsistently and weren’t designed with library needs in mind
Automation
• Process in bulk
• Use metadata embedded within the image
• Use the file system
• Use consumer grade software as a force multiplier
• Improve search engine visibility and simplify migrations
Fran Bilas Spence and Jean Jennings Bartik work on ENIAC (ARL Technical Library)
What you need to get started
• A computer with the operating system of your choice
• Modest scripting ability in any language (mad programming skilz not required)
Image metadata demystified
$ head lovejoy-moskovetz_1923.tif
II▒▒▒@d▒▒F▒(1▒2▒▒ ▒▒]BI▒▒ ▒Ci▒Black and white photograph of Esther Pohl Lovejoy and Doctors Elliot and Moskovetz in Athens in 1923.▒▒['▒▒['Adobe Photoshop CS2 Windows2012:04:10 14:16:16<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
[a few lines deleted here]
<rdf:Description rdf:about=""
xmlns:tiff="http://ns.adobe.com/tiff/1.0/">
<tiff:ImageWidth>6046</tiff:ImageWidth>
<tiff:ImageLength>4880</tiff:ImageLength>
[a few more lines deleted]
<dc:subject>
<rdf:Bag>
<rdf:li>Lovejoy</rdf:li>
<rdf:li>Moskovetz</rdf:li>
</rdf:Bag>
</dc:subject>
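The XMP block in that dump is plain XML sitting inside the binary file, so a script can pull it out directly. A minimal Python sketch, assuming the packet is wrapped in the usual x:xmpmeta element and the file is small enough to read into memory (the function names and the sample bytes are made up for illustration):

```python
import re
import xml.etree.ElementTree as ET

# Namespace URIs for RDF and Dublin Core as used in XMP packets.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_xmp_packet(data: bytes):
    """Return the first embedded XMP packet as text, or None if none is found.

    Assumption: the packet is wrapped in <x:xmpmeta> ... </x:xmpmeta>.
    """
    match = re.search(rb"<x:xmpmeta.*?</x:xmpmeta>", data, re.DOTALL)
    if match is None:
        return None
    return match.group(0).decode("utf-8", errors="replace")


def dc_subjects(xmp_text: str):
    """List the dc:subject entries stored in an XMP packet."""
    root = ET.fromstring(xmp_text)
    return [li.text for li in root.findall(".//dc:subject/rdf:Bag/rdf:li", NS)]


# Hypothetical stand-in for the bytes of an image file with embedded XMP.
SAMPLE_BYTES = (
    b"II*\x00 binary image data "
    b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>'
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    b'<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">'
    b"<dc:subject><rdf:Bag>"
    b"<rdf:li>Lovejoy</rdf:li><rdf:li>Moskovetz</rdf:li>"
    b"</rdf:Bag></dc:subject>"
    b"</rdf:Description></rdf:RDF></x:xmpmeta>"
    b'<?xpacket end="w"?> more binary data'
)

if __name__ == "__main__":
    print(dc_subjects(extract_xmp_packet(SAMPLE_BYTES)))
```

For a real file, read it with open(path, "rb").read() and pass the bytes to extract_xmp_packet. Archival TIFFs are large, so a dedicated tool is usually the better choice for production work.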
Facial recognition
• People are an important access point
• Provides authority control by nature
• Identification of individuals helps determine other details
Facial recognition primer
WPI Transformations
• Extraction of faces simplifies manual identification
• Non-specialist staff can do more metadata work
Useful software
• Free Picasa software works great
• Stores person info in a combination of contacts.xml and .picasa.ini files
Since I know you’re wondering, it’s no good for…
.picasa.ini
[lovejoy-esther_portrait_nd.jpg]
faces=rect64(135a175de074cd8b),c0ef2256901bfbb6
backuphash=23375
[matarrazo-joseph_2001.jpg]
faces=rect64(3407026fe607ac00),c2c65f903b3150cb
backuphash=33
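A file like this can be read with an ordinary INI parser. The Python sketch below assumes the layout shown above: one section per image, and a faces value made of rect64(...),contact-id pairs separated by semicolons when there are several faces. The reading of rect64 as four 16-bit values (left, top, right, bottom, each a fraction of the image size) is a common community interpretation, not documented behavior, and the function names are made up for illustration.

```python
import configparser


def decode_rect64(hex_digits: str):
    """Split up to 16 hex digits into four 16-bit values scaled to 0..1.

    Assumed order: left, top, right, bottom (relative to image size).
    """
    padded = hex_digits.zfill(16)  # leading zeros are sometimes dropped
    return tuple(int(padded[i:i + 4], 16) / 65535 for i in range(0, 16, 4))


def parse_picasa_ini(text: str):
    """Map each image file name to a list of (contact_id, rectangle) pairs."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    faces = {}
    for filename in parser.sections():
        found = []
        entries = parser.get(filename, "faces", fallback="")
        for entry in filter(None, entries.split(";")):
            rect, _, contact_id = entry.partition(",")
            if rect.startswith("rect64(") and rect.endswith(")"):
                hex_digits = rect[len("rect64("):-1]
                found.append((contact_id, decode_rect64(hex_digits)))
        if found:
            faces[filename] = found
    return faces


# The two entries shown on the slide.
SAMPLE_INI = """\
[lovejoy-esther_portrait_nd.jpg]
faces=rect64(135a175de074cd8b),c0ef2256901bfbb6
backuphash=23375
[matarrazo-joseph_2001.jpg]
faces=rect64(3407026fe607ac00),c2c65f903b3150cb
backuphash=33
"""

if __name__ == "__main__":
    for name, found in parse_picasa_ini(SAMPLE_INI).items():
        print(name, found)
```

The contact IDs are the useful part for metadata work. The rectangles only matter if you want to crop out the faces.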
contacts.xml
<contact id="c0ef2256901bfbb6" name="Esther Pohl Lovejoy" modified_time="2012-11-26T09:48:04-08:00" local_contact="1"/>
<contact id="c2c65f903b3150cb" name="Joseph Matarazzo" modified_time="2012-11-30T15:02:10-08:00" local_contact="1"/>
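The IDs in .picasa.ini only become names once they are matched against contacts.xml. A minimal Python sketch of that lookup, assuming the contact elements sit inside a single root element (called contacts here, which is an assumption) and that you already have a mapping from file name to contact IDs; the function names are hypothetical:

```python
import xml.etree.ElementTree as ET


def load_contacts(xml_text: str):
    """Map contact id to name for every <contact> element in the document."""
    root = ET.fromstring(xml_text)
    return {c.get("id"): c.get("name") for c in root.iter("contact")}


def names_by_file(face_ids_by_file, contacts):
    """Replace contact ids with names, skipping ids that have no contact."""
    return {
        filename: [contacts[i] for i in ids if i in contacts]
        for filename, ids in face_ids_by_file.items()
    }


# The two contacts shown on the slide, wrapped in an assumed root element.
SAMPLE_CONTACTS = """<contacts>
 <contact id="c0ef2256901bfbb6" name="Esther Pohl Lovejoy" modified_time="2012-11-26T09:48:04-08:00" local_contact="1"/>
 <contact id="c2c65f903b3150cb" name="Joseph Matarazzo" modified_time="2012-11-30T15:02:10-08:00" local_contact="1"/>
</contacts>"""

# Contact ids per image, as they appear in .picasa.ini.
SAMPLE_FACE_IDS = {
    "lovejoy-esther_portrait_nd.jpg": ["c0ef2256901bfbb6"],
    "matarrazo-joseph_2001.jpg": ["c2c65f903b3150cb"],
}

if __name__ == "__main__":
    contacts = load_contacts(SAMPLE_CONTACTS)
    print(names_by_file(SAMPLE_FACE_IDS, contacts))
```

Unknown IDs are skipped silently here. For real collections it is probably worth logging them, since they point at faces nobody has named yet.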
Adding metadata en masse
• Exiftool (available for all platforms) is incredibly handy
exiftool -XMP-dc:Subject+='My new heading' myimage.tif
exiftool -XMP-iptcExt:PersonInImage+='Doe, John' myimage.tif
• Notice the Dublin Core subject tag
• DC doesn’t define people explicitly as subjects, so we used IPTC extensions here
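To run the same command over a whole collection, a short script can build one Exiftool call per image from a mapping of file names to people. The Python sketch below is one way to do it, not the method used in the talk. The function names are hypothetical, and it defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess


def build_person_command(image_path, names):
    """Build an exiftool argument list that adds each name to PersonInImage."""
    args = ["exiftool"]
    for name in names:
        # Same tag and += operator as the command line example above.
        args.append(f"-XMP-iptcExt:PersonInImage+={name}")
    args.append(image_path)
    return args


def tag_people(names_by_file, dry_run=True):
    """Print (dry run) or execute one exiftool command per image."""
    commands = []
    for image_path, names in names_by_file.items():
        if not names:
            continue  # nothing to add for this image
        cmd = build_person_command(image_path, names)
        commands.append(cmd)
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)  # requires exiftool on the PATH
    return commands


if __name__ == "__main__":
    tag_people({"myimage.tif": ["Doe, John"]})
```

Passing the arguments as a list avoids shell quoting problems with names that contain commas, spaces, or apostrophes.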
Exiftool is useful for reading metadata
• Exif stores excellent technical metadata so it’s nuts to hand key this into other systems
• Usage is brain dead:
exiftool filename (Labeled display)
exiftool -X filename (XML)
exiftool -T filename (Tab delimited)
• Many powerful options
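The tab delimited output is the easiest one to feed into another system. A minimal Python sketch, assuming you ask Exiftool for a fixed list of tags with -T so that each output line holds those values in order, one line per file (the function names and the choice of tags are illustrative):

```python
import subprocess

# Tags to request; exiftool prints them in this order with -T.
TAGS = ["FileName", "ImageWidth", "ImageHeight"]


def parse_tab_output(text, tags):
    """Turn exiftool -T output into one dict per file."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        rows.append(dict(zip(tags, line.split("\t"))))
    return rows


def read_technical_metadata(paths, tags=TAGS):
    """Run exiftool -T for the given files (requires exiftool on the PATH)."""
    cmd = ["exiftool", "-T"] + [f"-{tag}" for tag in tags] + list(paths)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_tab_output(result.stdout, tags)


# Hypothetical output line using the dimensions shown in the XMP dump earlier.
SAMPLE_OUTPUT = "lovejoy-moskovetz_1923.tif\t6046\t4880\n"

if __name__ == "__main__":
    print(parse_tab_output(SAMPLE_OUTPUT, TAGS))
```

All values come back as strings, so convert widths and heights to integers before doing arithmetic on them.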
You need 3 image metadata standards
• Exif for technical metadata
• IPTC for many descriptive fields
• XMP for specialized information needed by archivists and librarians
A glimpse into the future
• Social metadata
• Union catalogs contain better metadata than local catalogs
• Create richer and more accurate metadata much faster and cheaper than is otherwise possible
Before going nuts on your photos…
Picasa can mess up existing metadata if you let it write tags (facial recognition doesn’t use tags)
You can create new tags, but don’t expect other software to read them
Facial recognition is a handy tool, but don’t use it as a crutch
Always test before performing batch metadata modifications or you may wind up blasting out existing metadata
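One way to test is to run the batch job on copies, dump the metadata before and after, and compare the two. The Python sketch below shows only the comparison step and assumes each dump is already a dictionary of tag names to values, however you produced it. The function name and sample values are made up.

```python
def lost_or_changed(before, after):
    """Report tags that disappeared or changed between two metadata dumps.

    Returns a dict of tag -> (old value, new value); the new value is None
    when the tag is missing afterwards. Tags that were only added are ignored.
    """
    problems = {}
    for tag, old_value in before.items():
        if tag not in after:
            problems[tag] = (old_value, None)
        elif after[tag] != old_value:
            problems[tag] = (old_value, after[tag])
    return problems


# Hypothetical dumps for one test image before and after a batch run.
BEFORE = {"Subject": ["Lovejoy", "Moskovetz"], "ImageWidth": 6046, "Software": "Adobe Photoshop CS2 Windows"}
AFTER = {"Subject": ["Lovejoy"], "ImageWidth": 6046, "PersonInImage": ["Esther Pohl Lovejoy"]}

if __name__ == "__main__":
    for tag, (old, new) in lost_or_changed(BEFORE, AFTER).items():
        print(f"{tag}: {old!r} -> {new!r}")
```

An empty result means nothing that existed before was lost or altered, which is the condition to check before touching the originals.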
Takeaways from this presentation
1. Facial recognition is easy with Picasa
2. Exiftool is incredibly useful for reading and writing image metadata
3. Learning to use embedded metadata is easy and makes too much sense not to do
Thank You!
Kyle Banerjee