managing image data for aquatic sciences - the best practices presentation
TRANSCRIPT
Claude Nozères
Science Branch, Québec Region
Fisheries and Oceans Canada
Maurice Lamontagne Institute
Overview
1. introduction: the guide (Tech. Rep. 2962)
2. image data: what is it about?
3. captures: preparations
4. metadata: why all the bother?
5. workflows: recipes for work
6. exports: archives & publishing
7. trends: comments on new tech.
8. questions: findings on the tour so far
afternoon: software demos & discussions
1. Introduction: background
Personal experiences – taking digital photos of aquatic life since 2001
needed to document prey samples for marine mammals, and film wasn't doing a good job
became aware of mixed information among users
○ frustrations were common when using either consumer or industrial tools
○ by sharing experiences, our work may become easier, and better-quality image data is produced
Introduction: guide & tour
DFO's National Image Data Management (NIDM) Working Group
fall 2010: began a 'best practices guide' to assist employees with their imaging work
mid-Dec. 2011: published the first full version: Nozères. Tech. Rep. 2962, now online (WAVES)
Jan-Feb. 2012: tour of regions to introduce guide
○ the hope is that each site will then do a follow-up, with advanced workshops, for their needs
Objectives: this talk
1. to introduce a sample of common, but perhaps misunderstood, concepts in image data
2. to learn about your experiences with gear and software so we can share this with others in DFO
note: will also try to include the latest information, not in the guide
Headline: happy marine biologist
Keywords: scene, joke, smurf
Location: Belle-Isle
Category: personal
2. Image data – basic types
still image data (photo)
huge availability in consumer devices
well-established for industry & science but finicky?
moving image data (video)*
often consumer-oriented (family videos)
industrial applications: pricey and finicky?
information (metadata)
data about the image data
example: 2008-08-07-12:02:09...
*note: video is not discussed in this brief introduction – see guide for information
Why talk of files as 'data'?
just another pretty picture, or an aquatic species observation?
Keywords: harbour seal, rock
Location: Sainte-Luce
Date: Sept. 8, 2009
(figure labels: information; clearly visible subjects)
Image data: perceptions
'If really so useful, we should all be doing it!'
may end up generating stacks of fuzzy, dateless, unknown files = frustration
'I have enough science data to deal with'
images as data may not be taken seriously
'I don't have time for more requirements!'
learning about image data may be viewed as a time-waster instead of a work-saver
3. Capturing image data
camera settings
format (file type)
quality (lossy compression)
size (dimensions....)
special topic: geotagging (GPS data)
Camera settings: file formats
JPG (8-bit) is the default, or the only option, for many cameras
good, but 'baked' (limited for image editing)
RAW (10, 12, 14-bit) for advanced cameras
requires post-processing with RAW software
sometimes capture both: 'RAW+JPG'
○ view JPG right away, store RAW for later edits
TIF is occasionally available (8 or 16-bit)
microscope & tethered cameras, scanners
good choice for image analysis (16-bit)
'baked' like JPG: harder to correct for whiteness
Camera settings: 'quality' (for JPG)
Lossy compression: how much detail is to be discarded in JPG?
Select quality: 'basic, good, fine, v.fine' = low to high quality
Lossless compression: no data loss, no need to set 'quality' (RAW, TIF)
Settings: channels & bits
Channels
Grayscale has 1 channel (black)
RGB (for screen) has 3: Red, Green, Blue
CMYK (for print) has 4: Cyan, Magenta, Yellow, blacK
Bits: levels, or gradation 'steps' (in each channel)
1-bit = 2^1 = 2 values, on/off, black or white (like a fax)
8-bit = 2^8 = 256 values for tones (gray or colour images)
10, 12, 14, 16-bit = 1,024 to 65,536 tone levels
note: most monitors only display in 8-bit; even if you can't see it, the data is there for analysis
Figure: Channels x Levels (tones)
8-bit: 256 steps (black = 0, white = 255)
16-bit: 65,536 steps (black = 0, white = 65,535)
Why more bits matter
high-bit RAW & TIFF files have more tones
important in image analysis for feature (subject) discrimination, like plankton in a water sample
○ 16-bit grayscale may be preferred over 8-bit colour
the extra information enables powerful software editing (recover detail in light and dark areas)
○ JPG 8-bit can also be edited, but less dramatically
○ TIF at 8-bit has same limits (16-bit allows more)
○ RAW is >8-bit (e.g.,10-14)
Note: colour scanners may refer to 24-bit or 48-bit (3x8 or 3x16)
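The bit-depth arithmetic on these slides can be checked with a few lines of Python. This is only an illustrative sketch; the function names are made up for this example and do not come from the guide.

```python
def tone_levels(bits_per_channel: int) -> int:
    """Number of distinct tone levels one channel can hold."""
    return 2 ** bits_per_channel


def max_tone_value(bits_per_channel: int) -> int:
    """Largest stored value (white), since counting starts at 0 (black)."""
    return tone_levels(bits_per_channel) - 1


def bits_per_pixel(bits_per_channel: int, channels: int) -> int:
    """Total bits per pixel, e.g. the '24-bit' quoted for colour scanners."""
    return bits_per_channel * channels


for bits in (1, 8, 10, 12, 14, 16):
    print(f"{bits:>2}-bit: {tone_levels(bits):>6} levels, white = {max_tone_value(bits)}")

print("RGB at 8 bits per channel:", bits_per_pixel(8, 3), "bits per pixel")
print("RGB at 16 bits per channel:", bits_per_pixel(16, 3), "bits per pixel")
```

Because values start at 0, the brightest value is always one less than the number of levels.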
Settings: white balance
auto-white balance may be accurate, but results are sometimes better when set to conditions:
sunny, cloudy, shade, incandescent, fluorescent
JPG & TIF are 'processed' files with their 'whiteness' (white balance) set at capture
like a 'Polaroid' instant photo: limited edits
RAW has metadata suggesting the setting, but it is not fixed: can redo after capture
similar to a film negative: 'reprocess it'
Camera settings: white balance
RAW file: default capture under fluorescent lights
corrected file for white: the background should be white – clicked on it with a correction tool and white balance was adjusted
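A much simplified sketch of what a 'click on white' correction does: sample a pixel that should be neutral, then scale each channel so that the sample becomes neutral. Real RAW software works on unprocessed sensor data and is more sophisticated; the sample values and function names below are hypothetical.

```python
def white_balance_gains(reference_rgb):
    """Per-channel gains that make the sampled reference pixel neutral."""
    if min(reference_rgb) <= 0:
        raise ValueError("reference pixel must be non-zero in every channel")
    target = sum(reference_rgb) / 3.0
    return tuple(target / channel for channel in reference_rgb)


def apply_gains(pixel, gains, max_value=255):
    """Scale one pixel by the gains, clipping at the 8-bit maximum."""
    return tuple(min(max_value, round(v * g)) for v, g in zip(pixel, gains))


# Hypothetical sample taken from a white background that has a colour cast
sampled_white = (200, 220, 180)
gains = white_balance_gains(sampled_white)
corrected = apply_gains(sampled_white, gains)
print("gains:", gains)
print("corrected sample:", corrected)
```

The same gains would then be applied to every pixel in the image, which is why a white card or background in view makes the correction easy.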
Camera settings – size
...file size, image size (resolution), image re-sizing (pixel numbers),
pixel density, sensor size, sensor photosites, photostitching...
Camera settings – size
Image resolution
web or small: good for onscreen viewing
large (2 to 5 MP): good for regular prints
full-size (usually about 8 to 16 MP): archives
note: RAW is usually a full-size capture
Why choose less than 'full-size'?
digital zoom (like cropping) sometimes handy
situations when a large image is a burden
○ documenting labels, geotagging, emailing
○ caution: set back to full-size afterwards
Figure: relative image sizes: 5 MP (2600x1900), 2 MP (1600x1200), web
Size: pixels vs. files
Settings for size (or resolution) are about image dimensions (how many megapixels, MP), not the computer file size in kilobytes or megabytes (KB, MB)
1600 (across) x 1200 (high) pixels = 2 MP
but file size will vary by format & compression
JPG with high compression = small file (68 KB)
TIFF with no compression = large file (5800 KB)
○ TIF with lossless compression of this image = 70 KB
(test image: blank, 2 MP)
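The difference between pixel count and file size can be worked out directly. The sketch below assumes an uncompressed 8-bit RGB image with no header or metadata overhead, which is why it only approximates the TIFF figure quoted above; the function names are illustrative.

```python
def megapixels(width_px: int, height_px: int) -> float:
    """Image dimensions expressed in megapixels."""
    return width_px * height_px / 1_000_000


def uncompressed_bytes(width_px: int, height_px: int,
                       channels: int = 3, bits_per_channel: int = 8) -> int:
    """Raw pixel data size in bytes, ignoring file headers and metadata."""
    return width_px * height_px * channels * bits_per_channel // 8


width, height = 1600, 1200
print(f"{width}x{height} = {megapixels(width, height):.2f} MP")
print(f"8-bit RGB, uncompressed: {uncompressed_bytes(width, height) / 1000:.0f} KB")
print(f"16-bit RGB, uncompressed: {uncompressed_bytes(width, height, 3, 16) / 1000:.0f} KB")
```

Compressed formats (lossy JPG, or lossless TIF on a blank image) can be far smaller than this because the stored size depends on content, not only on dimensions.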
Size: dimensions vs. density
on the computer: resizing is increasing or decreasing the number of pixels (dimensions)
but sometimes we say we 'resize' for print
really just setting pixel density (dots per inch: dpi)
image size (number of pixels) has not changed
Figure: original (1600x1200), upsized (more pixels: 3600x2400), smaller (fewer pixels: 800x600)
Figure: screen viewing uses larger dots (72 dpi); print viewing uses smaller dots (300 dpi)
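To see why changing the dpi setting does not change the image itself, note that the printed size is just the pixel count divided by the density. A minimal sketch, with an illustrative function name:

```python
def print_size_inches(width_px: int, height_px: int, dpi: float):
    """Physical size at a given pixel density; the pixel count is unchanged."""
    return width_px / dpi, height_px / dpi


for dpi in (72, 150, 300):
    w_in, h_in = print_size_inches(1600, 1200, dpi)
    print(f"1600x1200 at {dpi} dpi -> {w_in:.1f} x {h_in:.1f} inches")
```

The same 1600x1200 file covers a large area at 72 dpi and a small, sharper area at 300 dpi.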
Of sensor sizes & megapixels
Sensor size: physical dimensions (mm)
SLR cameras have large sensors
compact cameras have tiny image sensors
Photosites: density of sites on the sensor
two cameras may have the same resolution, but the 12 megapixels of the SLR are spread over a much wider area (the larger sensor) than the 12 megapixels on a small-sensor compact
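A rough calculation shows how different the photosites are when the same 12 MP are spread over different sensors. The sensor dimensions below are nominal assumptions chosen for illustration (36 x 24 mm for a full-frame sensor, about 6.17 x 4.55 mm for a small compact sensor), not figures from the guide, and the calculation ignores gaps between photosites.

```python
from math import sqrt


def photosite_pitch_um(sensor_w_mm: float, sensor_h_mm: float, megapixels: float) -> float:
    """Approximate centre-to-centre photosite spacing in micrometres."""
    area_um2 = (sensor_w_mm * 1000) * (sensor_h_mm * 1000)
    return sqrt(area_um2 / (megapixels * 1_000_000))


# Assumed nominal sensor dimensions (illustrative only)
large_sensor = photosite_pitch_um(36.0, 24.0, 12)
small_sensor = photosite_pitch_um(6.17, 4.55, 12)

print(f"large sensor, 12 MP: about {large_sensor:.2f} um per photosite")
print(f"small sensor, 12 MP: about {small_sensor:.2f} um per photosite")
```

Larger photosites collect more light each, which is one reason large sensors give cleaner files at the same megapixel count.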
Sensor sizes
Figure: sensor sizes compared, from medium format down to compacts
Medium format & full-frame 35mm: niche markets ($$), slow development
smaller sensors are versatile: extremely competitive, intense development, most common
new models shown: Canon G1X, Nikon 1
megapixel ranges shown: 20-80 MP, 12-24 MP, 12-24 MP, 5-16 MP
Sensor sizes
35 mm & Medium format (& larger) are useful in aerial surveys (e.g., marine mammals)
extreme level of fine, clean detail & tones
great for distances; macro work is trickier, bulky
most biology work is done with compacts or smaller (APS) SLRs: simpler, easier to use
compacts for macro work: many can do 0-10 cm
getting 'pretty good' results: use software processing to beat physical limits, reduce noise
○ not 'fakery' but sometimes undesired (see example later)
Capture tips: boost image size ('photostitching')
Capture tips
Consult examples: online image galleries
contrasting background (white piece of plastic); gray and black also good
ruler or object in view for scale
optional: have a colour card (or something white) in view, to correct for white balance
Capture: Geolocation
some cameras have internal GPS to embed coordinates & correct time zone date
mostly in still cameras, but also some video (rare)
○ note: smartphones geotag both photos & videos
other cameras can have their images tagged with external data using, for example:
1) geotagged image at same location (e.g. smartphone)
2) GPS track and timestamp of image
○ note: image file must have correct clock time
○ tip: take a photo of the time on a GPS screen, then examine that photo's capture time info. to determine the correction/adjustment for the camera clock
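The clock tip above amounts to a subtraction. The sketch below assumes that the time read from the GPS screen and the camera's capture time are expressed in the same time zone; all times shown are made-up examples.

```python
from datetime import datetime, timedelta


def clock_offset(gps_time: datetime, camera_time: datetime) -> timedelta:
    """Correction to add to camera times so they agree with the GPS clock."""
    return gps_time - camera_time


def correct_capture_time(camera_time: datetime, offset: timedelta) -> datetime:
    """Apply the correction to any other photo from the same camera."""
    return camera_time + offset


# Hypothetical values: time visible on the GPS screen in the photo,
# and the capture time the camera recorded for that same photo
gps_screen_time = datetime(2012, 1, 20, 14, 3, 20)
photo_capture_time = datetime(2012, 1, 20, 14, 5, 5)

offset = clock_offset(gps_screen_time, photo_capture_time)
later_photo = datetime(2012, 1, 20, 15, 0, 0)
corrected = correct_capture_time(later_photo, offset)

print("camera clock correction (s):", offset.total_seconds())
print("corrected capture time:", corrected)
```

Here the camera runs 1 min 45 s fast, so every capture time from that session is shifted back by the same amount before matching against GPS data.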
Geolocation – image tagging
Figure: camera with telephoto lens (but no GPS); smartphone photo (tagged with GPS); smartphone map (shows AIS)
Geolocation – image tagging
load into the geotagging software the tagged photo together with the untagged photos taken from the same location
Figure: smartphone photo (tagged with GPS); SLR zoom photo (geotagged w/phone image)
Keywords: ship, transport
Location: Sainte-Flavie
Category: personal
Geolocation – GPS track sync
record a GPS track log on an external device while taking camera images
later, download images and the GPS track into geotagging software
the capture time of the photo will be used to determine its position at that time on the GPS track ('sync')
embeds the coordinates into image file
NOTE: this is an example of image data information (metadata), and not about image quality
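The 'sync' step can be pictured as a lookup along the track by time. Geotagging software differs in the details; the sketch below simply interpolates linearly between the two track points on either side of the capture time, and the track coordinates are invented for the example.

```python
from bisect import bisect_left
from datetime import datetime


def position_at(track, when):
    """Interpolate (lat, lon) at time `when` from a time-sorted GPS track.

    `track` is a list of (datetime, latitude, longitude) tuples.
    """
    if not track:
        raise ValueError("empty track")
    times = [point[0] for point in track]
    if when < times[0] or when > times[-1]:
        raise ValueError("capture time is outside the recorded track")
    i = bisect_left(times, when)
    if times[i] == when:
        return track[i][1], track[i][2]
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    fraction = (when - t0) / (t1 - t0)
    return lat0 + fraction * (lat1 - lat0), lon0 + fraction * (lon1 - lon0)


# Hypothetical two-point track and a photo taken halfway between the points
track = [
    (datetime(2012, 1, 20, 10, 0, 0), 48.60, -68.20),
    (datetime(2012, 1, 20, 10, 1, 0), 48.62, -68.10),
]
lat, lon = position_at(track, datetime(2012, 1, 20, 10, 0, 30))
print(f"photo position: {lat:.4f}, {lon:.4f}")
```

This only works if the camera clock has been corrected first, since an error of a few minutes places the photo at the wrong point on the track.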
4. Image (file) metadata tags
why the fuss over metadata?
we may do 'tagging' in order to be able to locate, use, and credit the image files using the tags
where is the image metadata?
camera files have well-known, standard places to store this special text information
other image data, or non-standard information, may be entered in catalog files in a database system
do I need to manually add all these tags?
some are automatically included by the camera, such as date, time, camera model (and GPS, if available)
Metadata tags: suggestions
Common fields for tagging images:
Filename: unique name (e.g., date-####.JPG)
Title: name for photo (but often used for an ID #)
Headline: short phrase about content
Description: more info. about content
Keywords: species name, subject
Location: place or station name
Creator: photographer's name
Tag example
Filename: 20111014_IMG_1387.JPG
Title (catalog no.): 9682
Headline (quick description): Arctic isopods
Description/Caption (text on paper label): Hand-collected Mesidotea sabini from Causeway at low tide, held in an aquarium for one day
Keywords: Saduria sabini
Location: Frobisher Bay site 9
Creator: Claude Nozères
note: useful, but not often done
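One way to keep tagging consistent is to treat the suggested fields as a small record and check it before export. The sketch below uses only the field names suggested above; the class and helper names are hypothetical, and it does not read or write real image files (a metadata tool would be needed for that).

```python
from dataclasses import dataclass, field, asdict
from datetime import date


@dataclass
class ImageTags:
    filename: str
    title: str = ""
    headline: str = ""
    description: str = ""
    keywords: list = field(default_factory=list)
    location: str = ""
    creator: str = ""


def dated_filename(capture_date: date, camera_name: str) -> str:
    """Prefix the camera's own file name with the capture date (YYYYMMDD)."""
    return f"{capture_date:%Y%m%d}_{camera_name}"


def missing_fields(tags: ImageTags) -> list:
    """Names of fields that are still empty."""
    return [name for name, value in asdict(tags).items() if not value]


example = ImageTags(
    filename=dated_filename(date(2011, 10, 14), "IMG_1387.JPG"),
    title="9682",
    headline="Arctic isopods",
    description="Hand-collected Mesidotea sabini from Causeway at low tide, "
                "held in an aquarium for one day",
    keywords=["Saduria sabini"],
    location="Frobisher Bay site 9",
    creator="Claude Nozères",
)
untagged = ImageTags(filename="IMG_0001.JPG")

print(example.filename, "missing:", missing_fields(example))
print(untagged.filename, "missing:", missing_fields(untagged))
```

A check like this makes it easy to spot files that went out with only the camera's automatic tags.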
Make sure your metadata makes sense to users
Good: added metadata tags can be as you like
Bad: added metadata tags can be as you like
Try to follow examples of others, e.g. IPTC, MWG, the DAM book (some rules exist, but most are open-ended)
Example:
Creator: unknown
Posted on blogs since 2010
Was able to find it using the
visible text in a Google search
How would you tag this image?
Title? Caption? Keyword?
Title: Rappahannock River,....
Description: (a literary quotation ?)
Source: Mike Ashenfelder, 2011
Metadata: retaining & reading
Older or simpler software may be unaware of camera metadata and strip it away (capture date, etc.)
Not all image browsing software plays fair
Apple, Microsoft, and Google are all competing to make easy-to-use, popular tools
sometimes do hidden & proprietary processing 'for your benefit' (automatically), which may be to the detriment of 'industry-standard' metadata tags
recent examples: face recognition (all), geotagging (Windows Live), stripping of current tags (IPTC) with retired fields (Apple Aperture, iPhoto)
basic fields are easily read by most
advanced fields may be handy in projects
custom fields are available, but make sure
your users are aware of their existence
Metadata: summary for use
key lessons:
1) adopt a style and be consistent
2) let your users know what to expect
3) be vigilant for software behavior
5. Image data workflows
can we do editing and tagging without worrying about how it works?
people want 'recipes', or workflows
see guide no. 2962 for some examples
image data protocol examples
○ case studies for different work scenarios in
aquatic sciences
image data software examples
○ practical examples using software tools
Guide workflows: for discussion
the guide is not a fixed set of rules
rather it is a list of suggestions from recent work
which ways of working may be easier (& better) than others?
source: XKCD
Image Data work: it doesn't have to hurt when using the right tools (adapted from The Oatmeal)
Quote overheard yesterday*
“How do you love Photoshop? Like someone
loves their wife,...or their cousin...or?”
“I love Photoshop like people love their kids –
no way to get rid of it, so I have to love it”
*Macworld Podcast – Less than Perfect: App Design
Image data work: a tale of 2 tools
Adobe Photoshop (PS)....20+ years
classic tool for editing and....everything!
○ most folks only use it for a few tasks
Adobe Lightroom (LR)....5+ years
revolutionary workflow tool, now matured
○ '95%' of my photo work is now done inside LR
○ extra functions available with shareware plugins
Newsflash! Jan. 2012 – LR Public Beta 4: video editing, geotagging, photobooks
Image data work: managing tools
Browsers: 'find' your images on a workstation
Windows Explorer (default – very limited)
Google Picasa (easy, basic, free)
Photoshop Bridge (full browser & metadata editor)
Photoshop Elements Organizer (new: object searching)
Cataloger & image editor
Adobe Lightroom (workstation, not network use)
Catalogers
Phase One Media Pro (workstation; free catalog reader)
Daminion, Canto Cumulus (network/server)
demonstrations this afternoon (bring your laptop)
6. Exporting: final work stages
After capturing, tagging, editing images,
we want to:
store the originals & edits (archiving)
distribute copies (publishing)
Exporting: archives
Ideally, this is about final edits in best
quality with metadata tags that are stored
securely in multiple locations and media
3-2-1 approach is recommended (Krogh)
have 3 copies (original & 2 backups in rotation)
store on 2 kinds of media (hard drive, DVD)
keep 1 off-site (not all stored in the same place)
This is an area that NIDM is working on: how to consolidate and preserve.
Large projects are likely good, but smaller ones may need advice
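The 3-2-1 approach is easy to express as a checklist. The sketch below is one possible way to audit a list of copies; the `ArchiveCopy` record and the example copies are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ArchiveCopy:
    label: str
    media: str      # e.g. "hard drive", "DVD"
    offsite: bool   # stored away from the main workplace?


def check_3_2_1(copies):
    """Report which parts of the 3-2-1 approach a set of copies satisfies."""
    return {
        "3 copies": len(copies) >= 3,
        "2 kinds of media": len({c.media for c in copies}) >= 2,
        "1 off-site": any(c.offsite for c in copies),
    }


# Hypothetical archive: an original and two backups
copies = [
    ArchiveCopy("original", "hard drive", offsite=False),
    ArchiveCopy("backup A", "hard drive", offsite=False),
    ArchiveCopy("backup B", "DVD", offsite=True),
]
for rule, ok in check_3_2_1(copies).items():
    print(f"{rule}: {'ok' if ok else 'MISSING'}")
```

A smaller project that keeps everything on one workstation drive would fail all three checks, which is the situation where advice is most needed.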
Exporting: galleries & print
may send re-sized versions:
800 pixel, 72 dpi JPG is fine for web galleries, and especially for email
more pixels, at 150-300 dpi, for print (the density is important for clear prints)
for public viewing on web, review the file metadata & edit if desired
location, names, comments may be seen
edit in DAM (Bridge, MediaPro, LR)
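The export sizes above come down to two small calculations: scaling an image to a target pixel size while keeping its proportions, and working out how many pixels a print of a given size needs. The sketch assumes that '800 pixel' refers to the longest edge, which the slide does not state; function names are illustrative.

```python
def fit_long_edge(width_px: int, height_px: int, long_edge: int = 800):
    """New dimensions with the longest edge set to `long_edge`, same proportions."""
    scale = long_edge / max(width_px, height_px)
    return round(width_px * scale), round(height_px * scale)


def pixels_for_print(width_in: float, height_in: float, dpi: int = 300):
    """Pixel dimensions needed to print at a given size and density."""
    return round(width_in * dpi), round(height_in * dpi)


print("web copy of 1600x1200:", fit_long_edge(1600, 1200))
print("web copy of 2600x1900:", fit_long_edge(2600, 1900))
print("6 x 4 inch print at 300 dpi needs:", pixels_for_print(6, 4, 300))
print("6 x 4 inch print at 150 dpi needs:", pixels_for_print(6, 4, 150))
```

Exports are made from the archived full-size file each time, so the original is never reduced.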
Publishing – web
CaRMS (Canadian Register of Marine Species)
- online taxonomic resource with editors
- also has a user-added image gallery
- see Kennedy et al. Tech. Report
- note: camera metadata is visible
Figure: camera metadata added on website
Publishing – web
DFO has several image gallery projects
Coast Guard, SLGO, CaRMS, CMB, others?
Groups may join a large, existing gallery
Flickr is very popular and supports some metadata
used by EOL, BHL, GBIF
7. Trends: new camera types
before, chose either a digicam or an SLR
small device & average images, or big rig & great images
was a demand for quality and compact at same time
'mirrorless interchangeable lens': MILC
Panasonic, Olympus, Sony, Nikon, Pentax
2011: new disruptive trends in compacts
'retro-style': Olympus Pen, Fujifilm X100, X10...
'ultra-modern' camera phones: iPhone 4S
2012: light field (Lytro) – 'refocus anytime'
Figure: large-sensor MILC; large-sensor fixed compact; tiny-sensor fixed compact; light field (Lytro)
New tech: changing the game
editing software
new camera types
high-sensitivity sensors (lowlight)
solid-state memory ('flash')
cheap hard drives
network storage ('cloud', e.g., Dropbox)
tablets & tactile displays (iPad, Cintiq)
not just fashion: new types may lead to better image data and much improved workflow (easier & faster)
New tech: science benefits
lowlight sensors: reduce need to carry lights
fewer noisy, blurry (slow shutter) shots
compacts: easier to carry & use
capture events more often in the field
SSD: insensitive to ship vibration, magnets
use on underwater towsleds, aerial surveys
large drives: save all, do backups
don't bother to delete or waste $$ time reviewing
cloud services: share files with colleagues
don't burden email with huge attachments
tablets: field guides, rapid data entry & review
Newer is not always better
Pentax had a long line of WP cameras, but recent models are not good indoors
Sony & Panasonic are new entrants, but are giving much better files
Late 2011 DPReview test: indoors w/flash photo
Figure: test crops labelled clean detail, clean detail, mushy when indoors
new Pentax Optio when used indoors: mushy photo, hard to identify
Canon Powershot: clean detail, easier to identify small organisms
Teleost Aug. 2011
Resources – websites
The Luminous Landscape – practical opinions
The DAM Book forum – "real DAM answers"
dpBestflow.org – best practices & workflows
JISC Digital Media – advice & examples
Digital Photography Review (dpreview.com)
WHOI HabCam – underwater photo
SERPENT project – underwater video
CaRMS Photogallery – species images
Resources – books
The DAM Book, 2nd edition, Krogh
Photoshop CS5 and Lightroom 3: A Photographer's Handbook, Laskevitch
Adobe Photoshop Lightroom 3: the missing FAQ, Brampton
Photographic Multishot Techniques, Steinhoff & Steinhoff
The VueScan Bible, Steinhoff
On Digital Photography, Johnson
Resources – documents
(PDF) GBIF Community Site: Best Practices Manuals
Federal Agencies Digitization Guidelines Initiative (FADGI), Still Image Working Group
Metadata Working Group (MWG)
IPTC Image Metadata Handbook
Establishing best practices for marine biological data, Seeley et al. 2008, COWRIE
CaRMS photogallery user guide, Kennedy et al. 2011. DFO Tech. Rep. 2933
Resources – software utilities
Ingestamatic, Photographer's Toolbox, JFriedl's Lightroom Goodies, Photo Mechanic, DVMP, CatDV, RoboGEO, Cineform, Clipwrap, Helicon Focus, CDFinder, CDWinder, NIS Elements
freeware: Picasa, ImageJ, VLC, VARS, ExifTool, IrfanView, Zooscan, Shotwell, Handbrake, Contour Storyteller, MPEG Streamclip
Obj. 2: learning – Sault-Ste-Marie
Otolith microscopy w/Image Pro (5 MP)
good file naming, 3-2-1 storage; might try tagging
Scanning historical slides of activities (size?..)
all notes are entered in filename – need to rethink this
Underwater video for lamprey control (volume?)
proprietary DVR: take video feed over RCA & capture
Underwater dam inspection using a 2 m pole
want live view & record; suggest using 2 different cameras
Photo folder on local server (8 GB)
do temp. catalog to browse, then do perm. catalog
Obj. 2: learning – Nanaimo
Otoliths: want to overlay 2 images, & dots
need 3rd party tools (Photoshop, ImageJ)
Import prior analyses (keywords into LR)
LR plugin (Syncomatic: based on filenames)
Reading catalog without full software
LR: not usually. ExMedia/Media Pro: Yes
Can we use alternative ingestion (import) tools?
Yes, Photo Mechanic, Ingestamatic may be useful for high volume, batch file entry (e.g., marine mammal surveys)
Easy way to get started using tools like LR?
Various resources – our guide is an example, but we still need a forum or other place to post experiences and tips
Obj. 2: learning – Burlington
Q's
does DFO have a site licence for this software?
how to distribute a catalog on the network?
can I use custom annotation fields in a catalog?
what kind of scanner to archive histology slides?
Flowcam produces a composite of plankton shots in sample: how to manage?
Nikon imaging microscope produces custom files – how to manage?
How to transfer hierarchical folder names into annotation fields? ...and more!
Obj. 2: learning – St. Andrews
geomatics & video lab: of screens & mice
had a quality monitor, but not great for viewing charts or when using a mouse and keyboard to trace habitats at same time as viewing images
solution: use the right display for different work:
1) HDTV for video (1920x1080 pixels)
2) 27in NEC for photos (2560x1600 pixels)
3) 24in tactile display (Wacom Cintiq) for tracing habitat classifications (more efficient)
Obj. 2: learning – St. Andrews
reusing legacy & custom equipment
big HDV camcorder, with $30K UW housing
○ don't want to buy a new camera & $$$ housing
solution: HDMI video out to flash memory cards
result: instant digital video (no tape playback to import), and higher quality (original video capture, not compressed to fit HDV tape)
Obj. 2: learning – St. John's
need a place to obtain and learn more
want workshops, website forums....CMB?...
exchange files with remote fishery observers
receive and send feedback on species ID images
cloud computing seen as a solution (Dropbox)
have to enable software updates
older software versions (>3 yrs.) are not aware of
current metadata and image file standards
want access to image files for regional guides
other regions may do ID books, want to do it here