educational program: intro to hcs/hca image and data analysis · 2019-06-25 · educational...

Post on 28-Jul-2020


High Content 2016, September 12th-14th, 3rd Annual Conference

Joseph B. Martin Conference Center at Harvard Medical School, Boston, MA

High Content 2017, September 13th-15th, 4th Annual Conference

San Diego Conference Center, San Diego, CA

Educational Program:

Intro to HCS/HCA

Image and Data Analysis

Mark-Anthony Bray, Ph.D.

Novartis Institutes of BioMedical Research, Cambridge, Massachusetts, USA

mark.bray@novartis.com

High Content 2018, September 18th-20th, 5th Annual Conference

Joseph B. Martin Conference Center, Boston, MA

The Basic Skill Sets for an HCS Laboratory

[Figure: axis labels listing per-well mean feature measurements (intensity, texture, shape, fiber, and neighbor-distance features) for the CMFDA, Actin, Tubulin, and DAPI channels, plus ValidObjectCount]

* An Introduction To High Content Screening: Imaging Technology, Assay Development and Data Analysis in Biology and Drug Discovery (2015), Haney, S.A., Bowman, D., Chakravarty, A., Davies, A. and Shamu, C.E. John Wiley Press, NY, NY (in production)

The HCS Laboratory


[Diagram: HCS laboratory components]

• Plate Handler Robot

• HCS Imager

• Instrument Control Workstation

• Plate Visualization / Image Analysis Workstations

• Image Analysis Computer Cluster

• Data Management System

• Network File Server

• Network

The Wet Lab: Reagents, protocols, assay optimization

Hardware and Image Acquisition

Assay Types and Assay Development

Image and Data Analysis

* An Introduction To High Content Screening And Analysis Techniques: Practical Advice and Examples, Haney, S.A., Bowman, D., Chakravarty, A., Davies, A. and Shamu, C.E. John Wiley Press, NY, NY (in production)

Outline

• The image as quantitative data

• Identifying the image foreground

• Splitting object clusters

• Identifying cellular compartments

• Measurement extraction

• Statistical analysis

Outline: The image as quantitative data

Images Contain A Wealth Of Information

http://www.microscopyu.com Image: Javier Irazoqui

Fundamental Steps

• Image acquisition

• Preprocessing

• Object detection, segmentation (including 3D and tracking over time)

• Making measurements, feature extraction: LENGTH, WIDTH, CURVATURE, TEXTURE…

• Object classification, interpretation, recognition

• Result

Image Analysis Software Solutions

• Application modules

– Good for someone new to HCS, or who just needs a turn-key solution

– Polished user interfaces, fast

– Often integrated with microscope hardware

– Validated, standard assays

– Canned approach: No detailed knowledge of image analysis needed

• Development environment

– Good for new assay development, more flexible approach

– Customizable assay design instead of pre-built solution

– Typically, combine modules into a workflow

– Higher “cost-of-entry”: Time involved to understand image analysis details, language, scripting…

Image Analysis Software Solutions

• Commercial

– PerkinElmer Acapella

– Definiens Tissue Studio

– Molecular Devices Metamorph

– GE InCell Analyzer

– Media Cybernetics ImagePro+

– Mathworks MATLAB

– Adobe Photoshop

– Etc

• Open-source

– ImageJ/FIJI

– CellProfiler

– BioImageXD

– Icy

– Vaa3D

– ITK/VTK

– KNIME

– Etc

Not comprehensive!

Outline: Identifying the image foreground

Object Identification

• Also known as segmentation: Partitioning an image into regions of interest

• Step 1: Distinguish the foreground from the background by picking a good threshold

• Foreground: Regions where I(x,y) > threshold T
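The foreground rule above can be sketched in Python with NumPy. The slides do not say how T should be picked; Otsu's method is used here only as one common automatic choice, and the names `otsu_threshold` and `foreground_mask` are illustrative rather than taken from any particular package.

```python
import numpy as np

def otsu_threshold(image, bins=256):
    """Pick T by maximizing the between-class variance of the histogram."""
    counts, edges = np.histogram(image, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    weight_bg = np.cumsum(counts)              # pixels at or below each bin
    weight_fg = weight_bg[-1] - weight_bg      # pixels above each bin
    cum_intensity = np.cumsum(counts * centers)
    valid = (weight_bg > 0) & (weight_fg > 0)  # both classes must be non-empty
    mean_bg = cum_intensity[valid] / weight_bg[valid]
    mean_fg = (cum_intensity[-1] - cum_intensity[valid]) / weight_fg[valid]
    between = weight_bg[valid] * weight_fg[valid] * (mean_bg - mean_fg) ** 2
    return float(centers[valid][np.argmax(between)])

def foreground_mask(image, threshold):
    """Foreground: pixels where I(x, y) > T."""
    return np.asarray(image) > threshold

# Toy image: dim background (10) with a bright 10x10 object (200).
image = np.full((30, 30), 10.0)
image[10:20, 10:20] = 200.0
T = otsu_threshold(image)
mask = foreground_mask(image, T)
```

On real images the histogram is rarely this cleanly bimodal, so the chosen T should be checked visually against the segmentation it produces.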

Illumination Correction

• Nonuniformities introduced in the optical path of the sample, microscope, and/or camera

– Example: Uneven illumination from left to right

– Can lead to inaccurate segmentation and measurements

– Cell at (a) is brighter than cell at (b) even if the cells have the same amount of fluorescent material

[Figure: unevenly illuminated image with cells marked (a) and (b)]

Carpenter et al, Genome Biology 2006, 7:R100

Illumination Correction

• Recommendations

– Create new illumination correction if switching microscopes

– Perform per-plate correction

– Perform per-channel correction, as absolute illumination intensities may differ between channels

Images from Carolina Wahlby

[Figure: Input image → Approximation of background → Output image]

• Approximation of background: Average many images, then fit a continuous function to the result or smooth heavily
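A minimal sketch of the "average many images, then smooth heavily" recipe, assuming NumPy. The box filter, the `smooth_size` parameter, the normalization of the illumination function to a mean of 1, and correction by division are all assumptions made for illustration; the slides do not specify them.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def estimate_illumination(images, smooth_size=5):
    """Average a stack of images, smooth with an odd-sized box filter,
    and scale the result so its mean is 1."""
    mean_image = np.mean(np.stack(images).astype(float), axis=0)
    pad = smooth_size // 2
    padded = np.pad(mean_image, pad, mode="edge")
    windows = sliding_window_view(padded, (smooth_size, smooth_size))
    smoothed = windows.mean(axis=(-2, -1))
    return smoothed / smoothed.mean()

def correct_illumination(image, illumination):
    """Divide out the estimated illumination function."""
    return np.asarray(image, dtype=float) / illumination

# Toy example: uniform sample seen through a left-to-right brightness ramp.
ramp = np.linspace(0.5, 1.5, 32)[None, :] * np.ones((32, 32))
plate_images = [100.0 * ramp for _ in range(10)]
illumination = estimate_illumination(plate_images, smooth_size=5)
corrected = correct_illumination(plate_images[0], illumination)
```

Following the recommendations above, one such function would be estimated per plate and per channel.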

Background Subtraction

• Top-hat (“rolling ball”) filtering
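As a rough illustration, a white top-hat filter subtracts a morphological opening of the image from the image itself, which removes background that varies more slowly than the structuring element. The sketch below assumes NumPy and a flat square structuring element of odd `size`; a true rolling-ball filter uses a ball-shaped element, so this is an approximation, and the function names are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _window_filter(image, size, reducer):
    """Apply a min or max filter over a size x size neighborhood."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return reducer(windows, axis=(-2, -1))

def white_tophat(image, size):
    """Image minus its opening (erosion followed by dilation)."""
    image = np.asarray(image, dtype=float)
    eroded = _window_filter(image, size, np.min)
    opened = _window_filter(eroded, size, np.max)
    return image - opened

# Toy example: flat background of 10 with one bright pixel of 50.
image = np.full((7, 7), 10.0)
image[3, 3] = 50.0
result = white_tophat(image, size=3)
```

The structuring element should be larger than the objects of interest, otherwise the objects themselves are treated as background and removed.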

Image Thresholding

What is the best threshold value for dividing the intensity histogram into foreground and background pixels?

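[Figure: intensity histogram (x-axis: pixel values; y-axis: frequency) with two candidate threshold positions marked "Here?" and "Or here?"]

[Figure: Raw input image → Thresholded binary image (0: Background, 1: Objects) → Labeled objects (colored ROI: connected pixels)]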
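The last step in the figure, turning a binary image into labeled objects, groups connected foreground pixels under one label. A small sketch assuming NumPy and 4-connectivity (the connectivity is an assumption; the slides do not state it), with `label_objects` as an illustrative name:

```python
from collections import deque
import numpy as np

def label_objects(mask):
    """Give each 4-connected foreground region its own integer label."""
    mask = np.asarray(mask, dtype=bool)
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue                      # already part of a labeled object
        current += 1
        labels[start] = current
        queue = deque([start])
        while queue:                      # breadth-first flood fill
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < mask.shape[0] and 0 <= nc < mask.shape[1]
                        and mask[nr, nc] and not labels[nr, nc]):
                    labels[nr, nc] = current
                    queue.append((nr, nc))
    return labels, current

binary = np.array([[1, 1, 0, 0],
                   [0, 1, 0, 1],
                   [0, 0, 0, 1]])
labels, n_objects = label_objects(binary)
```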

Pixel-Based Image Classification

• For images where a threshold cannot be found…

• Machine-learning tools can be helpful, e.g., ilastik

– User manually labels regions of image

– A suite of features is used to distinguish regions and create a classifier

Sommer and Gerlich, JCS 2013, 126:1

Outline: Splitting object clusters

Separating Touching Objects

• Step 2: Distinguish multiple objects contained in the same foreground blob

• Once the foreground blobs have been identified, what next?

– Thresholding is not sufficient to separate clustered or touching objects

Watershed Segmentation

• Consider the image as a surface with basins…

http://www.svi.nl/watershed

Images from Carolina Wahlby

Separating Touching Objects

Identifying objects: Some options

– Intensity-based: Works best if objects are brighter at center, dimmer at edges

– Shape-based: Works best if objects have indentations where objects touch (esp. if objects are round)

[Figure: two touching objects (1, 2) distinguished by intensity peaks and by boundary indentations]
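For the intensity-based option, a typical first step is to find one intensity peak per object; those peaks can then seed a watershed or similar split. The sketch below covers only the peak-finding step, assuming NumPy, an odd neighborhood `size`, and an image smooth enough that each object has a single maximum. The name `find_intensity_peaks` is illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def find_intensity_peaks(image, mask, size=5):
    """Return (row, col) positions inside the mask that are the maximum
    of their size x size neighborhood."""
    image = np.asarray(image, dtype=float)
    pad = size // 2
    padded = np.pad(image, pad, mode="constant", constant_values=image.min())
    local_max = sliding_window_view(padded, (size, size)).max(axis=(-2, -1))
    peaks = (image == local_max) & np.asarray(mask, dtype=bool)
    return np.argwhere(peaks)

# Toy example: two overlapping Gaussian "nuclei" forming one foreground blob.
rows, cols = np.mgrid[0:15, 0:25]
blob = (np.exp(-((rows - 7) ** 2 + (cols - 7) ** 2) / 18.0)
        + np.exp(-((rows - 7) ** 2 + (cols - 17) ** 2) / 18.0))
blob_mask = blob > 0.3
seeds = find_intensity_peaks(blob, blob_mask, size=5)
```

Noisy images usually need smoothing first, otherwise every noise spike becomes a seed and the objects are over-split.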

Outline: Identifying cellular compartments

Identifying Cell Objects

• Nuclei more easily separated than cells

– DNA markers are specific

– Yield good foreground/background contrast

– Uniform shape

• Identifying cells is more difficult

– Available markers often lower contrast

– Unclear boundaries between cells, depending on the cell type and culture conditions

Secondary Object Identification

• “Growing” the primary objects to identify cell boundaries

• Use segmented nuclei as “seeds” by using a cell stain channel

• Some assays do not require precise cell ID

– E.g., is a protein located in nucleus or cytoplasm?

• Produce proxy cells by growing nuclei by N pixels if no cell stain available
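The proxy-cell idea can be sketched as growing each labeled nucleus outward by N pixels without letting one label overwrite another. This assumes NumPy, an integer label image with 0 as background, and 4-connected growth; the tie-breaking order where two nuclei meet is arbitrary, and `grow_labels` is an illustrative name.

```python
import numpy as np

def grow_labels(labels, n_pixels):
    """Expand each labeled object by n_pixels into the background only."""
    grown = np.asarray(labels).copy()
    for _ in range(n_pixels):
        padded = np.pad(grown, 1, mode="constant")
        # Labels of the pixel above, below, left, and right of each position.
        neighbours = (padded[:-2, 1:-1], padded[2:, 1:-1],
                      padded[1:-1, :-2], padded[1:-1, 2:])
        updated = grown.copy()
        for nb in neighbours:
            fill = (updated == 0) & (nb > 0)   # background next to an object
            updated[fill] = nb[fill]
        grown = updated
    return grown

nuclei = np.zeros((5, 9), dtype=int)
nuclei[2, 1] = 1
nuclei[2, 7] = 2
proxy_cells = grow_labels(nuclei, n_pixels=2)
```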

Identifying Subcellular Structures

• With appropriate markers, other subcellular compartments can be labeled

• These can be identified using the same methods already mentioned

• Consider using enclosing object as mask for better pre-processing, thresholding

• Make sure to assign subfeatures to enclosing objects

[Figure panels: Pre-processing → Sub-object ID → Sub-object relation]

Outline: Measurement extraction

Measuring Object Counts

• Most common readout

– # of cells per image/well

– # of organelles per image/well

– # of organelles per cell

• Number of objects per image/well is often a useful readout for QC purposes
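A count such as "# of organelles per cell" needs each organelle assigned to its enclosing cell. One way to sketch this, assuming NumPy and two integer label images of the same shape (0 is background), is to assign each child object to the parent label it overlaps most. The function name and the majority-overlap rule are assumptions for illustration.

```python
import numpy as np

def count_children_per_parent(parent_labels, child_labels):
    """Count child objects (e.g., organelles) per parent object (e.g., cell)."""
    parent_labels = np.asarray(parent_labels)
    child_labels = np.asarray(child_labels)
    counts = {int(p): 0 for p in np.unique(parent_labels) if p != 0}
    for child in np.unique(child_labels):
        if child == 0:
            continue
        overlap = parent_labels[child_labels == child]
        overlap = overlap[overlap > 0]
        if overlap.size == 0:
            continue                      # child lies outside every parent
        parent = int(np.bincount(overlap).argmax())
        counts[parent] += 1
    return counts

cells = np.zeros((6, 6), dtype=int)
cells[:, :3] = 1                          # cell 1 covers the left half
cells[:, 3:] = 2                          # cell 2 covers the right half
spots = np.zeros((6, 6), dtype=int)
spots[1, 1] = 1
spots[4, 1] = 2
spots[2, 4] = 3
spots_per_cell = count_children_per_parent(cells, spots)
```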

Measuring Object Morphology

• Reduce an aspect of object shape to a single value

• Example features

– Area: Pixel coverage of object

– Perimeter: Length of object boundary

– Eccentricity: Object “oblongness”

– Major, minor axis length: Object elongation

– Form factor: Measure of compactness

– Zernike features

• Objects touching the image border should be excluded if shape is important

http://www.perkinelmer.co.uk/
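A few of these shape features can be sketched for a single object mask, assuming NumPy. The perimeter here is the count of exposed pixel edges, which is only one of several possible estimates, and form factor is taken as 4πA/P² (1 for an ideal circle); both choices and the function names are assumptions for illustration.

```python
import math
import numpy as np

def area_and_perimeter(mask):
    """Area in pixels and perimeter as the number of exposed pixel edges."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant")
    neighbours = (padded[:-2, 1:-1], padded[2:, 1:-1],
                  padded[1:-1, :-2], padded[1:-1, 2:])
    perimeter = sum(int((mask & ~nb).sum()) for nb in neighbours)
    return int(mask.sum()), perimeter

def form_factor(area, perimeter):
    """Compactness measure: 4 * pi * area / perimeter**2."""
    return 4.0 * math.pi * area / perimeter ** 2

def touches_border(mask):
    """True if the object reaches the image edge and its shape may be cut off."""
    mask = np.asarray(mask, dtype=bool)
    return bool(mask[0].any() or mask[-1].any()
                or mask[:, 0].any() or mask[:, -1].any())

square = np.zeros((8, 8), dtype=bool)
square[2:6, 2:6] = True
area, perimeter = area_and_perimeter(square)
compactness = form_factor(area, perimeter)
```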

Measuring Object Intensity

• Related to the amount of marker at a pixel location

• Example features

– Integrated (total) intensity: Sum of the object pixel intensities ∝ amount of substance labeled

– Mean, median, standard deviation intensities

– Lower/upper intensity quartiles

– Correlation coefficients between channels: Colocalization

• Make sure to apply illumination correction beforehand

Images courtesy of Ilya Ravkin
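The intensity features listed on this slide can be sketched for one object as follows, assuming NumPy and an illumination-corrected image. The dictionary keys and function names are illustrative, and Pearson correlation is used as the colocalization coefficient.

```python
import numpy as np

def intensity_features(image, mask):
    """Summary intensity statistics over the pixels of one object."""
    values = np.asarray(image, dtype=float)[np.asarray(mask, dtype=bool)]
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    return {
        "integrated": float(values.sum()),
        "mean": float(values.mean()),
        "median": float(median),
        "std": float(values.std()),
        "lower_quartile": float(q1),
        "upper_quartile": float(q3),
    }

def channel_correlation(channel_a, channel_b, mask):
    """Pearson correlation between two channels within one object."""
    mask = np.asarray(mask, dtype=bool)
    a = np.asarray(channel_a, dtype=float)[mask]
    b = np.asarray(channel_b, dtype=float)[mask]
    return float(np.corrcoef(a, b)[0, 1])

channel_1 = np.arange(16, dtype=float).reshape(4, 4)
channel_2 = 2.0 * channel_1 + 1.0          # perfectly colocalized toy channel
object_mask = np.zeros((4, 4), dtype=bool)
object_mask[1:3, 1:3] = True               # pixels with values 5, 6, 9, 10
features = intensity_features(channel_1, object_mask)
colocalization = channel_correlation(channel_1, channel_2, object_mask)
```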

Measuring Object Texture

• Determine whether the staining pattern is smooth or coarse at a particular scale

• Selecting the appropriate texture scale

– Larger scale: Larger (coarser) patterns of texture

– Smaller scale: More localized (finer) patterns of texture

Virus Texture Dataset, http://www.cb.uu.se/~gustaf/virustexture/

Moffat et al., Cell, 2006, 124:1283
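To illustrate how the scale changes what a texture feature sees, the sketch below computes a co-occurrence contrast directly as the mean squared intensity difference between pixel pairs separated by `scale` pixels. It assumes NumPy, a horizontal offset only, and no gray-level quantization, which are simplifications compared with full co-occurrence (Haralick-style) features.

```python
import numpy as np

def cooccurrence_contrast(image, scale):
    """Mean squared difference between pixels `scale` columns apart."""
    img = np.asarray(image, dtype=float)
    diff = img[:, scale:] - img[:, :-scale]
    return float(np.mean(diff ** 2))

# Fine texture: checkerboard that alternates every pixel.
fine = np.indices((8, 8)).sum(axis=0) % 2
# Coarse texture: vertical stripes four pixels wide.
coarse = ((np.arange(16) // 4) % 2)[None, :] * np.ones((8, 1))

fine_at_1 = cooccurrence_contrast(fine, scale=1)
fine_at_2 = cooccurrence_contrast(fine, scale=2)
coarse_at_1 = cooccurrence_contrast(coarse, scale=1)
coarse_at_4 = cooccurrence_contrast(coarse, scale=4)
```

The fine pattern gives high contrast at scale 1 and none at scale 2, while the coarse stripes only show their full contrast at scale 4, which is why the scale should be matched to the size of the staining pattern.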

Measuring Location

• Cell or organelle location within image may be meaningful

• Example features

– Distance from organelle to nucleus, cell membrane

– Change in position often important in time-lapse imaging

Miller et al., PNAS 2003

Battich et al., Nat Meth 2013
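A distance feature such as "organelle to nucleus" can be sketched as the distance between object centroids, assuming NumPy and one boolean mask per object. Measuring centroid to centroid is an assumption; distance to the nearest nuclear or membrane boundary pixel is another reasonable definition.

```python
import math
import numpy as np

def centroid(mask):
    """Mean (row, col) position of the object's pixels."""
    rows, cols = np.nonzero(mask)
    return float(rows.mean()), float(cols.mean())

def centroid_distance(mask_a, mask_b):
    """Euclidean distance in pixels between two object centroids."""
    return math.dist(centroid(mask_a), centroid(mask_b))

nucleus = np.zeros((10, 10), dtype=bool)
nucleus[2:5, 2:5] = True                   # centroid at (3, 3)
organelle = np.zeros((10, 10), dtype=bool)
organelle[7, 6] = True                     # centroid at (7, 6)
distance = centroid_distance(organelle, nucleus)
```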

Time-Lapse Analysis

• Very sensitive to problems in object identification

• GIGO (garbage in, garbage out): Assay development, image acquisition must be optimized for tracking success

• Take note of mis-segmentations, especially for cell cycle, lineage studies

• Software

– Bitplane Imaris, PerkinElmer Volocity, Molecular Devices Metamorph

– CellProfiler, FIJI, etc.

Schmitz et al. Nat Cell Biol 2010, 12:886
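To show why tracking depends so heavily on segmentation quality, here is a deliberately simple linking sketch in plain Python: objects in consecutive frames are paired greedily by nearest centroid, subject to a maximum displacement. The greedy rule, `max_distance`, and the function name are assumptions for illustration; the packages listed above use more robust methods.

```python
import math

def link_nearest(prev_centroids, next_centroids, max_distance):
    """Greedily pair each object in one frame with the closest unused
    object in the next frame, ignoring pairs farther than max_distance."""
    candidates = sorted(
        (math.dist(p, q), i, j)
        for i, p in enumerate(prev_centroids)
        for j, q in enumerate(next_centroids)
    )
    links, used_prev, used_next = {}, set(), set()
    for distance, i, j in candidates:
        if distance > max_distance:
            break                          # remaining pairs are even farther
        if i in used_prev or j in used_next:
            continue
        links[i] = j
        used_prev.add(i)
        used_next.add(j)
    return links

frame_1 = [(10.0, 10.0), (50.0, 50.0)]
frame_2 = [(52.0, 49.0), (11.0, 12.0), (90.0, 90.0)]
links = link_nearest(frame_1, frame_2, max_distance=5.0)
```

In this toy case the third object in the second frame stays unlinked; in practice an unlinked object may be a new cell, a division, or a mis-segmentation, which is why such events deserve manual review.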

Measuring Clustering

• Characterization of spatial relationships between objects

• Example features

– Number of neighboring objects

– Percent of the perimeter touching neighbor objects

– Distance to the nearest neighbor

http://www.perkinelmer.co.uk/
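Two of these features can be sketched from object centroids alone, assuming NumPy. Counting neighbors within a fixed `radius` is one possible definition of "neighboring" (an assumption here); the perimeter-touching feature needs the label image and is not covered.

```python
import numpy as np

def neighbor_features(centroids, radius):
    """Per object: distance to the nearest other object, and the number
    of other objects whose centroid lies within `radius`."""
    points = np.asarray(centroids, dtype=float)
    diffs = points[:, None, :] - points[None, :, :]
    distances = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(distances, np.inf)    # ignore each object's self-distance
    nearest = distances.min(axis=1)
    n_neighbors = (distances <= radius).sum(axis=1)
    return nearest, n_neighbors

cell_centroids = [(0, 0), (3, 4), (0, 10), (100, 100)]
nearest, n_neighbors = neighbor_features(cell_centroids, radius=6.0)
```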

Combinations of Measurements

• Phenotype identification may be difficult if hand-selecting from a limited measurement set

• Machine learning (ML) approaches can identify phenotypes from a combination of measurements

Sommer and Gerlich, JCS 2013, 126:1

• Some measurements (e.g., texture) are hard to interpret as readouts but are excellent fodder for ML approaches to downstream analysis

– See ML advanced elective for more

Outline: Statistical analysis

Quality Control

– Focus imperfections, incorrect exposures, background problems, artifacts

– Identify, eliminate systematic aberrations

[Figure panels: Focal blur; Saturation artifact]

• Ideally, QC should be performed at beginning of workflow

• Use automated measures, with option of manual vetting

– Machine learning approaches can be useful here

Sommer and Gerlich, JCS 2013, 126:1
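Two automated QC measures for the artifacts named on this slide can be sketched with NumPy. The variance of the Laplacian is used as a focus score and the fraction of pixels at the detector maximum as a saturation score; both are common choices but are assumptions here, as the slides do not name specific metrics, and any cutoffs would need to be set per assay.

```python
import numpy as np

def focus_score(image):
    """Variance of a 4-neighbor Laplacian; low values suggest focal blur."""
    img = np.asarray(image, dtype=float)
    laplacian = (img[:-2, 1:-1] + img[2:, 1:-1]
                 + img[1:-1, :-2] + img[1:-1, 2:]
                 - 4.0 * img[1:-1, 1:-1])
    return float(laplacian.var())

def saturated_fraction(image, max_value):
    """Fraction of pixels at or above the detector's maximum value."""
    return float(np.mean(np.asarray(image) >= max_value))

sharp = (np.indices((16, 16)).sum(axis=0) % 2) * 255   # high-detail pattern
blurred = np.full((16, 16), 127.5)                     # detail fully lost
sharp_score = focus_score(sharp)
blurred_score = focus_score(blurred)
saturation = saturated_fraction(sharp, max_value=255)
```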

Data Analysis

• What does this data set look like?

• Cytological profile, or Cytoprofile

• Shows all the measurements acquired

– For each individual cell

– In every image

– In the entire experiment.

[Figure: cytoprofile heatmap with color scale from -1 to +1; example row for cell #6111617 with values -.2, .7, -.1, 0, .2, -.9]

Data Normalization

• Used to remove systematic errors from the data

• Allows comparison of screening runs from different plates, acquisition times, etc.

• Ideally, results in:

– Similar measurement ranges observed across different wells with the same treatment

– Similar measurement distributions of the controls (positive or negative)

– Keep in mind the recommendations from Assay Development section!

• Common approaches

– % of control: Divide by mean of corresponding measurement from control

– % of samples: Divide by mean of corresponding measurement from all samples

– Z-score, robust Z-score: Transform to zero mean/median, unit variance/MAD

• Alternative approach: Normalized value = percentile within rank-ordered data
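The normalization approaches above can be sketched with NumPy as follows. Function names are illustrative; the robust Z-score divides by the raw MAD as stated on the slide (many implementations additionally multiply the MAD by 1.4826), and the percentile version assumes there are no tied values.

```python
import numpy as np

def percent_of_control(values, controls):
    """Express each value as a percentage of the control mean."""
    return 100.0 * np.asarray(values, dtype=float) / np.mean(controls)

def z_score(values, reference):
    """Zero mean, unit standard deviation relative to the reference set."""
    reference = np.asarray(reference, dtype=float)
    return (np.asarray(values, dtype=float) - reference.mean()) / reference.std(ddof=1)

def robust_z_score(values, reference):
    """Zero median, unit MAD relative to the reference set."""
    reference = np.asarray(reference, dtype=float)
    median = np.median(reference)
    mad = np.median(np.abs(reference - median))
    return (np.asarray(values, dtype=float) - median) / mad

def percentile_rank(values):
    """Rank-based normalization: 0 for the smallest value, 100 for the largest."""
    values = np.asarray(values, dtype=float)
    ranks = values.argsort().argsort()
    return 100.0 * ranks / (len(values) - 1)

controls = [1.0, 2.0, 3.0, 4.0, 100.0]     # one outlier well among the controls
poc = percent_of_control([11.0], [20.0, 24.0])
z = z_score([22.0], controls)
robust_z = robust_z_score([100.0], controls)
ranks = percentile_rank([10.0, 30.0, 20.0])
```

The single outlier shifts the control mean to 22, while the median stays at 3, which is the usual argument for the robust variant when controls may contain artifacts.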


Statistical Analysis Software

• Spreadsheets (e.g., Microsoft Excel)

– Widely used because of familiarity

– Unable to handle large screening datasets

– Lack sophisticated analysis methods

• HCS/HTS microscope vendors often bundle data-analysis functionality with hardware, image-analysis software

http://www.essenbioscience.com

Statistical Analysis Software

• Specialized commercial tools

– Wide variety of products

– Often bundled with hardware

– Talk to vendors for more details

• Open-source tools

– KNIME

– CellProfiler Analyst

– Weka

– Bioconductor

• Custom scripts

– MATLAB

– R

– Python

Not comprehensive!

Summary: Fundamental Steps

Knowledge about the application!

• Image acquisition

• Preprocessing

• Object detection, segmentation (including 3D and tracking over time)

• Making measurements, feature extraction: LENGTH, WIDTH, CURVATURE, TEXTURE…

• Object classification, interpretation, recognition

• Result

Additional Resources

• Introduction to the Quantitative Analysis of Two-Dimensional Fluorescence Microscopy Images for Cell-Based Screening

– Ljosa and Carpenter, PLoS Computational Biology, 5(12), 2009

– DOI: 10.1371/journal.pcbi.1000603

• Biological imaging software tools

– Eliceiri et al, Nat Meth, 9(7), 2012

– DOI: 10.1038/nmeth.2084

• Assay Guidance Manual

– Introduction: http://www.ncbi.nlm.nih.gov/books/NBK100913

– Advanced methods: http://www.ncbi.nlm.nih.gov/books/NBK126174

Summary: The HCS Laboratory


[Diagram: HCS laboratory components]

• Plate Handler Robot

• HCS Imager

• Instrument Control Workstation

• Plate Visualization / Image Analysis Workstations

• Image Analysis Computer Cluster

• Data Management System

• Network File Server

• Network

The Wet Lab: Reagents, protocols, assay optimization

Hardware and Image Acquisition

Assay Types and Assay Development

Image and Data Analysis

* An Introduction To High Content Screening And Analysis Techniques: Practical Advice and Examples, Haney, S.A., Bowman, D., Chakravarty, A., Davies, A. and Shamu, C.E. John Wiley Press, NY, NY (in production)
