
Visualization and analysis of large data collections: a case study applied to confocal microscopy data

Wim de Leeuw, Swammerdam Institute for Life Sciences, Amsterdam

Pernette Verschure, Swammerdam Institute for Life Sciences, Amsterdam

Robert van Liere, Center for Mathematics and Computer Science, Amsterdam

Motivation (1):

Context: cell biology experiments

Phenomenon captured using digital microscopy

Experiment characteristics:

• Biological diversity

• Not all biological parameters can be controlled

Many measurements needed

Motivation (2):

Visualization and analysis of collections of data sets

• High variability

• Non-trivial information extraction (e.g. segmentation)

• Noise

Visualization Modes: Interactive vs Batch

• Interactive control and feedback vs static parameter settings

• Time-consuming vs multiple data sets processed simultaneously

Aim: combine advantages of Interactive and Batch Visualization

Agenda

Biological Problem

• Chromatin structure and gene control

Visualization Problem

• Data collection description

• Analysis with visual summaries

Chromatin Structure and Gene Control

Chromatin Structure

• Low level: DNA, nucleosomes, 30 nm fiber

• High level: fiber folding

Gene control

• Regulation of gene activity

Biological research question:

• Relation between chromatin structure and gene control

• Is there a relation, what is it, when does it occur, etc.

Experiment

Question: what is the influence of Heterochromatin Protein 1 (HP1) on chromatin structure?

Approach:

• Prepare collection of cells with a specific region

• Control group: target GFP to the region

• HP1 group: target GFP/HP1 to the region

• Observe regions with confocal microscope

Data analysis question:

• Identify and quantify the differences between control and HP1 group

Collection of data sets

60 data sets (30 control group, 30 HP1 group)

Each data set: 512 x 512 x 32

Sample images:

• Control group (left)

• HP1 group (right)

Data analysis questions:

• Accurately detect region of interest

• Quantify region attributes (size, roughness, roundness, etc.)

• What are the attribute differences between the control and HP1 groups?
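
The first two analysis questions can be sketched in code. The example below is an illustration only, not the method used in the study: it assumes a single 2D slice stored as nested lists, a fixed intensity threshold, and that the region of interest is the largest 4-connected bright component. The function name `largest_region` is hypothetical.

```python
from collections import deque


def largest_region(image, threshold):
    """Return the (row, col) pixels of the largest 4-connected region
    whose intensity is at least `threshold` (assumed segmentation rule)."""
    rows, cols = len(image), len(image[0])
    seen = set()
    best = set()
    for r in range(rows):
        for c in range(cols):
            if image[r][c] < threshold or (r, c) in seen:
                continue
            # Breadth-first flood fill starting from an unvisited bright pixel.
            region = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                region.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and image[ny][nx] >= threshold):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    return best


# Toy slice: one 2x2 bright blob and one isolated bright pixel.
slice_2d = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 7],
    [0, 0, 0, 0, 0],
]
region = largest_region(slice_2d, threshold=5)
size = len(region)  # region size in pixels
```

Other attributes such as roughness or roundness would be computed from the same pixel set; their exact definitions are not given in the slides.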

Diversity of the collection

Interactive Visualization of Collection

Advantages

• Control over visualization tools and parameters

• Segmentation

• Attribute computations

• Direct feedback

Disadvantages

• Laborious

• Error prone

Batch processing of collection

Advantages

• All sets are processed automatically

• A-priori parameter settings

Disadvantage

• No feedback on the process

Visual Summaries

Definition: a user-defined compact visual representation of the data during (batch) processing

Governing idea: the visual summary is used to visualize the steps in the batch process

Examples:

General strategy:

• Interactive setup (determine parameters, attributes, etc.)

• Batch processing using setup

• Information visualization with visual summaries
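
The general strategy can be sketched as a small batch loop. This is a minimal sketch under stated assumptions, not the Argos implementation: the visual summary is assumed to be a maximum-intensity projection of the z-stack, the only parameter fixed during interactive setup is an intensity threshold, and the names `max_projection` and `run_batch` are hypothetical.

```python
def max_projection(volume):
    """Collapse a z-stack (list of 2D slices) into one 2D image by taking
    the maximum intensity along z for every pixel."""
    rows, cols = len(volume[0]), len(volume[0][0])
    return [[max(sl[r][c] for sl in volume) for c in range(cols)]
            for r in range(rows)]


def run_batch(datasets, threshold):
    """Apply the same a-priori threshold to every data set and keep, per
    data set, a compact summary image plus the derived attribute."""
    results = []
    for name, volume in datasets.items():
        projection = max_projection(volume)            # visual summary (assumed form)
        mask = [[v >= threshold for v in row] for row in projection]
        size = sum(sum(row) for row in mask)           # number of pixels above threshold
        results.append({"name": name, "summary": projection,
                        "mask": mask, "size": size})
    return results


# Toy collection: one data set with two 2x2 slices.
datasets = {"set_a": [[[1, 2], [3, 4]], [[5, 0], [0, 9]]]}
results = run_batch(datasets, threshold=4)
```

Keeping the summary and the intermediate mask next to each attribute value is what lets a later plot link a point back to the step that produced it.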

User Interface using Visual Summaries

Discriminating groups

Red: HP1 sets, Green: control

Region granularity vs number of spots in region

Granularity attribute

• Average intensity gradient of region

Plot tells us:

• Large variation, some outliers

• HP1 and control seem different
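
The granularity attribute (average intensity gradient of the region) can be sketched as follows. The slides do not specify the gradient operator, so this sketch assumes central differences on a 2D slice and skips region pixels on the image border; the function name `granularity` is hypothetical.

```python
import math


def granularity(image, region):
    """Mean gradient magnitude over the region's pixels, using central
    differences. Pixels on the image border are skipped (assumption)."""
    rows, cols = len(image), len(image[0])
    total, count = 0.0, 0
    for r, c in region:
        if 0 < r < rows - 1 and 0 < c < cols - 1:
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            total += math.hypot(gx, gy)
            count += 1
    return total / count if count else 0.0


# A horizontal ramp with slope 2 per pixel, and a flat image for comparison.
ramp = [[2 * c for c in range(3)] for _ in range(3)]
flat = [[5, 5, 5] for _ in range(3)]
g_ramp = granularity(ramp, {(1, 1)})
g_flat = granularity(flat, {(1, 1)})
```

A smooth region gives a value near zero, while a region made of many compact spots gives a high value because intensity changes sharply between neighbouring pixels.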

Large variation, some outliers

Brush / link outliers

• Investigate visual summary

Problems with data set

• Corrupt data
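
Brushing is an interactive operation, but the selection step can be approximated in code. The sketch below assumes an interquartile-range rule as a stand-in for the manual brush and returns the names of the flagged data sets so their visual summaries can be inspected; the rule, the factor `k`, and the name `flag_outliers` are assumptions, not part of the original system.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return the sorted names of data sets whose attribute value lies
    outside [Q1 - k*IQR, Q3 + k*IQR] (assumed stand-in for brushing)."""
    q1, _, q3 = statistics.quantiles(list(values.values()), n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sorted(name for name, v in values.items() if v < low or v > high)


# Hypothetical granularity values per data set; "s8" is far from the rest.
granularities = {"s1": 0.9, "s2": 0.95, "s3": 1.0, "s4": 1.0,
                 "s5": 1.05, "s6": 1.1, "s7": 1.15, "s8": 10.0}
suspects = flag_outliers(granularities)
```

Each flagged name is then looked up in the batch results to display its visual summary, which is how a corrupt data set would be recognised.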

HP1 and control seem different

Further analysis

• Histograms

• Box plots

Statistical tests

• Wilcoxon

The Wilcoxon test tells us that there is indeed a significant difference
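
The slides name only "Wilcoxon". Because the control and HP1 groups are independent samples, the sketch below assumes the Wilcoxon rank-sum variant with a normal approximation, which is reasonable for groups of about 30. Ties receive average ranks but the variance is not tie-corrected; the function name `rank_sum_test` is hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).
    Returns (W, z, p), where W is the rank sum of sample x."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = avg
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return w, z, p


# Toy samples, not data from the study.
w_sep, z_sep, p_sep = rank_sum_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
w_mix, z_mix, p_mix = rank_sum_test([1, 3, 5, 7], [2, 4, 6, 8])
```

Clearly separated samples give a small p-value, while interleaved samples do not; in practice a statistics library would be used instead of this hand-written version.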

Lessons learned

The significant difference in granularity vs number of spots shows that HP1 affects the structure of chromatin: chromatin is condensed into a number of compact regions.

Biologically significant result; two papers published.

Strategy for analysis of collections of confocal data sets

• Interactive visualization and batch processing are both needed

• Information visualization is used for the analysis of batch output

• Visual summaries are used to link back to the original data set or to previous steps in the batch process

Strategy has been implemented as the Argos system

Generality

Argos has been used for the analysis of an experiment consisting of 2500+ confocal data sets

Argos has been used for the analysis of microarray data