Alternative Storage Regions of Interest

Post on 19-Dec-2015


Page 1: Regions of Interest.  What’s in a ROI?  Use cases  Requirements  Current Storage System  Problems  Alternative Storage

Alternative Storage Regions of Interest


Overview

What’s in a ROI?
Use cases
Requirements
Current Storage System
Problems
Alternative Storage


What’s in an ROI?

ROI
Geometry
Measurements
ROI on Channel
Annotations
  ▪ ROI
  ▪ Measurement
  ▪ Links
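
To make the parts listed on this slide concrete, here is a minimal Python sketch of one way they could fit together. The class and field names (`Annotation`, `Measurement`, `Shape`, `ROI`, `linked_roi_ids`) and the sample measurement value are assumptions made for illustration; they are not the OMERO object model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Annotation:
    """A tag plus the namespace it belongs to."""
    tag: str
    namespace: str


@dataclass
class Measurement:
    """One named value computed for a shape; it can carry its own annotations."""
    name: str
    value: float
    annotations: List[Annotation] = field(default_factory=list)


@dataclass
class Shape:
    """Geometry on one plane; channel is None when the shape is not channel-specific."""
    kind: str                     # e.g. "Ellipse"
    params: Dict[str, float]      # e.g. {"cx": 3, "cy": 75, "rx": 17, "ry": 17}
    z: int = 0
    t: int = 0
    channel: Optional[int] = None
    measurements: List[Measurement] = field(default_factory=list)


@dataclass
class ROI:
    """An ROI groups shapes and carries annotations and links to other ROIs."""
    id: int
    shapes: List[Shape] = field(default_factory=list)
    annotations: List[Annotation] = field(default_factory=list)
    linked_roi_ids: List[int] = field(default_factory=list)


roi = ROI(id=565)
roi.shapes.append(Shape("Ellipse", {"cx": 3, "cy": 75, "rx": 17, "ry": 17}, channel=0))
roi.shapes[0].measurements.append(Measurement("mean_intensity", 42.0))  # made-up value
roi.annotations.append(Annotation("foo1", "bob"))
```

Annotations appear at both the ROI and the measurement level, which is the part the later slides report as awkward to store.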


Use Cases

User created ROI
  ▪ Measurement tools
HCS generated ROI
  ▪ Automatic
  ▪ External
External analysis
  ▪ Particle Tracking
  ▪ Other
Templates
  ▪ ROIs without images


Use Cases – Human Generated
More interactions
  ▪ Merge, Propagate, Split, Delete
Measurements
  ▪ Geometry
  ▪ Intensity
  ▪ Path
ROI/ROI Links
Tags mostly on ROI
Write Many/Read Many
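
Two of the interactions named on this slide, Propagate and a Geometry measurement, can be sketched in a few lines of Python. The function names and the dict layout are assumptions chosen for illustration (the field names follow the ellipse documents used later in the deck); this is not how any particular client implements them.

```python
import math
from copy import deepcopy


def ellipse_area(shape):
    """Geometry measurement: area of an ellipse from its two radii."""
    return math.pi * shape["rx"] * shape["ry"]


def propagate(shape, z_planes):
    """Copy a shape onto each requested z-plane, leaving the original untouched."""
    copies = []
    for z in z_planes:
        clone = deepcopy(shape)
        clone["z"] = z
        copies.append(clone)
    return copies


ellipse = {"type": "Ellipse", "cx": 3, "cy": 75, "rx": 17, "ry": 17, "z": 0, "t": 0}
stack = propagate(ellipse, range(1, 4))       # planes z = 1, 2, 3
areas = [ellipse_area(s) for s in stack]      # one measurement per propagated shape
```

Each propagation writes new shapes and new measurements, which is why this use case is described as write many, read many.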


Use Cases - HCS

HCS Generated ROI
Lots of ROI
Attached to Channel
Measurements Attached
  ▪ Multiple measurements
Tags on ROI, Measurements
  ▪ Analysis, results and meta.
Write Once, Read Many


Use Cases – External Tools

External Tool can Generate ROI (+ scripts)
Can be tagged
Links (ROI/ROI, ROI/Image)
Results can be in any format


Use Cases - Templates

ROI need not be attached to image
Template to define other ROI


ROI from the Nth Dimension

N-Dimensional Data
Storage of Image data simple
ROI more complex
  ▪ Database entry, file format
We don’t just want to store in HDF


Current Storage Solutions

Database ROI ROI Annotations

PyTables Mask ROI Measurements


Current Status

PyTables
ROI are heterogeneous
Concurrency
Python behind a core service call
Measurements are optimal
Tagging is an issue
  ▪ Inside file
  ▪ Multiple annotations reported to be slow


Database

ROI can be stored in database
Mask data can be an issue
Tagging in RDB not best
Many more annotations than we’d like
Link to external source for measurements


Alternative Storage

Key-Value Pair Stores
  ▪ Berkeley DB
  ▪ Project Voldemort
  ▪ Tokyo Cabinet
Document DB
  ▪ MongoDB
  ▪ CouchDB
Graph DB
  ▪ Neo4J
  ▪ InfoGrid
Table DB
  ▪ Cassandra
  ▪ Hypertable
  ▪ HBase


MongoDB

Document Database
NOSQL movement
Schemaless
No Tables
  ▪ Collections of like data
No Joins
  ▪ Document is equivalent of row of data
  ▪ Distributed file system (GridFS)


MongoDB – Pros and Cons
Pros
  ▪ It has bindings to numerous languages (C++, C#, Java, Python, ...).
  ▪ Allows storage, indexing, linking of any user data
  ▪ Annotations are now very easy, efficient
  ▪ Has mechanisms for schema upgrade
  ▪ Dynamic Queries
  ▪ Replication
  ▪ Sharding.
  ▪ Map-Reduce framework.
  ▪ Fast.
  ▪ GridFS is a distributed file storage mechanism within Mongo.
  ▪ Easy to install
Cons
  ▪ Schemaless, data integrity will need to be worked on.
  ▪ Graph structures not inherently supported.


MongoDB - Deployments

SourceForge: http://sourceforge.net/
BusinessInsider: http://www.businessinsider.com/
New York Times: http://www.nytimes.com/
Disqus: http://www.disqus.com/


MongoDB – ROI Use cases

Human Interaction

Merge, Propagate, Split ✓

Geometry ✓

Intensity ✓

Path ✓

ROI/ROI Links ✓

Tags ✓

HCS

Many ROI ✓

Tags on ROI ✓

Tags on Measurement ✓

Tables of Measurements ✓

Externally Generated

Tags ✓

ROI/ROI Links, ROI/Image Links

Many formats, unknown types ✓

Other

N-Dimensional ROI ✓

Hierarchical Structures ✓


MongoDB – Example insert

from pymongo import MongoClient  # called Connection() in pymongo releases before 3.0

connection = MongoClient()
db = connection['databaseName']
collection = db['collectionName']

# One ROI document with its shapes and their tags embedded
collection.insert_one({
    "tags": [],
    "label": "MyROI",
    "shapes": [
        {"tags": [{"tag": "foo1", "namespace": "bob"}],
         "rx": 17, "ry": 17, "label": None, "cy": 75, "cx": 3,
         "t": 0, "z": 0, "type": "Ellipse", "id": 3},
        {"tags": [{"tag": "foo2", "namespace": "bob"}],
         "rx": 10, "ry": 16, "label": None, "cy": 82, "cx": 45,
         "t": 0, "z": 0, "type": "Ellipse", "id": 5},
    ],
    "type": "Roi",
    "id": 565,
})


MongoDB – Example query

import re
from pymongo import MongoClient  # called Connection() in pymongo releases before 3.0

connection = MongoClient()
db = connection['databaseName']
collection = db['collectionName']

# Find ROIs whose shapes have a tag containing "mitosis" (case-insensitive)
collection.find({"shapes.tags.tag": re.compile("mitosis", re.IGNORECASE)})

# Find ROIs with tag "foofoo" that have shapes with tag "foo1"
collection.find({"shapes.tags.tag": "foo1", "tags.tag": "foofoo"})
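
The dotted keys in these queries reach through nested arrays: `shapes.tags.tag` matches when any tag of any shape satisfies the condition. The pure-Python sketch below imitates that behaviour on a plain dict so it can be tried without a server. The helpers `values_at` and `matches` are hypothetical names written for this illustration, and they cover only equality and regular-expression conditions, not the full MongoDB query language.

```python
import re


def values_at(node, path):
    """Yield every value reachable from node along a dotted path,
    fanning out over lists on the way."""
    if not path:
        if isinstance(node, list):
            yield from node
        else:
            yield node
        return
    if isinstance(node, list):
        for item in node:
            yield from values_at(item, path)
    elif isinstance(node, dict) and path[0] in node:
        yield from values_at(node[path[0]], path[1:])


def matches(doc, query):
    """True if each dotted key has at least one value satisfying its condition."""
    for dotted, cond in query.items():
        found = values_at(doc, dotted.split("."))
        if isinstance(cond, re.Pattern):
            ok = any(isinstance(v, str) and cond.search(v) for v in found)
        else:
            ok = any(v == cond for v in found)
        if not ok:
            return False
    return True


roi = {
    "id": 565, "type": "Roi",
    "tags": [{"tag": "foofoo", "namespace": "bob"}],
    "shapes": [
        {"id": 3, "type": "Ellipse", "tags": [{"tag": "foo1", "namespace": "bob"}]},
        {"id": 5, "type": "Ellipse", "tags": [{"tag": "Mitosis-late", "namespace": "bob"}]},
    ],
}

both = matches(roi, {"shapes.tags.tag": "foo1", "tags.tag": "foofoo"})
fuzzy = matches(roi, {"shapes.tags.tag": re.compile("mitosis", re.IGNORECASE)})
wrong_level = matches(roi, {"tags.tag": "foo1"})   # foo1 is a shape tag, not an ROI tag
```

Here `both` and `fuzzy` are true while `wrong_level` is false, because the tag `foo1` lives on a shape and not on the ROI itself.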


Neo4J

Graph Database
Uses nodes to represent objects
User specifies relationship between nodes
Allows complex traversal of node structures


Neo4J – Pros and Cons

Pros
  ▪ Handles graph structures nicely
  ▪ Transactional
  ▪ Supported by Gremlin
  ▪ Native RDF: http://components.neo4j.org/neo-rdf-sail/
  ▪ Easy to install
Cons
  ▪ No C++ language binding.
  ▪ Not distributed.
  ▪ Tables are not so easily modeled.
  ▪ Difficult to query on node contents


Neo4J - Deployments

The Swedish Defence forces: http://www.mil.se
Windh Technologies: http://www.windh.com
Flextoll: http://www.flextoll.se


Neo4J - Example

public enum OMERORelations implements RelationshipType {
    ASSOCIATE, DERIVE, AGGREGATE, COMPOSE
}

Node image = neo.createNode();
image.setProperty("IObject", imageI);
image.setProperty("id", imageI.getId().getValue());
image.setProperty("name", imageI.getName().getValue());

Node derivedImage = neo.createNode();
derivedImage.setProperty("IObject", derivedImageI);
derivedImage.setProperty("id", derivedImageI.getId().getValue());
derivedImage.setProperty("name", derivedImageI.getName().getValue());

Relationship relationship = image.createRelationshipTo(derivedImage, OMERORelations.DERIVE);
relationship.setProperty("type", "ROI");
relationship.setProperty("operation", "crop");
relationship.setProperty("roi", cropRoiI);
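
The point of recording a typed relationship such as DERIVE is that it can later be traversed. The sketch below shows that idea in plain Python with a tiny in-memory property graph. The `Graph` class, its method names and the sample file names are assumptions written for this illustration; none of it is the Neo4J API.

```python
from collections import deque


class Graph:
    """A tiny in-memory property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}   # node id -> property dict
        self.edges = []   # (start id, relationship type, end id, property dict)

    def create_node(self, node_id, **props):
        self.nodes[node_id] = props
        return node_id

    def relate(self, start, rel_type, end, **props):
        self.edges.append((start, rel_type, end, props))

    def traverse(self, start, rel_type):
        """Breadth-first walk along outgoing relationships of one type."""
        seen, queue, order = {start}, deque([start]), []
        while queue:
            current = queue.popleft()
            for s, t, e, _ in self.edges:
                if s == current and t == rel_type and e not in seen:
                    seen.add(e)
                    order.append(e)
                    queue.append(e)
        return order


g = Graph()
g.create_node(1, name="original.tif")
g.create_node(2, name="crop.tif")
g.create_node(3, name="crop-of-crop.tif")
g.relate(1, "DERIVE", 2, type="ROI", operation="crop")
g.relate(2, "DERIVE", 3, type="ROI", operation="crop")
g.relate(1, "ASSOCIATE", 3)

derived = g.traverse(1, "DERIVE")   # every image derived, directly or not, from image 1
```

The traversal follows only DERIVE relationships, so the ASSOCIATE link from image 1 to image 3 is ignored and image 3 is reached through image 2.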


Neo4J – ROI Use cases

Human Interaction

Merge, Propagate, Split ✓

Geometry

Intensity

Path ✓

ROI/ROI Links ✓

Tags

HCS

Many ROI ✓

Tags on ROI ✓

Tags on Measurement ✓

Tables of Measurements

Externally Generated

Tags ✓

ROI/ROI Links, ROI/Image Links ✓

Many formats, unknown types

Other

N-Dimensional ROI

Hierarchical Structures ✓


Cassandra

An implementation of Google’s BigTable design: a complex implementation of a key/value store used to represent a table.

A sophisticated toolset is required to get the most out of this kind of solution; for instance, Google has created Sawzall to query its system. Digg has released Lazyboy, a Python library for working with Cassandra.

It works by creating a table whose columns are grouped into column families; like data lives in the same column family (for example, Ellipse ROI).
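
The column-family idea can be pictured as nested key/value maps: row key, then column family, then column name. The Python sketch below models only that shape. The row-key scheme (`roi:565:shape:3`), the helper names and the second sample shape are assumptions for illustration, and nothing here talks to a real Cassandra cluster.

```python
from collections import defaultdict

# table[row_key][column_family][column_name] = value
table = defaultdict(lambda: defaultdict(dict))


def put(row_key, family, column, value):
    """Write one column; rows and families are created on first use."""
    table[row_key][family][column] = value


def get_family(row_key, family):
    """Return all columns of one family for a row (empty dict if none)."""
    return dict(table.get(row_key, {}).get(family, {}))


put("roi:565:shape:3", "Ellipse", "cx", 3)
put("roi:565:shape:3", "Ellipse", "rx", 17)
put("roi:565:shape:3", "Tags", "foo1", "bob")
put("roi:565:shape:9", "Rect", "width", 40)    # a different row with different columns

ellipse_columns = get_family("roi:565:shape:3", "Ellipse")
```

Two rows need not share any columns, which is how heterogeneous shapes can live in one table.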


Cassandra – Pros and Cons
Pros
  ▪ Quick
  ▪ Handles heterogeneous data well
    ▪ Different rows can have different columns
  ▪ Can manage distributed data
    ▪ Map/Reduce
  ▪ Focus on writes not reads
  ▪ Scales nicely
  ▪ Easy to Install
Cons
  ▪ Not simple to work with
    ▪ Building hierarchical structures
    ▪ Sorting
    ▪ Querying
      ▪ Ad hoc queries are bad; Digg still uses MySQL for certain queries.
    ▪ Have to manage secondary indexes (K/V)
  ▪ Version 0.5
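
Managing secondary indexes by hand means that every write to the primary key/value store must also update a second map from the indexed value back to the keys. The sketch below shows that bookkeeping with plain Python dicts. The names `store`, `index_by_tag`, `save` and `find_by_tag` and the key format are assumptions for illustration only.

```python
store = {}          # primary key -> record
index_by_tag = {}   # tag -> set of primary keys, maintained by hand


def save(key, record):
    """Write a record and keep the tag index in step with it."""
    old = store.get(key)
    if old is not None:
        # Remove stale index entries left by the previous version of the record
        for tag in old["tags"]:
            index_by_tag[tag].discard(key)
    store[key] = record
    for tag in record["tags"]:
        index_by_tag.setdefault(tag, set()).add(key)


def find_by_tag(tag):
    """Look up keys through the index instead of scanning every record."""
    return sorted(index_by_tag.get(tag, set()))


save("shape:3", {"type": "Ellipse", "tags": ["foo1"]})
save("shape:5", {"type": "Ellipse", "tags": ["foo2", "mitosis"]})
save("shape:3", {"type": "Ellipse", "tags": ["mitosis"]})   # retag shape 3

mitotic = find_by_tag("mitosis")
```

Forgetting the cleanup step on overwrite would leave `shape:3` listed under `foo1`, which is the kind of integrity burden this bullet refers to.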


Cassandra - Deployments

Facebook (MAYBE!!): http://www.facebook.com
Digg: http://www.digg.com


Cassandra – ROI Use cases

Human Interaction

Merge, Propagate, Split ✓

Geometry ✓

Intensity ✓

Path

ROI/ROI Links

Tags ✓

HCS

Many ROI ✓

Tags on ROI ✓

Tags on Measurement ✓

Tables of Measurements ✓

Externally Generated

Tags ✓

ROI/ROI Links, ROI/Image Links ✓

Many formats, unknown types

Other

N-Dimensional ROI ✓

Hierarchical Structures


HyperTable

An implementation of Google’s BigTable design: a complex implementation of a key/value store used to represent a table.

A sophisticated toolset is required to get the most out of this kind of solution; for instance, Google has created Sawzall to query its system. HyperTable has a query language called HQL.

It works by creating a table whose columns are grouped into column families; like data lives in the same column family (for example, Ellipse ROI).


Hypertable – Pros and Cons
Pros
  ▪ Quick
  ▪ Handles heterogeneous data well
    ▪ Different rows can have different columns
  ▪ Can manage distributed data
    ▪ Map/Reduce
  ▪ Scales nicely
  ▪ Easy to Install
Cons
  ▪ GPL License
  ▪ Building hierarchical structures
  ▪ Docs are weak
  ▪ HQL works for simple queries only
    ▪ Map/Reduce for other work
  ▪ Limit of 255 column families
  ▪ Secondary keys


HyperTable- Deployments

Rediff: http://www.rediff.com
Zvents: http://www.zvents.com/


HyperTable – ROI Use cases

Human Interaction

Merge, Propagate, Split ✓

Geometry ✓

Intensity ✓

Path

ROI/ROI Links

Tags ✓

HCS

Many ROI ✓

Tags on ROI ✓

Tags on Measurement ✓

Tables of Measurements ✓

Externally Generated

Tags ✓

ROI/ROI Links, ROI/Image Links ✓

Many formats, unknown types

Other

N-Dimensional ROI ✓

Hierarchical Structures


Are we Normal?

Why do we have an RDBMS?
We don’t normalise the data
Each import will normalise on:
  ▪ Image, ObjectiveSettings, LogicalChannel, LightSettings, Detector Settings.
Object Penalty
Difference between normalisation and view