Real time video analytics with InfoSphere Streams, OpenCV and R
TRANSCRIPT
© 2013 IBM Corporation
Take action on sensor data in real-time based on analytics in R
Easy streaming analytics with InfoSphere Streams
Stephan Reimann – IT Specialist Big Data – [email protected]
Wilfried Hoge – IT Architect Big Data – [email protected]
Real time video analytics with InfoSphere Streams, OpenCV and R data2day conference 2014, Karlsruhe Stephan Reimann – IT Specialist Big Data – [email protected] @stereimann
Wilfried Hoge – IT Architect Big Data – [email protected] @wilfriedhoge
© 2014 IBM Corporation
Motivation: Use machine data to make machines smarter
Modern machines produce an incredible amount of data
Use machine generated data to
– make machines more efficient
– reduce downtimes with better maintenance management
– prevent failures
-> make machines smarter
Also use unstructured data such as video
Use that data in real time
The demo scenario: Imagine a tunnel drill equipment where the conveyor belt is continuously supervised by a video camera
What if you could detect a problem in real time and take an appropriate action, such as stopping the machine to prevent damage?
Our demo focuses on analyzing the data from a single camera to make it easy to understand; in a real-life scenario there are usually many structured and unstructured data sources that are most likely combined (e.g. analyzing the image data together with speed info).
And since we did not have one, we created one.
Traditional approach
– Historical fact finding
– Analyze persisted data
– (Micro-) Batch philosophy
– PULL approach
Streaming analytics
– Analyze the current moment / the now
– Analyze data directly “in Motion” – without storing it
– Analyze data at the speed it is created
– PUSH approach
Streaming analytics is a paradigm shift from pull to push: analytics run in real time, directly "on the wire", and data does not need to be persisted
[Diagram: traditional approach with Data, Repository, Analysis and Insight; streaming analytics with Data, Analysis and Insight, without a repository]
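The difference between pull and push can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `pull_average` and `PushAverage` and the sensor readings are made up for this example and are not part of InfoSphere Streams.

```python
# Illustrative sketch only: pull (batch over stored data) vs. push (per-event) analytics.

def pull_average(repository):
    """Pull approach: data is persisted first, then queried as a batch."""
    return sum(repository) / len(repository)


class PushAverage:
    """Push approach: each event updates the result as it arrives."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def on_event(self, value):
        # Incremental mean update: only the running state is kept, not the events.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


readings = [20.0, 22.0, 21.0, 25.0]  # hypothetical sensor values

# Pull: the answer is available only after all readings have been stored.
batch_result = pull_average(readings)

# Push: an up-to-date answer is available after every single reading.
stream = PushAverage()
running_results = [stream.on_event(r) for r in readings]
```

Both variants end at the same average; the push variant simply has an answer after every event and never needs the full history.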
LIVE - DEMO
We have used standard algorithms from OpenCV to extract the interesting part of the pictures by learning and removing the background
We are only interested in the objects that are on the conveyor belt, not in the conveyor belt
But we don't know which objects will pass by; there may be many different ones
One approach is to describe the background and filter it out, in other words: outlier analysis
We have used a standard algorithm (CodeBook) from OpenCV (open source image analytics library)
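The idea of "describe the background and filter it out" can be illustrated with a much simpler model than CodeBook. The sketch below is not OpenCV's CodeBook implementation: it assumes grayscale frames stored as nested lists and learns only one min/max range per pixel. The function names, the `tolerance` parameter and the sample frames are hypothetical.

```python
# Simplified sketch of "learn the background, then flag outliers".
# NOT OpenCV's CodeBook algorithm: one min/max range per pixel only.

def learn_background(frames, tolerance=10):
    """Learn, per pixel, the range of values seen in background-only frames."""
    height, width = len(frames[0]), len(frames[0][0])
    low = [[min(f[y][x] for f in frames) - tolerance for x in range(width)]
           for y in range(height)]
    high = [[max(f[y][x] for f in frames) + tolerance for x in range(width)]
            for y in range(height)]
    return low, high


def foreground_mask(frame, model):
    """Return a mask: 0 = background (inside learned range), 255 = foreground."""
    low, high = model
    return [[0 if low[y][x] <= value <= high[y][x] else 255
             for x, value in enumerate(row)]
            for y, row in enumerate(frame)]


# Hypothetical 2x3 grayscale frames of an empty conveyor belt.
empty_belt = [
    [[50, 52, 51], [49, 50, 53]],
    [[51, 50, 52], [50, 51, 52]],
]
model = learn_background(empty_belt)

# A new frame with a bright object in the top-right corner.
frame = [[50, 51, 200], [50, 52, 51]]
mask = foreground_mask(frame, model)
```

Every pixel whose value falls outside the learned range is treated as an outlier, i.e. as part of an object on the belt.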
The background removal included several steps such as data preparation & cleansing and background detection & removal
Filter: Select the area of interest
Cleanse: Reduce the noise level
Analyze & Transform: Learn the background and create a mask: black=background, white=foreground
Cleanse: Reduce the noise level of the background detection
Transform: Combine the background detection with the original image; it's basically a logical AND
Just for visualization: Create the blue separator image
Publish: The Export operators provide the data to other streaming analytics applications (here: the visualization & the color analytics) via publish & subscribe
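Two of these steps, selecting the area of interest and combining the mask with the original image, can be sketched as follows. This is a minimal illustration assuming 8-bit grayscale images stored as nested lists; the function names and sample values are hypothetical and this is not the code of the demo application.

```python
# Minimal sketch of the "Filter" and "Transform" steps on 8-bit grayscale images.

def select_area(image, top, left, height, width):
    """Filter step: keep only the rectangular area of interest."""
    return [row[left:left + width] for row in image[top:top + height]]


def apply_mask(image, mask):
    """Transform step: bitwise AND of each pixel with its mask value (0 or 255)."""
    return [[pixel & m for pixel, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


# Hypothetical 3x4 camera image.
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
]
area = select_area(image, top=1, left=1, height=2, width=2)

# Mask as produced by a background detection: 255 = foreground, 0 = background.
mask = [[255, 0],
        [0, 255]]
foreground_only = apply_mask(area, mask)
```

Because the mask holds only 0 and 255, the AND keeps foreground pixels unchanged and sets background pixels to black.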
[Diagram: Multimedia Data → Features (e.g. color, texture, shape, edges, motion, spectrum, frequencies, energy, zero-crossings) → Models trained on labeled and unlabeled data (e.g. AdaBoost, k-means, regression, Bayes net, nearest neighbor, neural net, deep belief nets, GMM, SVMs, Markov model, decision tree, expectation maximization, factor graph, ensemble classifiers, active learning) → Semantics (e.g. scenes, locations, settings, objects, people, faces, vehicles, animals, activities, actions, behaviors, events)]
One approach to image analytics is extracting features and using a variety of statistical/mathematical concepts to deduce the semantics
[Diagram: visual features (e.g. color histogram, color moments, color correlogram, dominant colors, color wavelet, wavelet texture, Tamura texture, local binary patterns, edge histogram, shape moments, Fourier shape, Hough circle, interest points, siftogram, curvelets, image statistics, thumbnail image) arranged by complexity and by spatial granularity (global, grid, layout, pyramid, center, cross, horizontal and vertical parts)]
Typical image features used for analytics include color, shapes, texture and many more; we have focused on color for the demo
We have calculated several color features and the object's area; now we can use them for calculations / analytics
Area (in pixels)
Absolute color values
Color histogram
The cool thing: Now you have attributes! It's structured! You can use the data directly or combine it with other data sources, e.g. calculate conveyor belt throughput based on area and speed information.
Analytics: Calculates the color attributes and the area
Import: Receives the data from the background separation app via subscribe
Visualization: Write the text and draw the color histogram
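How an image turns into structured attributes can be sketched as below. This is a minimal illustration, assuming RGB pixels stored as tuples and a foreground mask with non-zero values for object pixels; the function name `color_features`, the number of histogram bins and the sample image are assumptions for this example, not the demo's actual implementation.

```python
# Minimal sketch: turn the foreground pixels of an image into structured attributes.

def color_features(image, mask, bins=4):
    """Compute area, mean color and a per-channel histogram of the foreground."""
    foreground = [pixel
                  for img_row, mask_row in zip(image, mask)
                  for pixel, m in zip(img_row, mask_row) if m]
    area = len(foreground)  # area in pixels
    histogram = [[0] * bins for _ in range(3)]  # one histogram per R, G, B channel
    if area == 0:
        return {"area": 0, "mean_rgb": (0.0, 0.0, 0.0), "histogram": histogram}
    mean_rgb = tuple(sum(p[c] for p in foreground) / area for c in range(3))
    bin_width = 256 // bins
    for pixel in foreground:
        for channel in range(3):
            index = min(pixel[channel] // bin_width, bins - 1)
            histogram[channel][index] += 1
    return {"area": area, "mean_rgb": mean_rgb, "histogram": histogram}


# Hypothetical 2x2 image with a reddish object in the left column.
image = [[(200, 10, 10), (0, 0, 0)],
         [(220, 30, 20), (0, 0, 0)]]
mask = [[255, 0],
        [255, 0]]
features = color_features(image, mask)
```

The result is an ordinary record of numbers, which is what makes it easy to combine with other structured sources such as speed information.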
We have "marked" the structured data from the color analytics application and used it to train a model to detect object classes
Describing explicitly what is characteristic for an object class is difficult or impossible.
We have used the numbers to let the algorithm behind the model learn it.
The algorithm just needs the marked data (= training data set).
Marked data means we provided the information which object class was visible at which time.
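Marking data by time can be sketched as a join between the feature records and a list of time intervals. The record layout, the interval format and the class names below are hypothetical; the source does not describe how the marking was actually stored.

```python
# Minimal sketch: attach an object class to each feature record based on its timestamp.

def label_rows(feature_rows, labeled_intervals):
    """Return the rows that fall into a labeled interval, with the class attached."""
    training_set = []
    for row in feature_rows:
        for start, end, object_class in labeled_intervals:
            if start <= row["time"] < end:
                training_set.append({**row, "class": object_class})
                break
        # Rows that match no interval are simply left out of the training set.
    return training_set


# Hypothetical output of the color analytics: one record per image.
feature_rows = [
    {"time": 1.0, "area": 120, "mean_red": 210.0},
    {"time": 6.5, "area": 300, "mean_red": 90.0},
    {"time": 20.0, "area": 0, "mean_red": 0.0},
]

# Hypothetical marking: which object class was visible during which time span.
labeled_intervals = [
    (0.0, 5.0, "brick"),
    (5.0, 10.0, "stone"),
]

training_set = label_rows(feature_rows, labeled_intervals)
```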
The model is created when the application is started based on the training data, and predicts the object class for each image in real time
We have used R (an open source package for statistics and advanced analytics) to create the predictive model.
The model is created when the streaming analytics application is started.
Once the application is running, the individual score and the prediction are calculated for each individual image (in other words: the predictive model is applied); this is called scoring.
In our demo the model is only trained once at startup and remains constant afterwards, but it is also possible to refresh models continuously or at certain intervals.
Import: Receives the data from our analytics app via subscribe
Visualization: Visualizes the results
Visualization: Write the prediction as text on the image
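The "train once at startup, then score every image" pattern can be sketched as follows. The demo used R for the model; this sketch is in Python and swaps in a simple nearest-centroid classifier purely for illustration, so it should not be read as the model used in the demo. The feature vectors and class names are made up.

```python
# Sketch of "train once, score per image" with a nearest-centroid classifier.
# Assumption: features are small numeric tuples, e.g. (mean red, mean green).

def train(training_set):
    """Train once at startup: compute the average feature vector per class."""
    sums, counts = {}, {}
    for features, object_class in training_set:
        acc = sums.setdefault(object_class, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[object_class] = counts.get(object_class, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def score(model, features):
    """Scoring: return (predicted class, distance to its centroid) for one image."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features)) ** 0.5
    best_class = min(model, key=lambda c: distance(model[c]))
    return best_class, distance(model[best_class])


# Hypothetical marked training data: (features, object class).
training_set = [
    ((210.0, 20.0), "red"),
    ((200.0, 30.0), "red"),
    ((20.0, 200.0), "green"),
    ((30.0, 220.0), "green"),
]
model = train(training_set)  # happens once, when the application starts

# Each incoming image is scored individually against the fixed model.
incoming = [(190.0, 40.0), (40.0, 180.0)]
predictions = [score(model, features)[0] for features in incoming]
```

Refreshing the model continuously or at intervals would amount to calling `train` again with newer data and swapping the result in.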
[Diagram: the overview of features, models and semantics shown earlier, with the feature "Color" and the model "Decision Tree" marked]
The demo has shown image analytics on one feature and one model; in reality a combination of several features & models is used
A freely available webcast from IBM Research provides further insights into image and video analytics and the theory behind it
IBM Analytics Education Series: Lecture 7 - Multimedia - Image and Video Analytics
R
– Open Source software for statistics and advanced analytics
– http://cran.r-project.org/
We have used InfoSphere Streams for the real time analytics and have extended it with R and OpenCV for the implementation
OpenCV
– Open Source computer vision and machine learning software library
– http://opencv.org/ & InfoSphere Streams OpenCV Toolkit on GitHub
InfoSphere Streams
– Software for real time analytics on any kind of Big Data
– Free Quickstart Edition: ibm.co/streamsqs
– Developer Community: www.ibmdw.net/streamsdev/ (+ Tutorials, Labs, Forum, ...)
– GitHub Community: github.com/IBMStreams (+ Toolkits, Toolkits, Toolkits)
InfoSphere Streams is the result of an IBM research project, designed for high-throughput, low latency and to make streaming analytics easy
InfoSphere Streams capabilities:
Scale out
– Millions of events per second
+ Complex Data & Analytics
– All kinds of data
– Complex analytics: everything you can express via an algorithm
+ Low Latency
– Analyzes data at the speed it is created
– Latencies down to µs
– Immediate action in real time
How it works:
– Define apps as flow graphs consisting of sources (inputs), operators & sinks (outputs)
– Extend the functionality with your code if required for full flexibility
– The clustered, distributed runtime on commodity HW scales almost without limit
– GUIs for rapid development and operations make streaming analytics easy
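The flow-graph idea (source, operators, sink) can be illustrated with plain Python generators. This is not InfoSphere Streams code and does not use its programming language or runtime; all function names, the speed values and the threshold are hypothetical.

```python
# Illustration of a flow graph: source -> filter operator -> map operator -> sink.

def source(values):
    """Source: emits one tuple per incoming value."""
    for value in values:
        yield {"speed": value}


def filter_op(stream, predicate):
    """Operator: passes on only the tuples that satisfy the predicate."""
    for tup in stream:
        if predicate(tup):
            yield tup


def map_op(stream, function):
    """Operator: transforms each tuple."""
    for tup in stream:
        yield function(tup)


def sink(stream, output):
    """Sink: consumes the stream and writes each tuple to the output."""
    for tup in stream:
        output.append(tup)


alerts = []
graph = map_op(
    filter_op(source([1.0, 3.5, 0.5, 4.2]), lambda t: t["speed"] > 3.0),
    lambda t: {**t, "alert": "belt too fast"},
)
sink(graph, alerts)  # tuples flow through the graph one at a time
```

Each tuple passes through all stages before the next one is read, which mirrors the push style described earlier; a real streaming runtime would additionally distribute the operators across a cluster.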
Industries: Telecommunication, Transport, Manufacturing, Security, Radio astronomy, Healthcare, Industrie 4.0, Energy & Utilities, Connected Car
Maturity: Present / In production, Trends, Prototypes
... optimizes the traffic in Stockholm and Dublin
... analyzes acoustic signals to protect sensitive areas
... optimizes the quality of mobile networks
... is the foundation for real-time campaigns to increase customer satisfaction and revenues
... analyzes and selects images in real time within the world's largest radio telescope
... and is a core component within many innovation initiatives
InfoSphere Streams is already used in a broad range of real time analytics applications across industries
Where technology meets business potential: Start making sense of your data (in real time). It is possible!
Gain value from your data
There are many opportunities to gain value from data. Let's talk about how to make sense of your data! http://www-05.ibm.com/de/events/workshop/bigdata/
Make maintenance more predictable to reduce downtimes
Detect error patterns to prevent failures
Better understand complex systems and their dependencies to improve efficiency