informix mqtt streaming

© 2015 IBM Corporation IBM Analytics Spark Analytics with Informix Pradeep Natarajan, IBM @pradeepnatara

Upload: pradeep-natarajan

Post on 13-Apr-2017

Category:

Data & Analytics


TRANSCRIPT

Page 1: Informix MQTT Streaming

© 2015 IBM Corporation

IBM Analytics

Spark Analytics with Informix

Pradeep Natarajan, IBM, @pradeepnatara

Page 2: Informix MQTT Streaming

Agenda
• Context: Informix / Spark high-level value propositions
• IoT use cases
• Challenges
• Prototype and implementation
• What’s next?

Page 3: Informix MQTT Streaming

Informix to Spark

Context

Page 4: Informix MQTT Streaming

Informix for Internet of Things
• Optimized database for environments such as:
  – Low or no database administration
  – Embedded: gateways, routers
• Very high transaction rates and uptime characteristics
• Widely deployed in the retail sector, where the low administration overhead makes it essential for in-store deployments
• Informix supports key Internet-of-Things solutions
  – Native support for time-based data: TimeSeries
  – Small footprint
  – Low administration requirements

Page 5: Informix MQTT Streaming

Apache Spark
• Speed
• Ease of use, unified engine
• Sophisticated analytics

Page 6: Informix MQTT Streaming

Apache Spark
• Cluster computing framework
• Fast and general engine for large-scale data processing
• In-memory computing

Page 7: Informix MQTT Streaming

Apache Spark Streaming
• Extends Spark for big data stream processing
• Scaling, low latency, recovery
• Integrates batch and interactive processing

[Figure: raw data stream → distributed stream processing system → processed data]

Page 8: Informix MQTT Streaming

Informix to Spark

Use cases

Page 9: Informix MQTT Streaming

Real-Time Operational Database Streaming Analytics with Spark

Applications that drive business have positioned relational databases at the center of operations.

To continue their success, businesses need to use streaming analytics to gain real-time insights into their operations and take actions to optimize outcomes.

Infrequent batch analytics on “stale” data means losing the competitive edge. Demand for real-time analytics is increasing among businesses that want to stay in the lead.

Page 10: Informix MQTT Streaming

Sense → Analyze → Act
• As data ages, business value diminishes.
• Sense → Analyze → Act in seconds or milliseconds, not days or weeks.

[Figure: the Sense → Analyze → Act cycle compared for batch processing (days) and real-time processing (seconds)]

Page 11: Informix MQTT Streaming

Increasing demand for real-time analytics

Continuously streaming data from IBM Informix to an analytics platform

Streaming analytics service sample scenarios:
• Connected Vehicles (driving behavior matching): detect anomalous driving behavior that causes higher fuel consumption
• Energy & Utilities (power consumption): how is power consumption correlated between houses A, B, C, and D?
• Health Care (heart attack prevention): detect abnormal patterns in ECG series
• Finance (market manipulation detection): detect anomalies by price change rate in a time window (steady price change versus vibration in a short period)
• Cloud Service Operation (server health diagnosis): detect system resource peaks and valleys and correlate them with workload information

Page 12: Informix MQTT Streaming

Real-time analytics - Industry
• Information technology – systems & network monitoring
• IoT – sensor data analytics and processing
• Financial transactions – authentication, fraud detection, validation
• Inventory control – consumer trends and demands
• Website analytics – ad targeting
• Many others

Page 13: Informix MQTT Streaming

Real-time analytics - applications
• Data analyzed as it arrives – data in motion
• Simple: monitoring, alerts/reports, statistics
• Complex: predictive analytics (regressions, machine learning, etc.), k-means clustering (classification, anomaly detection)
• Many store events as well, to combine with later batch processing
• Immediate actions possible
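The "simple" end of this spectrum (monitoring, statistics, alerts on data in motion) can be sketched in a few lines. The following is a minimal illustration, not part of the prototype: the class name, window size, threshold, and sample readings are all assumptions chosen for the example.

```python
from collections import deque
from statistics import mean, pstdev


class SlidingWindowMonitor:
    """Keep the last `size` readings and flag values far from the window mean."""

    def __init__(self, size=5, threshold=3.0):
        self.window = deque(maxlen=size)
        # Hypothetical alert threshold, expressed in standard deviations.
        self.threshold = threshold

    def observe(self, value):
        """Return True if `value` is an outlier relative to the current window."""
        alert = False
        if len(self.window) == self.window.maxlen:
            mu = mean(self.window)
            sigma = pstdev(self.window)
            # Skip the check when all recent readings are identical (sigma == 0).
            alert = sigma > 0 and abs(value - mu) > self.threshold * sigma
        self.window.append(value)
        return alert


# Made-up sensor readings with one obvious spike.
monitor = SlidingWindowMonitor(size=5, threshold=3.0)
readings = [20.1, 20.3, 19.9, 20.0, 20.2, 35.0, 20.1]
alerts = [monitor.observe(r) for r in readings]
```

Each reading is judged as it arrives, against only the recent past, which is what distinguishes this from a batch job over stored data.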

Page 14: Informix MQTT Streaming

Informix to Spark

Challenges

Page 15: Informix MQTT Streaming

Exploring data and discovering actionable business insights
• The problem: users often do not know exactly which analytics they want to run
• It is difficult to justify the cost/risk of a complex solution without specific business value
• Need to reduce the cost/risk of adding a real-time data analytics pipeline to the application architecture
• Let data scientists explore data to find useful analytics without interfering with existing business

Page 16: Informix MQTT Streaming

We're running an Informix database. How do we incorporate real-time analytics into our application architecture?

[Figure: application server and database]

Page 17: Informix MQTT Streaming

Outdated approach: requires additional complexity, with increased risk and cost.

[Figure: application server with additional components]

Page 18: Informix MQTT Streaming

Informix to Spark

Prototype Implementation

Page 19: Informix MQTT Streaming

Real-Time Operational Database Streaming Analytics with Spark
• Newly prototyped feature for the Informix database.
• Enables Informix customers to stream data added to their database in real time via MQTT, which can then be consumed by an analytics platform such as Apache Spark.

Page 20: Informix MQTT Streaming

Informix MQTT Streamer
• Enables a real-time analytics pipeline that drastically reduces complexity, cost, and risk

[Figure: Informix with the MQTT Streamer]

Page 21: Informix MQTT Streaming

How is it implemented?
• Uses the Informix Virtual-Index Interface (VII).
• VII allows us to write UDRs that are triggered whenever certain SQL statements are executed.
• This is typically used to create indexes for custom data types. Instead, we use it to write data to a socket during INSERT/UPDATE statements.

[Figure: VII UDR publishes to the MQTT broker]
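The trigger mechanism on this slide can be pictured with a small simulation. This is plain Python, not the Informix VII API: `TableSim`, `StreamingIndexSim`, the `table/operation` topic layout, and the JSON payload are all hypothetical and only illustrate the idea that index maintenance sees every INSERT/UPDATE and can forward the indexed columns.

```python
import json


class StreamingIndexSim:
    """Stand-in for an index whose maintenance callback publishes row changes."""

    def __init__(self, columns, publish):
        self.columns = columns
        # publish(topic, payload) is supplied by the caller; a real system
        # would hand the message to an MQTT broker here.
        self.publish = publish

    def on_change(self, table, operation, row):
        # Only the indexed columns are streamed (hypothetical JSON payload).
        payload = json.dumps({c: row[c] for c in self.columns})
        self.publish(f"{table}/{operation}", payload)


class TableSim:
    """Toy table that notifies its indexes on every write."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.indexes = []

    def create_index(self, index):
        self.indexes.append(index)

    def insert(self, key, row):
        self.rows[key] = dict(row)
        self._notify("insert", self.rows[key])

    def update(self, key, **changes):
        self.rows[key].update(changes)
        self._notify("update", self.rows[key])

    def _notify(self, operation, row):
        for index in self.indexes:
            index.on_change(self.name, operation, row)


published = []
readings = TableSim("readings")
readings.create_index(
    StreamingIndexSim(["sensor", "value"], lambda t, p: published.append((t, p)))
)
readings.insert(1, {"sensor": "s1", "value": 72, "note": "not streamed"})
readings.update(1, value=75)
```

The application code only inserts and updates rows; the publishing happens as a side effect of index maintenance, which is why the approach needs no changes to the application.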

Page 22: Informix MQTT Streaming

Installation and basic usage
• Open sourced! Available on GitHub: https://github.com/IBM-IoT/InformixSparkStreaming
• Run the install script
• Add the streaming index to the column whose values you want to stream

create index stream on table(col1, col2) USING streaming_index;

Page 23: Informix MQTT Streaming

The Nitty gritty

• Installed into Informix is a set of custom UDRs that convert data into MQTT messages and send them to a specified address
• Virtual Table Indexes detect data inserts/updates/deletes as they happen and trigger the messages to be sent
• Once in an MQTT broker, almost anything can consume it
  – MQTT clients are available for most programming languages (including Java for Apache Spark)
• Spark can analyze the data, compare it to historical data, and use streaming k-means algorithms to determine changes in the data
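To make "streaming k-means" concrete, here is a minimal pure-Python sketch of an online update: each mini-batch pulls the cluster centers toward the new points, with an optional decay that discounts older data. It is written in the spirit of streaming k-means but is not Spark's implementation; the function names, the `decay` parameter, and the sample heart-rate values are assumptions for illustration.

```python
def nearest(centers, point):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((c - p) ** 2 for c, p in zip(centers[i], point)),
    )


def update_centers(centers, weights, batch, decay=1.0):
    """Fold one mini-batch into the running centers.

    Each center becomes the weighted mean of its (decayed) history and the
    new points assigned to it. decay=1.0 weighs all history equally;
    decay=0.0 keeps only the latest batch.
    """
    sums = [[0.0] * len(c) for c in centers]
    counts = [0] * len(centers)
    for point in batch:
        i = nearest(centers, point)
        counts[i] += 1
        sums[i] = [s + p for s, p in zip(sums[i], point)]

    new_centers, new_weights = [], []
    for center, weight, total, count in zip(centers, weights, sums, counts):
        old = weight * decay
        if count == 0:
            # No new points for this cluster: keep the center, age its weight.
            new_centers.append(list(center))
            new_weights.append(old)
        else:
            denom = old + count
            new_centers.append(
                [(c * old + t) / denom for c, t in zip(center, total)]
            )
            new_weights.append(denom)
    return new_centers, new_weights


# Two clusters of made-up heart rates: resting (~60) and elevated (~100).
centers, weights = [[60.0], [100.0]], [0.0, 0.0]
centers, weights = update_centers(centers, weights, [[58], [62], [98], [102]])
centers, weights = update_centers(centers, weights, [[70], [110]])
```

A drift in the centers from one batch to the next is the kind of "change in the data" the slide refers to.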

Page 24: Informix MQTT Streaming

The Nitty gritty continued
• Once installed, the custom “streaming_index” index type will be available for use.
• Running the “create index” command and specifying the “streaming_index” index type will run the code in the custom UDRs that push the data via MQTT.
• Then, whenever you run an INSERT statement on the column that you created the streaming index on, the data that you inserted will automatically be published to an MQTT broker.
• See the “IBM Informix Virtual-Index Interface Programmer's Guide” for more details.

Page 25: Informix MQTT Streaming

In-depth
• Does the prototype work for temp tables?
  – No specific index-related restrictions apply to temp tables
• Do we lock the tables?
  – The VII will lengthen the amount of time a lock is held
  – Future item: multiple concurrent writers to a per-table queue, flushed asynchronously by a separate thread
• Would this work for multiple nodes (sharding)?
  – The current prototype delegates this to Spark, where multiple input streams could be merged into one
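The "future item" above (concurrent writers feeding a queue that a separate thread flushes) is a standard producer/consumer pattern. The sketch below shows the shape of it in Python; it is not the prototype's code, and `AsyncPublisher` and its `send` callback are hypothetical names. The point is that the writer's path only pays for an enqueue, so the time a lock is held no longer includes the network send.

```python
import queue
import threading


class AsyncPublisher:
    """Writers enqueue messages; one background thread drains and sends them."""

    _STOP = object()  # sentinel that tells the worker to exit

    def __init__(self, send):
        # send(message) would publish to a broker in a real system.
        self._send = send
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def enqueue(self, message):
        """Called on the writer's path; returns without waiting for the send."""
        self._queue.put(message)

    def _drain(self):
        while True:
            message = self._queue.get()
            if message is self._STOP:
                break
            self._send(message)

    def close(self):
        """Flush everything queued so far, then stop the worker."""
        self._queue.put(self._STOP)
        self._worker.join()


def write_rows(publisher, writer_id, count):
    """Simulated writer: enqueue `count` messages tagged with its id."""
    for i in range(count):
        publisher.enqueue((writer_id, i))


sent = []
publisher = AsyncPublisher(sent.append)
writers = [
    threading.Thread(target=write_rows, args=(publisher, n, 3)) for n in range(4)
]
for w in writers:
    w.start()
for w in writers:
    w.join()
publisher.close()
```

Messages from different writers may interleave, but each writer's own messages keep their order because the queue is FIFO.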

Page 26: Informix MQTT Streaming

In-depth
• Installs in seconds
• No need to upgrade database
• No need to restart database server
• Can be installed and activated on a live production database!
• Minimal interference with existing business application

Page 27: Informix MQTT Streaming

Informix to Spark

Demo

Page 28: Informix MQTT Streaming
Page 29: Informix MQTT Streaming

Heart To Spark

• Demonstration of real-time streaming of data from the Informix engine into a message broker for digestion by one or more services
• Simulates IoT data from a heart rate monitor
• Watches for trends in heart rates
  – Poor health/stress can cause a measurable rise in baseline heart rate
• Uses Spark analytics to determine baseline heart rates and plot the trend (heart rate rising, steady, or falling)
• Graphing tools in the browser show us a view of the data
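One simple way to label a baseline as rising, steady, or falling is to compare the mean of the most recent window of readings with the mean of the window before it. The sketch below does that in plain Python; it is an illustration of the idea, not the demo's Spark job, and the function name, window size, and tolerance (in beats per minute) are assumptions.

```python
from statistics import mean


def baseline_trend(heart_rates, window=5, tolerance=2.0):
    """Compare the latest window's mean with the previous window's mean."""
    if len(heart_rates) < 2 * window:
        return "unknown"  # not enough data for two full windows
    previous = mean(heart_rates[-2 * window:-window])
    current = mean(heart_rates[-window:])
    delta = current - previous
    if delta > tolerance:
        return "rising"
    if delta < -tolerance:
        return "falling"
    return "steady"


# Made-up readings: a resting baseline that then climbs by 10 bpm.
trend = baseline_trend([60, 61, 59, 60, 60, 70, 71, 69, 70, 70])
```

The tolerance keeps ordinary beat-to-beat noise from being reported as a trend.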

Page 30: Informix MQTT Streaming

Demo - Installation

Page 31: Informix MQTT Streaming

Overview

[Figure: Heart Monitor → Informix → Message Broker → Apache Spark Analytics → Display Results]

• IoT devices send data into the Informix server
• Data streams from Informix into an MQTT broker
• From MQTT, data is streamed into Spark for real-time analysis
• Results from both Informix and Spark are available to the end user

Page 32: Informix MQTT Streaming

Not limited to Apache Spark

Can be used by any application/platform that can consume TCP socket data.
• IBM InfoSphere Streams
• Apache Storm
• Custom applications (most programming languages have MQTT libraries)
• Many, many others
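For a sense of what "consuming TCP socket data" means at the byte level, the sketch below encodes and decodes an MQTT 3.1.1 PUBLISH packet using only the standard library. In practice an MQTT client library would do this, along with the CONNECT and SUBSCRIBE handshake that is omitted here. The topic `heart/rate` and the payload are hypothetical; the prototype's actual topic names and payload format are not given in these slides.

```python
def encode_remaining_length(n):
    """MQTT variable-length integer: 7 bits per byte, high bit = more bytes."""
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80
        out.append(digit)
        if n == 0:
            return bytes(out)


def encode_publish(topic, payload):
    """Build a QoS 0 PUBLISH packet (no packet identifier)."""
    topic_bytes = topic.encode("utf-8")
    body = len(topic_bytes).to_bytes(2, "big") + topic_bytes + payload
    return bytes([0x30]) + encode_remaining_length(len(body)) + body


def decode_publish(packet):
    """Return (topic, payload) from one complete PUBLISH packet."""
    if packet[0] >> 4 != 3:
        raise ValueError("not a PUBLISH packet")
    qos = (packet[0] >> 1) & 0x03

    # Decode the remaining-length field that follows the first header byte.
    length, shift, pos = 0, 0, 1
    while True:
        byte = packet[pos]
        pos += 1
        length |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    end = pos + length

    topic_len = int.from_bytes(packet[pos:pos + 2], "big")
    pos += 2
    topic = packet[pos:pos + topic_len].decode("utf-8")
    pos += topic_len
    if qos > 0:
        pos += 2  # skip the packet identifier present for QoS 1 and 2
    return topic, packet[pos:end]


packet = encode_publish("heart/rate", b"72")
topic, payload = decode_publish(packet)
```

Because the wire format is this small, a consumer can be written in almost any language, which is what makes the broker a convenient hand-off point between Informix and the analytics platform.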

Page 33: Informix MQTT Streaming

Informix to Spark

What’s next?

Page 34: Informix MQTT Streaming

Endless possibilities
• Check out Apache Spark for more information about analytics and machine learning: http://spark.apache.org/
• Learn more about machine learning and its potential: https://www.coursera.org/learn/machine-learning
• Contact IBM Informix

Page 35: Informix MQTT Streaming

Questions?

Pradeep Natarajan, @pradeepnatara