bigdata in iot #iotconfua

[Big]Data in IoT: from Lambda architecture to predictive maintenance

by Tatyana Matvienko

IoT… Haven’t heard?

The Internet of Things is...

● Healthcare

● Energy delivery (water, oil & gas)

● Connected city

● Smart vehicles (cars, elevators...)

● Security sensors and devices

● Monitoring and analytic systems

● Smart...

The problem

Device

● Electronics knowledge

● Master-Slave architecture

● OSI model

● Binary protocol

● Cloud connectivity

Server

● Meta- and time-series data

● Data storage

● Business logic

● Data analysis

● Client applications

Orchestration

● Scalability

● Fault-tolerance

● Administration

● Responsibility

Solution

DeviceHive

Machine to Machine (M2M) open source Communication Framework

What is DeviceHive?

Firmware

● Install Ubuntu Snappy Core on your device

● BLE support

● Firmwares

● Gateways

Device Server

Java and .NET based servers

● REST & WebSockets

● API libraries in different languages (JS, C, C#, Python...)

● Lambda-architecture

Orchestration

Simplify deployment as much as possible

● Documentation

● Playgrounds

● Wrap the containers

[Architecture diagram. Labels: REST/WebSockets; Binary; Gateway; Device management and monitoring; Data analysis and visualisation; BigData storage; Real-time data processing]

What it used to be...

● JavaEE, EJB, Glassfish

● Hibernate with RDBMS for meta and operational data (PostgreSQL)

● REST and WebSockets handled the same way

● Hazelcast for both caching and messaging

● Integration with Docker

● Admin console with the latest 100 notifications in the table (JSON)

Step 1: Java server

One day your IoT project becomes BigData challenge...

Lambda-architecture

○ Batch Layer

○ Serving Layer

○ Stream Layer

Lambda-architecture in DeviceHive

Lambda architecture is a data-processing architecture designed to handle massive quantities of data by taking advantage of both batch- and stream-processing methods.
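The idea of combining batch and stream processing can be illustrated with a toy, in-memory sketch. It is not DeviceHive code: the class `LambdaPipeline`, its method names, and the choice of a per-device notification count as the computed view are all assumptions made for illustration.

```python
from collections import defaultdict


class LambdaPipeline:
    """Toy Lambda architecture: a batch view and a real-time view merged at query time.

    Hypothetical illustration only; all names here are invented for this sketch.
    """

    def __init__(self):
        self.master_log = []                    # append-only raw data (batch layer input)
        self.batch_view = defaultdict(int)      # recomputed from the whole log
        self.realtime_view = defaultdict(int)   # incremental counts since the last batch run

    def ingest(self, device_id, count=1):
        """New data goes to both the master log and the stream (speed) layer."""
        self.master_log.append((device_id, count))
        self.realtime_view[device_id] += count

    def run_batch(self):
        """Batch layer: rebuild the view from scratch, then reset the real-time view."""
        view = defaultdict(int)
        for device_id, count in self.master_log:
            view[device_id] += count
        self.batch_view = view
        self.realtime_view = defaultdict(int)

    def query(self, device_id):
        """Serving layer: merge the batch view with the real-time view."""
        return self.batch_view.get(device_id, 0) + self.realtime_view.get(device_id, 0)


pipeline = LambdaPipeline()
pipeline.ingest("sensor-1")
pipeline.ingest("sensor-1")
pipeline.run_batch()          # batch view now covers both notifications
pipeline.ingest("sensor-1")   # arrives after the batch run, held in the real-time view
print(pipeline.query("sensor-1"))
```

A query sees data that arrived after the last batch run because the serving step adds the real-time counts on top of the batch view. A real deployment would replace the lists and dictionaries with distributed storage and processing engines.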

What was done...

● Message bus added (Kafka)

● Only metadata in RDBMS

● Operational data to message bus and cache only

● REST endpoints are served from cache only

● Cassandra worker streaming data from Kafka to C*

● Moved to Spring and Spring Boot
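The data flow in the list above can be sketched with in-memory stand-ins. This is an assumption-laden illustration, not the DeviceHive implementation: a `deque` plays the role of the Kafka topic, a dictionary plays the cache, a list plays the Cassandra table, and the names `NotificationFlow`, `publish`, `rest_get_latest`, `run_worker`, and `cache_size` are hypothetical.

```python
from collections import deque


class NotificationFlow:
    """In-memory stand-ins for the bus, cache, and long-term store (hypothetical names)."""

    def __init__(self, cache_size=100):
        self.bus = deque()    # stand-in for a message bus topic
        self.cache = {}       # device_id -> bounded deque of recent notifications
        self.store = []       # stand-in for a long-term time-series table
        self.cache_size = cache_size

    def publish(self, device_id, payload):
        """Operational data goes to the message bus and the cache only."""
        message = {"device_id": device_id, "payload": payload}
        self.bus.append(message)
        recent = self.cache.setdefault(device_id, deque(maxlen=self.cache_size))
        recent.append(message)

    def rest_get_latest(self, device_id):
        """REST-style read, served from the cache only."""
        return list(self.cache.get(device_id, ()))

    def run_worker(self):
        """Worker: drain the bus into the long-term store; returns how many were moved."""
        moved = 0
        while self.bus:
            self.store.append(self.bus.popleft())
            moved += 1
        return moved


flow = NotificationFlow(cache_size=2)
for reading in (20.5, 21.0, 21.4):
    flow.publish("thermo-1", {"temperature": reading})

print(flow.rest_get_latest("thermo-1"))  # only the two most recent notifications
print(flow.run_worker())                 # number of messages persisted by the worker
```

The read path never touches the long-term store, so read latency does not depend on how quickly the worker persists data. The trade-off is that reads only see whatever the bounded cache still holds.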

Step 2: Deployment

So, let’s set up our project...

● Download and build sources

● Set up database

● Configure Glassfish

● Configure Kafka

● Configure Zookeeper

● Configure Cassandra (if needed of course...)

● Run

● Fail

Welcome to Juju

Mesos & Marathon

Pintostack

Containerization

● Every service is a container

● A container is more than just an app, it is an environment

● A container is a final build artifact

● Containers for HDFS nodes, databases, microservices, web applications

● Docker containers are first-class citizens

Key Components

● Infrastructure: supply of resources

● Containers

● Resource abstraction

● Scheduling

● Service discovery

● Logging

Step 3: Analysis and visualisation

Freeboard.io

● A web-based tool

● Fully customizable and interactive user interfaces

● Real time dashboards

ELK (Elasticsearch + Logstash + Kibana)

Open source, scalable solution to search, analyze, and visualize your data, allowing you to get actionable insight in real time

Spark Streaming + Zeppelin

Apache Spark

Q&A?...
