bigdata in iot #iotconfua
[Big]Data in IoT: from Lambda architecture to predictive maintenance
by Tatyana Matvienko
IoT… Haven’t heard?
The Internet of Things is...
● Healthcare
● Energy delivery (water, oil & gas)
● Connected city
● Smart vehicles (cars, elevators...)
● Security sensors and devices
● Monitoring and analytic systems
● Smart...
The problem
Device
● Electronics knowledge
● Master-Slave architecture
● OSI model
● Binary protocol
● Cloud connectivity
Server
● Meta- and time-series data
● Data storage
● Business logic
● Data analysis
● Client applications
Orchestration
● Scalability
● Fault-tolerance
● Administration
● Responsibility
Solution
DeviceHive
Machine to Machine (M2M) open source Communication Framework
What is DeviceHive?
Firmware
● Install Ubuntu Snappy Core on your device
● BLE support
● Firmwares
● Gateways
Device Server
Java and .NET based servers
● REST & Websockets
● API libraries in different languages (JS, C, C#, Python...)
● Lambda-architecture
Orchestration
Simplify deployment as much as possible
● Documentation
● Playgrounds
● Wrap the containers
[Architecture diagram. Labels: BINARY, Gateway, REST/Websockets, Device management and monitoring, Data analysis and visualisation, BigData Storage, Real time data processing]
What it used to be...
● JavaEE, EJB, Glassfish
● Hibernate with RDBMS for meta and operational data (PostgreSQL)
● REST and Websockets handled the same way
● Hazelcast for both - caching and messaging
● Integration with Docker
● Admin console with the latest 100 notifications in the table (JSON)
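The "latest 100 notifications" table in the admin console can be pictured as a bounded buffer. The sketch below is only an illustration, not DeviceHive code: the class name `LatestNotifications` and the shape of the notification dictionaries are assumptions made for the example.

```python
from collections import deque


class LatestNotifications:
    """Hypothetical in-memory buffer that keeps only the newest `limit` items."""

    def __init__(self, limit=100):
        # A deque with maxlen silently drops the oldest item when it is full.
        self._items = deque(maxlen=limit)

    def add(self, notification):
        self._items.append(notification)

    def as_table(self):
        # Newest first, the order an admin table would typically show.
        return list(reversed(self._items))


store = LatestNotifications()
for i in range(250):
    # The JSON-like payload shape is an assumption for illustration.
    store.add({"id": i, "notification": "temperature", "parameters": {"value": 20 + i % 5}})

rows = store.as_table()
```

After 250 notifications, `rows` holds only the 100 most recent ones, which is enough for a console view but not for analysis. That limitation is part of what motivates the later move to a message bus and long-term storage.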
Step 1: Java server
One day your IoT project becomes a BigData challenge...
Lambda-architecture
○ Batch Layer
○ Serving Layer
○ Stream Layer
Lambda-architecture in DeviceHive
Lambda architecture is a data-processing architecture designed to handle massive quantities of data by taking advantage of both batch- and stream-processing methods.
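As a rough illustration of how the three layers fit together, here is a minimal in-memory sketch. It is not the DeviceHive implementation: the event shape, the per-device count used as the "view", and the names `batch_view`, `SpeedLayer` and `serve` are all assumptions chosen for the example.

```python
from collections import defaultdict


def batch_view(master_dataset):
    """Batch layer: recompute per-device event counts from the full dataset."""
    view = defaultdict(int)
    for event in master_dataset:
        view[event["device"]] += 1
    return dict(view)


class SpeedLayer:
    """Stream layer: incrementally count events that arrived after the last batch run."""

    def __init__(self):
        self.view = defaultdict(int)

    def on_event(self, event):
        self.view[event["device"]] += 1


def serve(device, batch, speed):
    """Serving layer: answer a query by merging the batch and real-time views."""
    return batch.get(device, 0) + speed.view.get(device, 0)


# Events already covered by the last batch run (hypothetical data).
master = [{"device": "d1"}, {"device": "d1"}, {"device": "d2"}]
batch = batch_view(master)

# Events that arrived since then and are only known to the stream layer.
speed = SpeedLayer()
speed.on_event({"device": "d1"})
speed.on_event({"device": "d3"})
```

Here `serve("d1", batch, speed)` combines two batch events with one recent event. In a real system the batch view would be rebuilt periodically and the stream view reset for the events the batch run has absorbed; that bookkeeping is left out of the sketch.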
What was done...
● Message bus added (Kafka)
● Only metadata in RDBMS
● Operational data to message bus and cache only
● REST endpoints are served from cache only
● Cassandra worker streaming data from Kafka to C*
● Moved to Spring and Spring Boot
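The data flow described in the list above can be sketched with standard-library stand-ins. Nothing here talks to Kafka or Cassandra: `bus`, `cache` and `store` are in-memory placeholders, and the function names and message shape are assumptions made for illustration.

```python
from collections import defaultdict, deque
from queue import Empty, Queue

bus = Queue()                                   # stand-in for a Kafka topic
cache = defaultdict(lambda: deque(maxlen=100))  # stand-in for the cache, per device
store = []                                      # stand-in for a Cassandra table


def publish(device_id, notification):
    """Write path: operational data goes to the bus and the cache, not to the RDBMS."""
    message = {"device": device_id, "notification": notification}
    bus.put(message)
    cache[device_id].append(message)


def rest_get_notifications(device_id):
    """Read path: REST requests are answered from the cache only."""
    return list(cache.get(device_id, ()))


def storage_worker():
    """Worker: drain the bus into long-term storage; returns how many messages were written."""
    written = 0
    while True:
        try:
            message = bus.get_nowait()
        except Empty:
            return written
        store.append(message)
        written += 1


publish("d1", {"temperature": 21})
publish("d1", {"temperature": 22})
publish("d2", {"humidity": 40})
written = storage_worker()
```

The point of the split is that the REST read path never waits on long-term storage, and the worker can fall behind or restart without affecting what clients see in the cache.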
Step 2: Deployment
So, let’s set up our project...
● Download and build sources
● Set up database
● Configure Glassfish
● Configure Kafka
● Configure Zookeeper
● Configure Cassandra (if needed of course...)
● Run
● Fail
Welcome to Juju
Mesos & Marathon
Pintostack
Containerization
● Every service is a container
● Container is more than just an app, it is an environment
● Container is a final build artifact
● Containers for HDFS nodes, Databases, Microservices, Web Applications
● Docker Containers are the first class citizens
Key Components
● Infrastructure: supply of resources
● Containers
● Resource abstraction
● Scheduling
● Service discovery
● Logging
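To make resource abstraction, scheduling and service discovery a little more concrete, here is a toy first-fit scheduler. It does not reflect how Mesos or Marathon actually place tasks: the `Node` and `Task` types, the first-fit rule, and the dictionary used as a service registry are all simplifying assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Resource abstraction: a machine reduced to its free CPU and memory."""
    name: str
    cpus: float
    mem_mb: int
    tasks: list = field(default_factory=list)


@dataclass
class Task:
    """A containerized service and the resources it asks for."""
    service: str
    cpus: float
    mem_mb: int


def schedule(task, nodes, registry):
    """Place the task on the first node with enough free resources (first fit).

    On success the placement is recorded in `registry`, a plain dict acting
    as a minimal service-discovery table. Returns the node name or None.
    """
    for node in nodes:
        if node.cpus >= task.cpus and node.mem_mb >= task.mem_mb:
            node.cpus -= task.cpus
            node.mem_mb -= task.mem_mb
            node.tasks.append(task.service)
            registry.setdefault(task.service, []).append(node.name)
            return node.name
    return None  # no node can host the task


nodes = [Node("node-1", cpus=2.0, mem_mb=2048), Node("node-2", cpus=4.0, mem_mb=8192)]
registry = {}
placements = [
    schedule(Task("kafka", 1.5, 1024), nodes, registry),
    schedule(Task("cassandra", 2.0, 4096), nodes, registry),
    schedule(Task("devicehive", 1.0, 1024), nodes, registry),
    schedule(Task("spark", 8.0, 1024), nodes, registry),  # larger than any node
]
```

Other services can then look up `registry["cassandra"]` instead of hard-coding a host, which is the basic idea behind service discovery in a containerized deployment.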
Step 3: Analysis and visualisation
Freeboard.io
● A web-based tool
● Fully customizable and interactive user-interfaces
● Real time dashboards
ELK (Elasticsearch + Logstash + Kibana)
Open source, scalable solution to search, analyze, and visualize your data, allowing you to get actionable insight in real time
Spark Streaming + Zeppelin
Apache Spark
Q&A?...