A real-time architecture using Hadoop and Storm at Devoxx
TRANSCRIPT
![Page 1: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/1.jpg)
@nathan_gs#DV13-#rtbigdata
A real-time architecture using Hadoop and Storm.
Nathan Bijnens
![Page 2: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/2.jpg)
Speaker
Nathan Bijnens
DataCrunchers
@nathan_gs
![Page 3: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/3.jpg)
Our Vision
Big Data
Volume
![Page 4: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/4.jpg)
Big Data
Velocity
![Page 5: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/5.jpg)
Our Vision
Volume
Variety
![Page 6: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/6.jpg)
Computing Trends
Source: Immutability Changes Everything - Pat Helland, RICON2012
Past
•Computation (CPUs): Expensive
•Disk Storage: Expensive
•Coordination: Easy (Latches Don’t Often Hit)
•DRAM: Expensive
Current
•Computation: Cheap (Many Core Computers)
•Disk Storage: Cheap (Cheap Commodity Disks)
•Coordination: Hard (Latches Stall a Lot, etc)
•DRAM / SSD: Getting Cheap
![Page 7: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/7.jpg)
Credits
Nathan Marz
•Ex-BackType & Twitter
•Startup in stealth mode
•Storm
•Cascalog
•ElephantDB
manning.com/marz
![Page 8: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/8.jpg)
A Data System
![Page 9: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/9.jpg)
Not all information is equal.
Some information is derived from other pieces of information.
Data is more than Information
![Page 10: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/10.jpg)
Eventually you will reach the most ‘raw’ form of information.
This is the information you hold true, simply because it exists.
Let’s call this ‘data’, very similar to ‘event’.
Data is more than Information
![Page 11: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/11.jpg)
Events were used to manipulate the master data.
Events - Before
![Page 12: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/12.jpg)
Today, events are the master data.
Events - After
![Page 13: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/13.jpg)
Let’s store everything.
Data System
![Page 14: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/14.jpg)
Data is Immutable
Events
![Page 15: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/15.jpg)
Data is Time Based
Events
![Page 16: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/16.jpg)
Capturing change traditionally
Person Location
Nathan Antwerp
Geert Dendermonde
John Ghent
Person Location
Nathan Ghent
Geert Dendermonde
John Ghent
![Page 17: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/17.jpg)
Capturing change
Person Location Time
Nathan Antwerp 2005-01-01
Geert Dendermonde 2011-10-08
John Ghent 2010-05-02
Nathan Ghent 2013-02-03
Person Location Timestamp
Nathan Antwerp 2005-01-01
Geert Dendermonde 2011-10-08
John Ghent 2010-05-02
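A minimal Python sketch of the idea on this slide: instead of overwriting Nathan's row, a new timestamped fact is appended. The `LocationEvent` class and the `location_at` helper are illustrative names, not part of the talk.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a recorded fact cannot be modified afterwards
class LocationEvent:
    person: str
    location: str
    time: date


# Append-only log holding the rows from the slide.
events = [
    LocationEvent("Nathan", "Antwerp", date(2005, 1, 1)),
    LocationEvent("Geert", "Dendermonde", date(2011, 10, 8)),
    LocationEvent("John", "Ghent", date(2010, 5, 2)),
]

# Nathan moves: append a new fact, keep the old one.
events.append(LocationEvent("Nathan", "Ghent", date(2013, 2, 3)))


def location_at(events, person, when):
    """Return the most recent known location of `person` on date `when`."""
    known = [e for e in events if e.person == person and e.time <= when]
    return max(known, key=lambda e: e.time).location if known else None
```

Because nothing is overwritten, `location_at(events, "Nathan", date(2012, 1, 1))` can still answer "Antwerp", which the traditional table can no longer do after the update.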
![Page 18: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/18.jpg)
The data you query is often transformed, aggregated, ...
It is rarely used in its original form.
Query
![Page 19: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/19.jpg)
Query
Query = function ( all data )
![Page 20: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/20.jpg)
Number of people living in each city.
Person Location Time
Nathan Antwerp 2005-01-01
Geert Dendermonde 2011-10-08
John Ghent 2010-05-02
Nathan Ghent 2013-02-03
Location Count
Ghent 2
Dendermonde 1
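The example above can be written as a plain function over all data. This is only a sketch in Python; the function name `people_per_city` is an assumption, and the rows are the ones shown on the slide.

```python
# (person, location, time) facts from the slide.
# ISO dates compare correctly as plain strings.
all_data = [
    ("Nathan", "Antwerp", "2005-01-01"),
    ("Geert", "Dendermonde", "2011-10-08"),
    ("John", "Ghent", "2010-05-02"),
    ("Nathan", "Ghent", "2013-02-03"),
]


def people_per_city(all_data):
    """Query = function(all data): count people by their latest location."""
    latest = {}  # person -> (location, time) of the most recent fact
    for person, location, time in all_data:
        if person not in latest or time > latest[person][1]:
            latest[person] = (location, time)

    counts = {}
    for location, _ in latest.values():
        counts[location] = counts.get(location, 0) + 1
    return counts


print(people_per_city(all_data))  # {'Dendermonde': 1, 'Ghent': 2}
```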
![Page 21: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/21.jpg)
Query
All Data Query
![Page 22: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/22.jpg)
Query: Precompute
All Data Query
Precomputed View
![Page 23: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/23.jpg)
Layered Architecture
Speed Layer
Batch Layer
Serving Layer
![Page 24: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/24.jpg)
Layered Architecture
Hadoop
ElephantDB
Query
Incoming Data
Cassandra
![Page 25: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/25.jpg)
Batch Layer
![Page 26: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/26.jpg)
Batch Layer
Hadoop
ElephantDB
Incoming Data
![Page 27: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/27.jpg)
Unrestrained computation.
Batch Layer
![Page 28: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/28.jpg)
No need to De-Normalize.
Batch Layer
![Page 29: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/29.jpg)
Horizontally scalable.
Batch Layer
![Page 30: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/30.jpg)
High latency.
Let’s pretend temporarily that update latency doesn’t matter.
Batch Layer
![Page 31: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/31.jpg)
Functional computation, based on immutable inputs, is idempotent.
Batch Layer
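A small Python sketch of why this matters in practice: re-running a pure view computation over the same immutable input gives the same view, while retrying an in-place increment double counts. The names `build_view` and `apply_increment` are illustrative assumptions.

```python
def build_view(master):
    """Pure function: the result depends only on the immutable input."""
    view = {}
    for city in master:
        view[city] = view.get(city, 0) + 1
    return view


# A tuple stands in for the immutable master data set.
master = ("Ghent", "Ghent", "Dendermonde")

first = build_view(master)
retry = build_view(master)  # e.g. the job is re-run after a failure
assert first == retry       # same input, same view


def apply_increment(view, master):
    """Contrast: mutate an existing view in place."""
    for city in master:
        view[city] = view.get(city, 0) + 1


mutable_view = {}
apply_increment(mutable_view, master)
apply_increment(mutable_view, master)  # an accidental retry counts everything twice
```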
![Page 32: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/32.jpg)
Stores master copy of data set... append only.
Batch Layer
![Page 33: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/33.jpg)
Batch Layer
![Page 34: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/34.jpg)
Batch: View generation
Master Dataset
View #1
View #3
View #2
MapReduce
MapReduce
MapReduce
![Page 35: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/35.jpg)
1. Take a large data set and divide it into subsets
2. Perform the same function on all subsets
3. Combine the output from all subsets
MapReduce
[Diagram: DoWork() DoWork() DoWork() … → Output]
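The three steps can be sketched in plain Python, without Hadoop. This is a single-process illustration only; `split`, `do_work` and `combine` are assumed names, and a real MapReduce job would run the subsets on different machines.

```python
from collections import Counter
from functools import reduce


def split(records, n):
    """Step 1: divide the data set into n subsets."""
    return [records[i::n] for i in range(n)]


def do_work(subset):
    """Step 2: the same function on every subset (here: count per city)."""
    return Counter(subset)


def combine(left, right):
    """Step 3: combine the partial outputs."""
    return left + right


records = ["Ghent", "Antwerp", "Ghent", "Dendermonde", "Ghent", "Antwerp"]

partials = [do_work(subset) for subset in split(records, 3)]
output = reduce(combine, partials, Counter())
```

Because `do_work` only looks at its own subset and `combine` is associative, the subsets can be processed in any order or in parallel.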
![Page 36: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/36.jpg)
MapReduce
![Page 37: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/37.jpg)
Serialization & Schema
Catch errors as quickly as they happen. Validation on write vs on read.
![Page 38: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/38.jpg)
Serialization & Schema
CSV is actually a serialization language that is just poorly defined.
![Page 39: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/39.jpg)
Serialization & Schema
•Use a format with a schema.
•Thrift
•Avro
•Protocol Buffers
•Added bonus: it’s faster & uses less space.
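To illustrate validation on write, here is a sketch using only the Python standard library. It does not use Thrift, Avro or Protocol Buffers; the `LocationFact` class and the `write` helper are hypothetical stand-ins for what a schema-based format would enforce.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LocationFact:
    person: str
    location: str
    time: date

    def __post_init__(self):
        # Reject malformed records when they are written, not when they are read.
        if not isinstance(self.person, str) or not self.person:
            raise ValueError("person must be a non-empty string")
        if not isinstance(self.location, str) or not self.location:
            raise ValueError("location must be a non-empty string")
        if not isinstance(self.time, date):
            raise ValueError("time must be a date")


master_dataset = []


def write(fields):
    """Validate first, then append to the master data set."""
    master_dataset.append(LocationFact(**fields))


write({"person": "Nathan", "location": "Ghent", "time": date(2013, 2, 3)})

try:
    # The time is a string instead of a date: rejected at write time.
    write({"person": "Geert", "location": "Dendermonde", "time": "2011-10-08"})
except ValueError as err:
    rejected = str(err)
```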
![Page 40: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/40.jpg)
Read-only database.
No random writes required.
Batch View Database
![Page 41: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/41.jpg)
Every iteration produces the Views from scratch.
Batch View Database
![Page 42: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/42.jpg)
Batch View Database
•ElephantDB
•Splout
•Voldemort
•…
![Page 43: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/43.jpg)
Batch Layer
[Timeline: Data absorbed into Batch Views | Not yet absorbed. Axis: Time → Now]
We are not done yet…
Just a few hours of data.
![Page 44: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/44.jpg)
Speed Layer
![Page 45: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/45.jpg)
Overview
Hadoop
ElephantDB
Incoming Data
Cassandra
![Page 46: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/46.jpg)
Stream processing.
Speed Layer
![Page 47: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/47.jpg)
Continuous computation.
Speed Layer
![Page 48: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/48.jpg)
Transactional.
Speed Layer
![Page 49: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/49.jpg)
Storing a limited window of data.
Compensating for the last few hours of data.
Speed Layer
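One possible way to keep only a limited window, sketched in Python: the real-time view is bucketed by hour, and buckets are dropped once the batch views cover them. The `RealTimeView` class and the hourly bucketing are assumptions for illustration, not something the talk prescribes.

```python
class RealTimeView:
    """Counts per city, bucketed by hour so old buckets can be dropped."""

    def __init__(self):
        self.buckets = {}  # hour -> {city: count}

    def update(self, hour, city):
        """Incrementally update the view for one incoming event."""
        counts = self.buckets.setdefault(hour, {})
        counts[city] = counts.get(city, 0) + 1

    def expire(self, absorbed_up_to):
        """Drop every bucket that the batch views already cover."""
        for hour in [h for h in self.buckets if h <= absorbed_up_to]:
            del self.buckets[hour]

    def count(self, city):
        return sum(counts.get(city, 0) for counts in self.buckets.values())


view = RealTimeView()
view.update(10, "Ghent")
view.update(11, "Ghent")
view.update(12, "Antwerp")

# The batch layer has now absorbed everything up to and including hour 10.
view.expire(absorbed_up_to=10)
```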
![Page 50: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/50.jpg)
All the complexity is isolated in the Speed layer.
If anything goes wrong, it’s auto-corrected.
Speed Layer
![Page 51: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/51.jpg)
CAP
You have a choice between:
•Availability
•Queries are eventually consistent.
•Consistency
•Queries are consistent.
Consistency
Availability
Partition Tolerance
![Page 52: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/52.jpg)
Some algorithms are hard to implement in real time. For those
cases we could estimate the results.
Eventual accuracy
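As one example of estimating a result, counting distinct items exactly needs a set that grows with the data, while an estimator can use fixed memory. The sketch below uses linear counting, a technique chosen here for illustration and not named in the talk; the `LinearCounter` class and the bitmap size are assumptions.

```python
import hashlib
import math


class LinearCounter:
    """Approximate distinct count with a fixed-size bitmap (linear counting)."""

    def __init__(self, m=4096):
        self.m = m
        self.bits = bytearray(m)  # one byte per slot, for clarity

    def add(self, item):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        slot = int.from_bytes(digest[:8], "big") % self.m
        self.bits[slot] = 1

    def estimate(self):
        zeros = self.m - sum(self.bits)
        if zeros == 0:
            return float("inf")  # bitmap saturated, choose a larger m
        return -self.m * math.log(zeros / self.m)


counter = LinearCounter()
for i in range(1000):
    counter.add(f"user-{i}")
    counter.add(f"user-{i}")  # duplicates hit the same slot and change nothing
```

In the architecture described here, such an estimate would only have to hold until the batch layer recomputes the exact value.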
![Page 53: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/53.jpg)
Speed Layer
Incoming Data
Real Time
View 1
Real Time
View 2
![Page 54: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/54.jpg)
Storm
•Message passing.
•Distributed processing.
•Horizontally scalable.
•Incremental algorithms.
•Fast.
•Data in motion.
![Page 55: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/55.jpg)
Storm
![Page 56: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/56.jpg)
Storm
•Tuple
•Stream
![Page 57: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/57.jpg)
Storm
•Spout
•Bolt
![Page 58: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/58.jpg)
Storm
•Grouping
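The concepts on these slides (tuple, stream, spout, bolt, grouping) can be mimicked in a few lines of plain Python. This is a toy, single-process model and not the Storm API; `spout`, `CountBolt` and `fields_grouping` are illustrative names.

```python
import zlib
from collections import defaultdict


def spout():
    """Source of a stream: emits tuples, here (person, city)."""
    yield from [
        ("Nathan", "Ghent"),
        ("Geert", "Dendermonde"),
        ("John", "Ghent"),
    ]


class CountBolt:
    """Processes tuples one at a time and keeps incremental state."""

    def __init__(self):
        self.counts = defaultdict(int)

    def execute(self, tup):
        _, city = tup
        self.counts[city] += 1


def fields_grouping(value, n_tasks):
    """Route tuples with the same field value to the same bolt task."""
    return zlib.crc32(value.encode("utf-8")) % n_tasks


tasks = [CountBolt() for _ in range(2)]
for tup in spout():
    tasks[fields_grouping(tup[1], len(tasks))].execute(tup)
```

Grouping on the city field means all tuples for one city reach the same task, so each task can keep a correct count without coordinating with the others.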
![Page 59: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/59.jpg)
Data Ingestion
•Kafka
•Flume
•Scribe
•*MQ
•…
![Page 60: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/60.jpg)
Speed Layer Views
•The views are stored in a read & write database.
•Cassandra
•HBase
•Redis
•MySQL
•ElasticSearch
•…
•Much more complex than a read only view.
![Page 61: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/61.jpg)
Serving Layer
![Page 62: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/62.jpg)
Overview
Hadoop
ElephantDB
Query
Incoming Data
Cassandra
![Page 63: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/63.jpg)
Random reads
Serving Layer
![Page 64: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/64.jpg)
This layer queries the Batch & Real Time views and merges them.
Serving Layer
![Page 65: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/65.jpg)
Serving Layer
Real Time Views
Merge
Batch Views
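For an additive metric such as an event count, the merge can be as simple as adding the two views. A minimal Python sketch, with plain dictionaries standing in for the batch view database and the real-time store, and made-up numbers:

```python
def merged_count(batch_view, realtime_view, city):
    """Serve a query by combining the batch view with the real-time view."""
    return batch_view.get(city, 0) + realtime_view.get(city, 0)


# Hypothetical view contents: events per city.
batch_view = {"Ghent": 2, "Dendermonde": 1}  # everything absorbed so far
realtime_view = {"Ghent": 1, "Antwerp": 1}   # only the last few hours

print(merged_count(batch_view, realtime_view, "Ghent"))  # 3
```

This assumes the two views never cover the same events, which is why the real-time view only compensates for data the batch layer has not absorbed yet.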
![Page 66: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/66.jpg)
How to query an Average?
Serving Layer
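One common answer, offered here as a sketch rather than as the speaker's solution: two averages cannot be merged directly, so each view stores a (sum, count) pair and the division happens at query time. The numbers below are made up.

```python
def merge_average(batch, realtime):
    """Merge two (sum, count) pairs and compute the average at query time."""
    total = batch[0] + realtime[0]
    count = batch[1] + realtime[1]
    return total / count if count else None


batch = (900.0, 90)     # (sum, count) in the batch view, average 10.0
realtime = (200.0, 10)  # (sum, count) in the real-time view, average 20.0

correct = merge_average(batch, realtime)  # 1100 / 100 = 11.0

# Averaging the two averages ignores how many values each view holds.
naive = (batch[0] / batch[1] + realtime[0] / realtime[1]) / 2  # 15.0
```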
![Page 67: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/67.jpg)
Overview
![Page 68: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/68.jpg)
Overview
Hadoop
ElephantDB
Query
Incoming Data
Cassandra
![Page 69: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/69.jpg)
CQRS
Source: martinfowler.com/bliki/CQRS.html – Martin Fowler
![Page 70: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/70.jpg)
Lambda Architecture
![Page 71: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/71.jpg)
Lambda Architecture
You can discard any view, batch or real time, and recreate everything from the master data.
![Page 72: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/72.jpg)
Lambda Architecture
Mistakes are corrected via recomputation.
Write bad data? Remove the data & recompute.
Bug in view generation? Just recompute the view.
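Both cases can be sketched in Python. The master data, the "N/A" record and the case-sensitivity bug are invented for the illustration; `build_view` is an assumed helper name.

```python
def build_view(master, key_fn):
    """Recompute a count view from scratch over the master data."""
    view = {}
    for record in master:
        key = key_fn(record)
        view[key] = view.get(key, 0) + 1
    return view


# Hypothetical master data; "N/A" is a bad record that was written by mistake.
master = ["ghent", "Ghent", "Dendermonde", "N/A"]

# Bug in view generation: keys are case-sensitive, so Ghent is split in two.
buggy_view = build_view(master, lambda record: record)

# Write bad data? Remove the data.
clean_master = [record for record in master if record != "N/A"]

# Fix the bug, throw the old view away and recompute it.
fixed_view = build_view(clean_master, lambda record: record.capitalize())
```

No repair script has to patch `buggy_view` in place; the corrected view is simply a new result of the same kind of computation.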
![Page 73: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/73.jpg)
Lambda Architecture
Data storage is highly optimized.
![Page 74: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/74.jpg)
Lambda Architecture
Immutability changes everything.
![Page 75: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/75.jpg)
Questions?
@nathan_gs & #DV13
slideshare.net/nathan_gs
![Page 76: a real-time architecture using Hadoop and Storm at Devoxx](https://reader038.vdocuments.net/reader038/viewer/2022110113/541023e88d7f72aa0e8b45f4/html5/thumbnails/76.jpg)
DataCrunchers
We enable companies in envisioning, defining and implementing a data strategy.
A one-stop-shop for all your Big Data needs.
The first Big Data Consultancy agency in Belgium.