aws kinesis
SCRATCHING THE SURFACE OF AWS KINESIS
QUESTION TIME
QUESTION
What (Kafka?) cluster do you need to ingest:
- 10,000 records/sec
- record size 512 bytes
KINESIS
AWS KINESIS
WHAT IS KINESIS?
▸ Platform for streaming data on AWS
▸ "sometimes TBs per hour"
MOTIVATION
▸ Highly Scalable
▸ Durable
▸ Elastic
▸ Replay-able reads
KINESIS COMPONENTS
▸ Streams
▸ Firehose
▸ Analytics
KINESIS STREAMS
AWS KINESIS STREAMS
TERMINOLOGY
▸ Streams - ordered sequence of data records
▸ Data record - Sequence Number, Partition Key, Data Blob (1MB max)
▸ Retention period - 24h - 7d
▸ Producers, Consumers
▸ Shards
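How the terms above fit together: Kinesis hashes each record's partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. The sketch below illustrates the idea only. The function name `shard_for_key` and the assumption that the hash space is split evenly across shards are illustrative, not taken from the slides.

```python
import hashlib

HASH_SPACE = 2 ** 128  # an MD5 digest is a 128-bit integer


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard that would receive this partition key.

    Assumption: the shards split the hash space into equal, contiguous
    ranges, which is the layout of a freshly created stream.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # Clamp so the remainder at the top of the hash space lands in the last shard.
    return min(hash_value // range_size, shard_count - 1)


# The same key always maps to the same shard, which is what keeps
# records for one key in order.
print(shard_for_key("user-33", 10) == shard_for_key("user-33", 10))
```

A practical consequence: a single hot partition key cannot be spread across shards, so keys should be chosen with enough variety to use all shards.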
KINESIS STREAMS - SHARDS
▸ Fixed unit of capacity
▸ Read (per shard): 5 transactions / sec, up to 2MB / sec
▸ Write (per shard): 1,000 records / sec, up to 1MB / sec
QUESTION
What cluster do you need to ingest
- 10,000 records/sec
- record size 512 bytes
KINESIS STREAMS - SHARD CALCULATION
Requirement                            Kinesis Stream Write Capacity (per shard)
10,000 records/sec                     1,000 records/sec
512 bytes/record (~5MB/sec total)      1MB/sec
▸ 10 Shards
▸ 10,000 records / sec
▸ 10MB / sec
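The shard calculation above can be sketched as a small function: the stream needs enough shards to satisfy both the record-count limit and the byte-throughput limit, so the larger of the two requirements wins. The function name `shards_needed` is illustrative, and treating 1MB as 1024 * 1024 bytes is an assumption.

```python
import math

# Per-shard write limits from the slides.
SHARD_WRITE_RECORDS_PER_SEC = 1000
SHARD_WRITE_BYTES_PER_SEC = 1024 * 1024  # assumption: 1MB = 1024 * 1024 bytes


def shards_needed(records_per_sec: int, record_size_bytes: int) -> int:
    """Minimum shard count that satisfies both write limits."""
    by_record_count = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    total_bytes_per_sec = records_per_sec * record_size_bytes
    by_throughput = math.ceil(total_bytes_per_sec / SHARD_WRITE_BYTES_PER_SEC)
    return max(by_record_count, by_throughput, 1)


# 10,000 records/sec at 512 bytes each: the record-count limit asks for
# 10 shards, the throughput limit for 5, so 10 shards are needed.
print(shards_needed(10_000, 512))  # prints 10
```

With small records the record-count limit dominates; with large records (for example 4KB at 1,000 records/sec) the throughput limit takes over.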
DEMO
▸ Create a stream with 1 shard
▸ Put 10 records / sec
▸ Read in batches every sec
{ RecordId: 11, KeyPressCount: 22, UserId: 33 }
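A producer for a demo like this could look roughly like the sketch below. It is not the code used in the talk: the stream name `demo-stream`, the helper names, and the choice of `UserId` as partition key are all assumptions, and the field values simply reuse the sample record above. The only AWS call is boto3's `put_records`, imported lazily so the batching logic can be tried without AWS credentials.

```python
import json
import time


def build_entries(first_record_id: int, count: int = 10) -> list:
    """Build one second's worth of PutRecords entries (10 records by default)."""
    entries = []
    for offset in range(count):
        # Placeholder values taken from the sample record on the slide.
        payload = {"RecordId": first_record_id + offset, "KeyPressCount": 22, "UserId": 33}
        entries.append({
            "Data": json.dumps(payload).encode("utf-8"),
            # Assumption: records are partitioned by user.
            "PartitionKey": str(payload["UserId"]),
        })
    return entries


def run_producer(stream_name: str = "demo-stream", seconds: int = 5) -> None:
    """Send 10 records per second to an existing stream (needs AWS credentials)."""
    import boto3  # imported here so build_entries works without boto3 installed

    kinesis = boto3.client("kinesis")
    next_id = 0
    for _ in range(seconds):
        response = kinesis.put_records(StreamName=stream_name, Records=build_entries(next_id))
        print("failed records:", response["FailedRecordCount"])
        next_id += 10
        time.sleep(1)
```

At 10 small records per second this stays far below the single shard's write limits, so one shard is enough for the demo.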
KINESIS FIREHOSE
AWS KINESIS FIREHOSE
TERMINOLOGY
▸ Firehose delivery stream
▸ record - 1MB max
▸ data producer
▸ buffer size, buffer interval
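Buffer size and buffer interval work together: Firehose delivers a batch as soon as either threshold is reached, whichever comes first. The sketch below simulates that rule in memory. The class `DeliveryBuffer` is hypothetical, the thresholds are tiny for illustration, and unlike the real service it only checks the interval when a new record arrives.

```python
class DeliveryBuffer:
    """In-memory simulation of size-or-interval buffering (not the Firehose API)."""

    def __init__(self, max_bytes: int, max_seconds: float):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.records = []
        self.size = 0
        self.opened_at = None  # arrival time of the first buffered record

    def add(self, record: bytes, now: float):
        """Buffer a record; return the flushed batch if a threshold was hit, else None."""
        if not self.records:
            self.opened_at = now
        self.records.append(record)
        self.size += len(record)
        too_big = self.size >= self.max_bytes
        too_old = now - self.opened_at >= self.max_seconds
        if too_big or too_old:
            batch = self.records
            self.records, self.size, self.opened_at = [], 0, None
            return batch
        return None


buf = DeliveryBuffer(max_bytes=10, max_seconds=60)
buf.add(b"aaaa", now=0)
buf.add(b"bbbb", now=1)
print(buf.add(b"cccc", now=2))  # size threshold reached: all three records are flushed
```

Larger buffers mean fewer, bigger objects at the destination but more delay before data shows up; smaller ones trade that the other way.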
DATA DELIVERY
▸ KINESIS STREAM
▸ S3
▸ Redshift
▸ Elasticsearch
DEMO
KINESIS ANALYTICS
AWS KINESIS ANALYTICS
TERMINOLOGY
▸ Input
▸ Application Code
▸ In-App-Streams
▸ Pumps
▸ Streaming SQL
▸ Output
STREAMING SQL
▸ Tumbling Window
[...] GROUP BY FLOOR(("SOURCE_SQL_STREAM_001".ROWTIME - TIMESTAMP '1970-01-01 00:00:00') SECOND / 10 TO SECOND)
▸ Sliding Window
SELECT AVG(change) OVER W1 as avg_change FROM "SOURCE_SQL_STREAM_001" WINDOW W1 AS (PARTITION BY ticker_symbol RANGE INTERVAL '10' SECOND PRECEDING)
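To make the difference between the two window types concrete, here is a plain-Python sketch of their semantics, not of how Kinesis Analytics executes them. A tumbling window assigns each row to exactly one fixed 10-second bucket, while a sliding window recomputes the aggregate for every row over the preceding 10 seconds. The function names, the use of integer epoch seconds, the count aggregate in the tumbling case, and the inclusive window boundary are assumptions.

```python
from collections import defaultdict, deque


def tumbling_counts(events, width=10):
    """events: (epoch_seconds, value) pairs. Returns {window_start: row count}."""
    counts = defaultdict(int)
    for ts, _value in events:
        # Same idea as FLOOR(rowtime / 10): each row falls into exactly one bucket.
        counts[(ts // width) * width] += 1
    return dict(counts)


def sliding_avg(events, width=10):
    """events: time-ordered (epoch_seconds, ticker, change) rows.

    Returns one (ts, ticker, avg_change) per input row, averaging that
    ticker's rows from the preceding `width` seconds, current row included.
    """
    windows = defaultdict(deque)  # one window per ticker (PARTITION BY ticker_symbol)
    results = []
    for ts, ticker, change in events:
        window = windows[ticker]
        window.append((ts, change))
        while window[0][0] < ts - width:  # drop rows older than the range
            window.popleft()
        results.append((ts, ticker, sum(c for _, c in window) / len(window)))
    return results


print(tumbling_counts([(0, "a"), (9, "b"), (10, "c"), (25, "d")]))
# prints {0: 2, 10: 1, 20: 1}
print(sliding_avg([(0, "X", 1.0), (5, "X", 3.0), (12, "X", 5.0)]))
# prints [(0, 'X', 1.0), (5, 'X', 2.0), (12, 'X', 4.0)]
```

Tumbling windows emit one result per bucket; sliding windows emit one result per incoming row.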
STREAMING SQL - TUMBLING WINDOW
DEMO
▸ Check BDMeetup Application in AWS Console
▸ Producer / Consumer
PRODUCER APP → BDMEETUP STREAM → ANALYTICS APP → BDMEETUP-OUTPUT STREAM → CONSUMER APP
THANK YOU