aws kinesis

SCRATCHING THE SURFACE OF AWS KINESIS

Upload: szilveszter-molnar

Post on 22-Jan-2018

TRANSCRIPT

Page 1: Aws Kinesis

SCRATCHING THE SURFACE OF AWS KINESIS

Page 2: Aws Kinesis

QUESTION TIME

Page 3: Aws Kinesis

QUESTION

What (Kafka?) cluster do you need to ingest:

- 10,000 records/sec

- record size 512 Bytes

Page 4: Aws Kinesis

KINESIS

Page 5: Aws Kinesis

AWS KINESIS

WHAT IS KINESIS?

▸ Platform for streaming data on AWS

▸ "sometimes TBs per hour"

Page 6: Aws Kinesis

AWS KINESIS

MOTIVATION

▸ Highly Scalable

▸ Durable

▸ Elastic

▸ Replay-able reads

Page 7: Aws Kinesis

AWS KINESIS

KINESIS COMPONENTS

Streams    Firehose    Analytics

Page 8: Aws Kinesis

KINESIS STREAMS

Page 9: Aws Kinesis

AWS KINESIS STREAMS

Page 10: Aws Kinesis

AWS KINESIS STREAMS

TERMINOLOGY

▸ Streams - ordered sequence of data records

▸ Data record - Sequence Number, Partition Key, Data Blob

▸ 1MB max

▸ Retention period - 24h - 7d

▸ Producers, Consumers

▸ Shards

Page 11: Aws Kinesis

AWS KINESIS STREAMS

KINESIS STREAMS - SHARDS

▸ Fixed unit of capacity

▸ Read

▸ 5 transactions / sec

▸ 2MB / sec

▸ Write

▸ 1000 records / sec

▸ 1MB / sec

Page 12: Aws Kinesis

QUESTION

What cluster do you need to ingest:

- 10,000 records/sec

- record size 512 Bytes

Page 13: Aws Kinesis

AWS KINESIS STREAMS

KINESIS STREAMS - SHARD CALCULATION

Requirement                           Kinesis Stream Write Capacity (per shard)

10,000 records/sec                    1,000 records/sec

512 Bytes/record (~5MB/sec total)     1MB/sec

Page 14: Aws Kinesis

AWS KINESIS STREAMS

KINESIS STREAMS - SHARD CALCULATION

▸ 10 Shards

▸ 10,000 records / sec write capacity

▸ 10MB / sec write capacity
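
The shard calculation can be sketched in a few lines of Python. This is a minimal sketch that uses only the per-shard write limits quoted on the earlier slide (1,000 records/sec and 1MB/sec); the function name `shards_needed` is hypothetical, and treating 1MB as 1,000,000 bytes is an assumption.

```python
# Per-shard write limits as quoted on the SHARDS slide.
WRITE_RECORDS_PER_SHARD = 1000
WRITE_BYTES_PER_SHARD = 1_000_000  # assumption: 1MB taken as 10^6 bytes


def ceil_div(a, b):
    """Integer ceiling division, avoiding float rounding."""
    return -(-a // b)


def shards_needed(records_per_sec, record_size_bytes):
    """Return the shard count that satisfies both write limits."""
    by_count = ceil_div(records_per_sec, WRITE_RECORDS_PER_SHARD)
    by_bytes = ceil_div(records_per_sec * record_size_bytes, WRITE_BYTES_PER_SHARD)
    # The stricter of the two limits decides; a stream has at least one shard.
    return max(by_count, by_bytes, 1)


print(shards_needed(10_000, 512))
```

For the question in this talk the record-count limit dominates: the byte rate alone would need 6 shards, the record rate needs 10.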

Page 15: Aws Kinesis

DEMO

Page 16: Aws Kinesis

AWS KINESIS STREAMS

DEMO

▸ Create a stream with 1 shard

▸ Put 10 records / sec

▸ Read in batches every sec

{ RecordId: 11, KeyPressCount: 22, UserId: 33 }
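
A rough sketch of what the demo producer and consumer might look like in Python. This is not the code shown in the talk. The helpers `make_record`, `put_records` and `read_batch` are hypothetical; the `client` argument is assumed to be a boto3 Kinesis client, and only its `put_record` and `get_records` calls are used. Choosing `UserId` as the partition key is also an assumption.

```python
import json


def make_record(record_id, key_press_count, user_id):
    """Build one demo record with the fields shown on the slide."""
    return {"RecordId": record_id, "KeyPressCount": key_press_count, "UserId": user_id}


def put_records(client, stream_name, records):
    """Send records one by one; the partition key decides the target shard."""
    for rec in records:
        client.put_record(
            StreamName=stream_name,
            Data=json.dumps(rec).encode("utf-8"),
            PartitionKey=str(rec["UserId"]),  # assumption: partition by user
        )


def read_batch(client, shard_iterator, limit=100):
    """Fetch one batch and return (decoded records, iterator for the next call)."""
    resp = client.get_records(ShardIterator=shard_iterator, Limit=limit)
    records = [json.loads(r["Data"]) for r in resp["Records"]]
    return records, resp.get("NextShardIterator")
```

With real AWS credentials, `client` would come from `boto3.client("kinesis")`, and the first iterator from `client.get_shard_iterator(StreamName=..., ShardId=..., ShardIteratorType="TRIM_HORIZON")["ShardIterator"]`. Calling `read_batch` once per second stays well under the 5 read transactions / sec limit of a single shard.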

Page 17: Aws Kinesis

KINESIS FIREHOSE

Page 18: Aws Kinesis

AWS KINESIS FIREHOSE

Page 19: Aws Kinesis

AWS KINESIS FIREHOSE

TERMINOLOGY

▸ Firehose delivery stream

▸ record - 1MB max

▸ data producer

▸ buffer size, buffer interval
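
Buffer size and buffer interval work together: a batch is delivered when either threshold is reached, whichever comes first. The toy class below only illustrates that rule. `BufferSketch` is a hypothetical name, not part of any AWS SDK, and unlike the real service it checks the interval only when a new record arrives.

```python
class BufferSketch:
    """Toy model of size-or-interval buffering (not the Firehose implementation)."""

    def __init__(self, max_bytes, max_seconds):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.items = []
        self.size = 0
        self.started = None  # arrival time of the first buffered record

    def add(self, record, now):
        """Buffer one record; return the flushed batch if a threshold is hit, else None."""
        if self.started is None:
            self.started = now
        self.items.append(record)
        self.size += len(record)
        size_hit = self.size >= self.max_bytes
        time_hit = now - self.started >= self.max_seconds
        if size_hit or time_hit:
            batch = self.items
            self.items, self.size, self.started = [], 0, None
            return batch
        return None
```

Passing `now` explicitly keeps the sketch deterministic; a real producer would use a clock.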

Page 20: Aws Kinesis

AWS KINESIS FIREHOSE

DATA DELIVERY

▸ KINESIS STREAM

▸ S3

▸ Redshift

▸ Elasticsearch

Page 21: Aws Kinesis

DEMO

Page 22: Aws Kinesis

KINESIS ANALYTICS

Page 23: Aws Kinesis

AWS KINESIS ANALYTICS

Page 24: Aws Kinesis

AWS KINESIS ANALYTICS

Page 25: Aws Kinesis

AWS KINESIS ANALYTICS

TERMINOLOGY

▸ Input

▸ Application Code

▸ In-App-Streams

▸ Pumps

▸ Streaming SQL

▸ Output

Page 26: Aws Kinesis

AWS KINESIS ANALYTICS

STREAMING SQL

▸ Tumbling Window

[...] GROUP BY FLOOR(("SOURCE_SQL_STREAM_001".ROWTIME - TIMESTAMP '1970-01-01 00:00:00') SECOND / 10 TO SECOND)

▸ Sliding Window

SELECT AVG(change) OVER W1 as avg_change FROM "SOURCE_SQL_STREAM_001" WINDOW W1 AS (PARTITION BY ticker_symbol RANGE INTERVAL '10' SECOND PRECEDING)
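
The difference between the two window types can be illustrated outside SQL. The Python sketch below is only an analogy, not how Kinesis Analytics evaluates queries. It assumes events are `(timestamp_seconds, value)` pairs sorted by time, ignores the `PARTITION BY ticker_symbol` clause, and uses hypothetical function names.

```python
from collections import defaultdict


def tumbling_counts(events, width=10):
    """Count events per fixed, non-overlapping window, keyed by window start."""
    counts = defaultdict(int)
    for ts, _ in events:
        # Same idea as FLOOR(ROWTIME / 10 seconds): bucket by window start.
        counts[(ts // width) * width] += 1
    return dict(counts)


def sliding_avg(events, width=10):
    """For each event, average the values seen in the preceding `width` seconds."""
    result = []
    for i, (ts, _) in enumerate(events):
        window = [v for t, v in events[: i + 1] if ts - t <= width]
        result.append(sum(window) / len(window))
    return result


events = [(1, 2.0), (4, 4.0), (12, 6.0), (25, 8.0)]
print(tumbling_counts(events))  # {0: 2, 10: 1, 20: 1}
print(sliding_avg(events))      # [2.0, 3.0, 5.0, 8.0]
```

A tumbling window emits one result per bucket, while a sliding window emits one result per incoming row.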

Page 27: Aws Kinesis

AWS KINESIS ANALYTICS

STREAMING SQL - TUMBLING WINDOW

Page 28: Aws Kinesis

DEMO

Page 29: Aws Kinesis

AWS KINESIS ANALYTICS

DEMO

▸ Check BDMeetup Application in AWS Console

▸ Producer / Consumer

PRODUCER APP -> BDMEETUP STREAM -> ANALYTICS APP -> BDMEETUP-OUTPUT STREAM -> CONSUMER APP

Page 30: Aws Kinesis

THANK YOU