aws streaming data solution for amazon kinesis ......$0.02 total: $129.73 / month option 3 the...

41
AWS Streaming Data Solution for Amazon Kinesis Implementation Guide

Upload: others

Post on 15-Mar-2021

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming DataSolution for Amazon Kinesis

Implementation Guide

Page 2: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

AWS Streaming Data Solution for Amazon Kinesis: ImplementationGuideCopyright © Amazon Web Services, Inc. and/or its affiliates. All rights reserved.

Amazon's trademarks and trade dress may not be used in connection with any product or service that is notAmazon's, in any manner that is likely to cause confusion among customers, or in any manner that disparages ordiscredits Amazon. All other trademarks not owned by Amazon are the property of their respective owners, who mayor may not be affiliated with, connected to, or sponsored by Amazon.

Page 3: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Table of ContentsWelcome .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1Cost ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

Option 1 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3Option 2 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3Option 3 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4Option 4 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

Architecture .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6Option 1 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6Option 2 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7Option 3 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7Option 4 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

Components .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10Option 1 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Demo consumer application .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10CloudWatch dashboards and alerts ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Option 2 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11Demo producer application .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11Demo consumer application .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11CloudWatch dashboards and alerts ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

Option 3 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13CloudWatch dashboards and alerts ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

Option 4 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14Demo producer application .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14Demo consumer application .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15CloudWatch dashboards and alerts ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

Custom resources .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15Templates .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16Deployment .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Prerequisites ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17Option 1 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Launch the Stack .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17Option 2 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

Deployment overview .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19Step 1. Launch the Stack .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19Step 2. Post-configuration steps .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

Option 3 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22Launch the Stack .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Option 4 .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24Deployment overview .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24Step 1. Launch the Stack .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24Step 2. Post-configuration steps .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

Security ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28IAM roles .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28Security groups .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28Auditing .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

Resources .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29Kinesis Producer Library .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30Uninstall the solution .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

Using the AWS Management Console .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31Using AWS Command Line Interface .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31Deleting the Amazon S3 Buckets ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31Deleting the CloudWatch Logs .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31Deleting the Amazon Cognito User Pool ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

Operational metrics ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

iii

Page 4: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Source code .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34Contributors ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35Revisions .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36Notices .... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

iv

Page 5: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Deployment framework forcapturing, storing, processing, anddelivering real-time streaming data

AWS Implementation Guide

AWS Solutions Builder Team

August 2020 (last update (p. 36): July 2021)

The AWS Streaming Data Solution for Amazon Kinesis enables you to capture, store, process, and deliverreal-time streaming data. By automatically configuring the included AWS services, this solution helps youaddress real-time streaming use cases, for example:

• Capture high volume application log files• Analyze website clickstreams• Process database event streams• Track financial transactions• Aggregate social media feeds• Collect IT log files• Continuously deliver to a data lake

This solution helps accelerate your development lifecycle by minimizing or eliminating the need tomodel and provision resources using AWS CloudFormation, setup preconfigured Amazon CloudWatchalarms set to recommended thresholds, dashboards, and logging, and manually implement streamingdata best practices. This solution is data and logic agnostic, meaning that you can start with boilerplatecode and then customize it to your needs.

The solution uses templates where data flows through producers, streaming storage, consumers, anddestinations. Producers continuously generate data and send it to streaming storage where it is durablycaptured and made available for processing by a data consumer. Data consumers process the data andthen send it to a destination.

To support multiple use cases and business needs, this solution offers four AWS CloudFormationtemplates. You can use this solution to test new service combinations as the basis for your productionenvironment, and to improve existing applications.

1. Option 1 captures data from non-AWS environments (for example, mobile clients). This optionuses an Amazon API Gateway as a layer of abstraction, which enables you to implement customauthentication approaches for data producers, control quotas for specific producers, and change thetarget Kinesis stream. This template uses AWS Lambda as the data consumer, which is best suited foruse cases that don’t require internal state like filtering, business event processing, and data cleansing.AWS Lambda offers a small surface area for error scenarios and is simple to scale and operate.

2. Option 2 is intended for use cases such as streaming extract-transform-load (ETL), real-time analytics,predictive analytics, and machine learning. It uses Apache Flink and provides a fully managedservice to handle backups for snapshots, a Kinesis Data Analytics implementation of an Apache FlinkSavepoint, automatically. This option also supports the Amazon Kinesis Producer Library (KPL), whichis best suited when you control the code that is written to the Kinesis Data Streams. This control

1

Page 6: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

enables you to implement cost optimizations through buffering on the data producer and has fewerresources to manage compared to alternatives. Kinesis Data Analytics for Apache Flink is used asthe data consumer, which is best suited when you require capabilities such as durable applicationand exactly-once processing, that are very efficient processes for high volume data streams with lowlatency and high availability.

3. Option 3 uses Amazon Kinesis Data Firehose. Use this option when you want a simple way to backup incoming streaming data with minimal administration for the processing layer and ability to senddata into Amazon Simple Storage Service (among other destinations) in near real time. Kinesis DataFirehose takes care of compression and encryption, minimizing the amount of storage used at thedestination and increasing security.

4. Option 4 uses Apache Flink, and showcases how to asynchronously invoke an external endpoint in astreaming application, for example, when you want to enrich or filter incoming events. The externalAPI can be any integration supported by API Gateway, such as a Lambda function or an AmazonSageMaker endpoint.

All templates are configured to apply best practices to monitor functionality using dashboards andalarms, and to secure data

This implementation guide discusses architectural considerations and configuration steps for deployingthe AWS Streaming Data Solution for Amazon Kinesis in the Amazon Web Services (AWS) Cloud. Itincludes links to AWS CloudFormation templates that launches and configures the AWS services requiredto deploy this solution using AWS best practices for security and availability.

The guide is intended for IT architects, developers, and DevOps professionals who want to get startedquickly with the core streaming services available in the AWS Cloud.

2

Page 7: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 1

CostYou are responsible for the cost of the AWS services used while running this solution. At the date ofpublication, the monthly cost for running this solution with either provided AWS CloudFormationtemplate options, while publishing 100 records per second in the US East (N. Virginia) Region, isdescribed in the following tables.

Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used inthis solution.

Option 1The Option 1 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-api-gateway-and-lambda AWS CloudFormation template that uses Amazon APIGateway, Amazon Kinesis Data Streams (with enhanced monitoring disabled), and AWS Lambda.

Table for Option 1: Cost estimate for running the solution using the CloudFormation template thatdeploys API Gateway, Kinesis Data Streams, and AWS Lambda

AWS service Dimensions Total cost (per month)

Amazon Kinesis Data Streams 1 shard

100 records (4 KB) / second

168 hours data retention

$10.95

$3.68

$14.60

Amazon API Gateway 2,678,400 requests / month (1/sec)

$10.50

Amazon Cognito 100 users / month

Advanced security featuresadded

$5.00

AWS Lambda 2,678,400 requests / month (1/sec)

$3.33

TOTAL: $48.06 / month

Option 2The Option 2 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kpl-and-kinesis-data-analytics AWS CloudFormation template that usesAmazon Elastic Compute Cloud (Amazon EC2), KPL, Kinesis Data Streams, Kinesis Data Analytics, andAmazon CloudWatch.

Table for Option 2: Cost estimate for running the solution using the CloudFormation template thatdeploys Amazon EC2, KPL, Kinesis Data Streams, Kinesis Data Analytics, and CloudWatch

3

Page 8: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 3

AWS service Dimensions Total cost (per month)

Amazon Kinesis Producer Library

(KPL)

EC2 instance (t3.small)

730 hours / month

$15.18

Kinesis Data Streams 1 shard

100 records (4 KB) / second

168 hours data retention

$10.95

$3.68

$14.60

Kinesis Data Analytics 1 processing unit

50 GB running applicationstorage

$80.30

$5.00

Amazon S3 1 GB storage (Amazon S3Standard)

$0.02

TOTAL: $129.73 / month

Option 3The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3 AWS CloudFormation template thatuses Amazon Kinesis Data Streams (with enhanced monitoring disabled), Amazon Kinesis Data Firehose(with data transformation disabled), and Amazon S3.

Table for Option 3: Cost estimate for running the solution using the AWS CloudFormation templatethat deploys Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, and Amazon S3

AWS service Dimensions Total cost (per month)

Amazon Kinesis Data Streams 1 shard

100 records (4KB) / second

168 hours data retention

$10.95

$3.68

$14.60

Amazon Kinesis Data Firehose 100 records (4 KB) / second $36.34

Amazon S3 1 GB storage (Amazon S3Standard)

$0.02

TOTAL: $65.59 / month

Option 4The Option 4 table provides a cost estimate to deploy the AWS CloudFormation template that usesAmazon Kinesis Data Streams, Kinesis Data Analytics, and API Gateway.

Table for Option 4: Cost estimate for running the solution using the AWS CloudFormation templatethat deploys Amazon Kinesis Data Streams, Kinesis Data Analytics, and API Gateway

4

Page 9: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 4

AWS service Dimensions Total cost (per month)

Amazon Kinesis Replay EC2 instance (t3.small)

730 hours / month

$15.18

Amazon Kinesis Data Streams 1 shard

100 records (4KB) / second

168 hours data retention

$10.95

$3.68

$14.60

Amazon Kinesis Data Analytics 1 processing unit

50 GB running applicationstorage

$80.30

$5.00

Amazon API Gateway 2,678,400 requests /month (1 /second)

$10.50

TOTAL: $140.21 / month

5

Page 10: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 1

Architecture overviewThis solution automatically configures the core AWS services necessary to capture, store, process,and deliver streaming data. Four reference architecture options are available. Each option includescustomizable demo code and sample applications.

All AWS CloudFormation resources were created using AWS Solutions Constructs.

Option 1Deploying the aws-streaming-data-solution-for-kinesis-using-api-gateway-and-lambdaAWS CloudFormation template builds the following environment in the AWS Cloud.

Figure 1: AWS CloudFormation template using Amazon API Gateway,Kinesis Data Streams, and AWS Lambda reference architecture

This AWS CloudFormation template deploys a reference architecture that includes the following:

1. An API Gateway REST API that acts as a proxy to Amazon Kinesis Data Streams, adding either anindividual data record or a list of data records.

2. An Amazon Cognito user pool is used to control who can invoke REST API methods.

3. Kinesis Data Streams to store the incoming streaming data.

4. An AWS Lambda function processes the records from the data stream.

5. Errors and failed records that occur during AWS Lambda processing are annotated, and the events arestored in Amazon Simple Queue Service (Amazon SQS).

6. An Amazon SQS queue stores metadata for failed batch records and Lambda errors, allowingcustomers to retrieve these records and determine the next steps to resolve them.

NoteFor information about how AWS services handle errors, refer to Error handling in the AWSLambda Developer Guide and AWS Lambda Supports Failure-Handling Features.

6

Page 11: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 2

Option 2Deploying the aws-streaming-data-solution-for-kinesis-using-kpl-and-kinesis-data-analytics AWS CloudFormation template builds the following environment in the AWS Cloud.

Figure 2: AWS CloudFormation template using Amazon EC2, KPL, KinesisData Streams, Kinesis Data Analytics, and CloudWatch reference architecture

This AWS CloudFormation template deploys a reference architecture that includes the following:

1. An Amazon Elastic Compute Cloud (Amazon EC2) instance that uses the Amazon Kinesis ProducerLibrary (KPL) to generate data.

2. Kinesis Data Streams to store the incoming streaming data.

3. Kinesis Data Analytics processes the incoming records and saves the processed data in an AmazonSimple Storage Service (Amazon S3) bucket.

4. An Amazon CloudWatch dashboard monitors application health, progress, resource utilization, events,and errors. For information on essential metrics including recommended alarms, refer to UsingCloudWatch Alarms with Amazon Kinesis Data Analytics for Apache Flink in the Kinesis Data AnalyticsDeveloper Guide.

NoteYou can configure the solution to use an existing Amazon Virtual Private Cloud (Amazon VPC)to allow access to private resources (for example, private databases). By default, Kinesis DataAnalytics does not launch into your VPC, but you can enable your applications to access AmazonRelational Database Service (Amazon RDS) using this solution. For information about connectingto a virtual private cloud in your account, refer to Configuring Kinesis Data Analytics in theKinesis Data Analytics Developer Guide.

Option 3Deploying the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3 AWS CloudFormation template builds the following environment in the AWS Cloud.

7

Page 12: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 4

Figure 3: AWS CloudFormation template using Kinesis DataStreams, Kinesis Data Firehose, and S3 reference architecture

This AWS CloudFormation template deploys a reference architecture that includes the following:

1. Kinesis Data Streams stores the incoming streaming data.

2. Kinesis Data Firehose buffers the data before delivering the output to an Amazon S3 bucket. It is afully managed service that automatically scales to match the throughput of your data and requires noongoing administration.

3. An Amazon CloudWatch dashboard monitors the data ingestion and buffering. CloudWatch alarmsare set on essential metrics for Kinesis Data Firehose. For information on essential metrics (includingrecommended alarms), refer to Monitoring Kinesis Data Firehose Using CloudWatch Metrics in theKinesis Data Firehose Developer Guide.

Option 4Deploying the aws-streaming-data-solution-for-kinesis-using-kinesis-data-analytics-and-api-gateway AWS CloudFormation template builds the following environment in theAWS Cloud.

8

Page 13: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 4

Figure 4: AWS CloudFormation template using Kinesis Data Streams,Kinesis Data Analytics, and API Gateway reference architecture

This AWS CloudFormation template deploys a reference architecture similar to Option 2 and includes thefollowing:

1. An Amazon Elastic Compute Cloud (Amazon EC2) instance that uses the Amazon Kinesis ProducerLibrary (KPL) to generate data.

2. Kinesis Data Streams stores the incoming streaming data.3. Kinesis Data Analytics processes the incoming records and asynchronously invokes an external

endpoint.4. The demo application invokes a Lambda function.5. The external API can be any integration supported by API Gateway (for example, an Amazon

SageMaker endpoint).6. An Amazon CloudWatch dashboard monitors application health, progress, resource utilization, events,

and errors. For information on essential metrics including recommended alarms, refer to UsingCloudWatch Alarms with Amazon Kinesis Data Analytics for Apache Flink in the Amazon Kinesis DataAnalytics Developer Guide.

NoteYou can configure the solution to use an existing Amazon Virtual Private Cloud (Amazon VPC)allowing access to private resources (for example, private databases). By default, Kinesis DataAnalytics does not launch into your VPC, but you can enable your applications to access AmazonRelational Database Service (Amazon RDS) using this solution. For information about connectingto a virtual private cloud in your account, refer to Configuring Kinesis Data Analytics in theKinesis Data Analytics Developer Guide.

9

Page 14: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 1

Solution components

Component details for all templates.

Option 1Components for Option 1: API Gateway, Kinesis Data Streams, and AWS Lambda

Demo consumer applicationA consumer is an application that processes data from a Kinesis data stream. The aws-streaming-data-solution-for-kinesis-using-api-gateway-and-lambda template includes a democonsumer application, which is a Node.js function that logs the data being published. The customizablesource code is available from the solution’s GitHub repository and can be customized to your businessneeds.

CloudWatch dashboards and alertsThis template deploys an Amazon CloudWatch dashboard that monitors the health of the data stream.You can customize the dashboards and alerts using Amazon CloudWatch or the source code from thesolution's GitHub repository.

Figure 5: Kinesis Data Streams metrics on the CloudWatch dashboard

10

Page 15: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 2

Figure 6: AWS Lambda metrics on the CloudWatch dashboard

Option 2Components for Option 2: Amazon EC2, KPL, Kinesis Data Streams, and Kinesis Data Analytics

Demo producer applicationA producer is an application that puts user data records into a Kinesis data stream (also called dataingestion). The aws-streaming-data-solution-for-kinesis-using-kpl-and-kinesis-data-analytics AWS CloudFormation template includes a demo producer application, which is implementedusing the Amazon Kinesis Producer Library (KPL). The demo producer application is configured to write100 records per second to the data stream. The customizable source code is available from the solution’sGitHub repository. For information about the customizing the demo producer application, or replacing itwith your own application, refer to the KPL demo README.md file in the GitHub repository.

NoteThis demo producer application uses the same schema that is provided in Getting Started withAmazon Kinesis Data Analytics for Apache Fink (DataStream API) in the Amazon Kinesis DataAnalytics Developer Guide.

Demo consumer applicationA consumer is an application that processes data from a Kinesis data stream. This template includesa demo consumer application, which is a Java application for Amazon Kinesis Data Analytics. Thisapplication demonstrates how to use Apache Flink sources, sinks, and operators. The customizablesource code is available from the solution’s GitHub repository. For information about customizing thedemo consumer application, or replacing it with your own application, refer to the Kinesis Data Analyticsand Apache Flink demo README.md file in the GitHub repository.

For more examples on applications in Kinesis Data Analytics, refer to Kinesis Data Analytics for ApacheFlink Examples in the Kinesis Data Analytics Developer Guide.

CloudWatch dashboards and alertsThis template deploys an Amazon CloudWatch dashboard that monitors the health of the data stream.You can also view Kinesis Data Analytics Log Insights statistics.

You can customize the dashboards and alerts using Amazon CloudWatch or the source code from thesolution’s GitHub repository.

Figure 7: Application Health on the CloudWatch dashboard

11

Page 16: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

CloudWatch dashboards and alerts

Figure 8: Resource Utilization on the CloudWatch dashboard

Figure 9: Fink Application Progress on the CloudWatch dashboard

Figure 10: Kinesis Source Metrics on the CloudWatch dashboard

12

Page 17: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 3

Option 3Components for Option 3: Kinesis Data Streams, Kinesis Data Firehose, and S3

CloudWatch dashboards and alertsThis template deploys an Amazon CloudWatch dashboard to monitor the health of the data ingestionand buffering. You can customize the dashboards and alerts using Amazon CloudWatch or the sourcecode from the solution’s GitHub repository.

Figure 11: Kinesis Data Firehose metrics on the Amazon CloudWatch dashboard

13

Page 18: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 4

Figure 12: Logs Insights on the CloudWatch dashboard

Option 4Components for Option 4: Amazon Kinesis Data Streams, Kinesis Data Analytics, and API Gateway

Demo producer applicationThe aws-streaming-data-solution-for-kinesis-using-kinesis-data-analytics-and-api-gateway AWS CloudFormation template includes a demo producer application, that isimplemented using the Amazon Kinesis Replay project. By default, the application replays an historicdataset of New York City taxi trips derived from the public data set available from the Registry of OpenData on AWS.

14

Page 19: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Demo consumer application

Demo consumer applicationThis template also includes a demo consumer application, which is a Java application for AmazonKinesis Data Analytics. This application demonstrates how to invoke an external API in your streamingapplication.

CloudWatch dashboards and alertsThis solution deploys an Amazon CloudWatch dashboard to monitor the health, progress, resourceutilization, and specific events and errors of the data streaming activities. You can also view LogInsights statistics on the CloudWatch dashboard, which provides information on Kinesis Data Analyticsapplication logs.

This dashboard visualizations for this template are identical to the aws-streaming-data-solution-for-kinesis-using-kpl-and-kinesis-data-analytics template (Option 2).

Custom resourcesThe AWS CloudFormation templates provided in this solution support enhanced monitoring for AmazonKinesis Data Streams. When enhanced monitoring is enabled, Kinesis Data Streams sends shard-leveldata to Amazon CloudWatch. Additional costs may apply.

The aws-streaming-data-solution-for-kinesis-using-kpl-and-kinesis-data-analyticsand aws-streaming-data-solution-for-kinesis-using-kinesis-data-analytics-and-api-gateway AWS CloudFormation templates use the VPC configuration capability in Kinesis DataAnalytics.

These features are not currently supported in Amazon Kinesis services, therefore, the solution providesAWS Lambda functions that implement custom resources that enable these features.

15

Page 20: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

AWS CloudFormation templatesThis solution uses AWS CloudFormation to automate the deployment of the AWS Streaming DataSolution for Amazon Amazon Kinesis in the AWS Cloud. It includes the following CloudFormationtemplates, which you can download before deployment and customize to meet your needs:

Option 1: aws-streaming-data-solution-for-kinesis-using-api-gateway-and-lambda.template: Use this template to launch this solution using Amazon API Gateway, Amazon KinesisData Streams, and AWS Lambda. The default configuration deploys API Gateway REST APIs, KinesisVideo Streams, AWS Lambda functions, and an Amazon Simple Queue Service queue. You can alsocustomize the template based on your specific needs.

Option 2: aws-streaming-data-solution-for-kinesis-using-kpl-and-kinesis-data-analytics.template: Use this template to launch this solution using Kinesis Producer Library, AmazonElastic Compute Cloud (Amazon EC2), and Amazon Kinesis Data Analytics. The default configurationdeploys an Amazon EC2 instance, Kinesis Data Streams, Kinesis Data Analytics, Amazon CloudWatch, andan Amazon Simple Storage Service bucket. You can also customize the template based on your specificneeds.

Option 3: aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3.template: Use this template to launch the solution using Kinesis Data Streams, KinesisData Firehose, and S3. You can also customize the template for your specific needs.

Option 4: aws-streaming-data-solution-for-kinesis-using-kinesis-data-analytics-and-api-gateway.template: Use this template to launch the solution using Kinesis Data Streams, KinesisData Analytics, and API Gateway. You can also customize the template for your specific needs.

16

Page 21: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Prerequisites

Automated deployment

PrerequisitesChoose one of the following AWS CloudFormation templates to deploy, then follow the step-by-stepinstructions for your selected template:

• Option 1: Deploy the aws-streaming-data-solution-for-kinesis-using-api-gateway-and-lambda.template AWS CloudFormation template using Amazon API Gateway, Amazon KinesisData Streams, and AWS Lambda

• Option 2: Deploy the aws-streaming-data-solution-for-kinesis-using-kpl-and-kinesis-data-analytics.template AWS CloudFormation template using Amazon ElasticCompute Cloud (Amazon EC2), Amazon Kinesis Producer Library (KPL), Kinesis Data Streams, AmazonKinesis Data Analytics, and Amazon CloudWatch

• Option 3: Deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3.template AWS CloudFormation template using Kinesis Data Streams,Kinesis Data Firehose, and S3

• Option 4: Deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-analytics-and-api-gateway.template AWS CloudFormation template using Kinesis DataStreams, Kinesis Data Analytics, and Amazon API Gateway

Option 1Option 1: Deploy the aws-streaming-data-solution-for-kinesis-using-api-gateway-and-lambdaCloudFormation template

Before you launch this template, review the architecture and other considerations discussed in thisguide. Follow the step-by-step instructions in this section to configure and deploy the solution into youraccount.

Time to deploy: Approximately five minutes

Launch the StackNoteYou are responsible for the cost of the AWS services used while running this solution. Refer tothe Cost section for more details. For full details, refer to the pricing webpage for each AWSservice used in this solution.

1. Sign in to the AWS Management Console and use the button below to launch the aws-streaming-data-solution-for-kinesis-using-api-gateway-and-lambda AWS CloudFormationtemplate.

Alternatively, you can download the template as a starting point for your own implementation.2. The template launches in the US East (N. Virginia) Region by default. To launch this solution in a

different AWS Region, use the Region selector in the console navigation bar.

17

Page 22: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Launch the Stack

3. On the Create stack page, verify that the correct template URL shows in the Amazon S3 URL textbox and choose Next.

4. On the Specify stack details page, assign a name to your solution stack. For informationabout naming character limitations, refer to IAM and STS Limits in the AWS Identity and AccessManagement User Guide.

5. Under Parameters, review the parameters for the template and modify them as necessary. Thissolution uses the following default values.

Parameter Default Description

Amazon API Gateway configuration

Steady-state requests persecond

(ThrottlingRateLimit)

100 The number of steady-staterequests per second that theproxy API Gateway permits.The allowed range is from 1 to10000.

Burst requests per second

(ThrottlingBurstLimit)

50 The number of burst requestsper second that the proxy APIGateway permits. The allowedrange is from 0 to 5000.

Amazon Kinesis Data Streams configuration

Number of open shards

(ShardCount)

2 The number of shards that thestream uses. The allowed rangeis from 1 to 100 shards.

Data retention period (hours)

(RetentionHours)

24 The number of hours thatdata records stored in shardswill remain accessible. Theallowed range is from 24 to8760 hours.

Enable enhanced (shard-level)metrics

(EnableEnhancedMonitoring)

false Choose whether to enableenhanced monitoring forshard-level metrics. Thisfunction is disabled by default.

AWS Lambda consumer configuration

Largest number of recordsthat will be read from thestream at once

(BatchSize)

100 The maximum number ofrecords to retrieve in a singlebatch. The allowed range isfrom 1 to 10000.

Number of batches to processfrom each shard concurrently

(ParallelizationFactor)

1 The number of batches toprocess from each shardconcurrently. The allowedrange is from 1 to 10.

Maximum number of timesto retry when the functionreturns an error

(MaxRetryAttempts)

3 The maximum number ofretries when the AWS Lambdafunction returns an error.

18

Page 23: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 2

6. Choose Next.7. On the Configure stack options page, choose Next.8. On the Review page, review and confirm the settings. Check the box acknowledging that the

template will create AWS Identity and Access Management (IAM) resources.9. Choose Create stack to deploy the stack.

You can view the status of the stack in the AWS CloudFormation console in the Status column. Youshould receive a CREATE_COMPLETE status in approximately five minutes.

You can customize or replace the demo application that is included with this solution can be customizedor replaced to meet your business needs. The source code is available from the solution’s GitHubrepository. For information about the demo producer application and customizing the demo applicationor replacing it with your own application, refer to the README.md file in the GitHub repository.

NoteThis solution includes the solution-helper Lambda function, which runs only during initialconfiguration or when resources are updated or deleted. When you run this solution, you willnotice the Lambda functions in the AWS Management Console. While it may not appear active,do not delete the solution-helper function because it is necessary to manage associatedresources.

Option 2Option 2: Deploy the aws-streaming-data-solution-for-kinesis-using-kpl-and-kinesis-data-analyticsCloudFormation template

Before you launch this template, review the architecture and other considerations discussed in thisguide. Follow the step-by-step instructions in this section to configure and deploy the solution into youraccount.

Time to deploy: Approximately 10 minutes

Deployment overviewUse the following steps to deploy this solution on AWS. For detailed instructions, follow the links foreach step.

Step 1. Launch the Stack (p. 19)

• Launch the AWS CloudFormation template into your AWS account.• Review the other template parameters, and adjust if necessary.

Step 2. Post-configuration steps (p. 22)

Step 1. Launch the StackNoteYou are responsible for the cost of the AWS services used while running this solution. Refer tothe Cost section for more details. For full details, refer to the pricing webpage for each AWSservice used in this solution.

1. Sign in to the AWS Management Console and use the button below to launch the aws-streaming-data-solution-for-kinesis-using-kpl-and-kinesis-data-analytics AWSCloudFormation template.

19

Page 24: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Step 1. Launch the Stack

Alternatively, you can download the template as a starting point for your own implementation.

2. The template launches in the US East (N. Virginia) Region by default. To launch this solution in adifferent AWS Region, use the Region selector in the console navigation bar.

NoteThis template uses Amazon Kinesis Data Analytics, which is not currently available in allAWS Regions. You must launch this solution in an AWS Region where Kinesis Data Analyticsis available. For the most current availability by Region, refer to the AWS Service RegionTable.

3. On the Create stack page, verify that the correct template URL shows in the Amazon S3 URL textbox and choose Next.

4. On the Specify stack details page, assign a name to your solution stack. For informationabout naming character limitations, refer to IAM and STS Limits in the AWS Identity and AccessManagement User Guide.

5. Under Parameters, review the parameters for the template and modify them as necessary. Thissolution uses the following default values.

Parameter Default Description

Amazon Kinesis Producer Library (KPL) configuration

VPC where the KPL instanceshould be launched

(ProducerVpcId)

<Requires input> VPC where the KPL instance islaunched.

Subnet where the KPLinstance should be launched

(ProducerSubnetId)

<Requires input> Subnet where the KPL instanceis launched (the subnetrequires access to Kinesis DataStreams, either via IGW orNAT).

Amazon Machine Image forthe KPL instance

(ProducerAmId)

/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2

Amazon Machine Image (AMI)ID for the KPL instance.

Amazon Kinesis Data Streams configuration

Number of open shards

(ShardCount)

2 The number of shards that thestream uses. The allowed rangeis from 1 to 100 shards.

Data retention period (hours)

(RetentionHours)

24 The number of hours thatdata records stored in shardswill remain accessible. Theallowed range is from 24 to8760 hours.

Enable enhanced (shard-level)metrics

false Choose whether to enableenhanced monitoring for

20

Page 25: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Step 1. Launch the Stack

Parameter Default Description

(EnableEnhancedMonitoring) shard-level metrics. Thisfunction is disabled by default.

Amazon Kinesis Data Analytics configuration

Monitoring log level

(LogLevel)

INFO The level of detail of theCloudWatch Logs for anapplication. The availableoptions include DEBUG,ERROR, INFO, and WARN. Forinformation about choosing alog level, refer to ApplicationMonitoring Levels in theAmazon Kinesis Data AnalyticsDeveloper Guide.

Monitoring metrics level

(MetricsLevel)

TASK The granularity of theCloudWatch Logs for anapplication. The availableoptions include APPLICATION,OPERATOR, PARALLELISM,and TASK. For informationabout choosing a metrics level,refer to Application MonitoringLevels in the Amazon KinesisData Analytics Developer Guide.

Enable service-triggeredsnapshots

(EnableSnapshots)

true Choose whether snapshots areenabled for an application.

Enable automatic scaling

(EnableAutoScaling)

true Choose whether the KinesisData Analytics service canincrease the parallelism ofthe application in response toincreased throughput.

Comma-separated listof subnet ids for VPCconnectivity

(ApplicationSubnetIds)

<Optional input> If subnet IDs are provided, thensecurity groups must also beincluded.

Comma-separated list ofsecurity groups ids for VPCconnectivity

(ApplicationSecurityGroupIds)

<Optional input> If security group IDs areprovided, then subnets mustalso be included.

6. Choose Next.

7. On the Configure stack options page, choose Next.

8. On the Review page, review and confirm the settings. Check the box acknowledging that thetemplate will create AWS Identity and Access Management (IAM) resources.

9. Choose Create stack to deploy the stack.

21

Page 26: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Step 2. Post-configuration steps

You can view the status of the stack in the AWS CloudFormation console in the Status column. Youshould receive a CREATE_COMPLETE status in approximately 10 minutes.

NoteThis solution includes the solution-helper Lambda function, which runs only during initialconfiguration or when resources are updated or deleted. When you run this solution, you willnotice the Lambda functions in the AWS console. While it may not appear active, do not deletethe solution-helper function because it is necessary to manage associated resources.

Step 2. Post-configuration stepsBy default, the demo producer and demo consumer applications will not run after the stacks are created.Follow these steps to enable them.

1. Sign in to the Amazon Kinesis console and, from the left menu pane, select Analytics applications.2. On the Amazon Kinesis Data Analytics page, select Kda<application-name>.3. Choose Actions then choose Run application.4. Navigate to the AWS Systems Manager console and, from the left menu pane under Instances and

Nodes, select Session Manager.5. On the AWS Systems Manager page, choose Start session.6. On the Start a session page, select the ec2-instance-id> for the KPL instance and choose Start

session.

Refer to the AWS CloudFormation Outputs tab for the Amazon EC2 instance ID.7. In the console window, run the following command to start the demo producer application. (Replace

<stream-name>, <aws-region>, and <seconds> with your specific information).

sudo java -jar /tmp/aws-kpl-demo.jar <stream-name> <aws-region> <seconds>

You can customize or replace the demo application that is included with this solution can be customizedor replaced to meet your business needs. The source code is available from the solution’s GitHubrepository. For information about the demo producer application and customizing the demo applicationor replacing it with your own application, refer to the README.md file in the GitHub repository.

Option 3Option 3: Deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3 CloudFormation template

Before you launch this template, review the architecture and other considerations discussed in thisguide. Follow the step-by-step instructions in this section to configure and deploy the solution into youraccount.

Time to deploy: Approximately five minutes

Launch the StackNoteYou are responsible for the cost of the AWS services used while running this solution. Refer tothe Cost section for more details. For full details, refer to the pricing webpage for each AWSservice used in this solution.

22

Page 27: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Launch the Stack

1. Sign in to the AWS Management Console and use the button below to launch the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3 AWSCloudFormation template.

Alternatively, you can download the template as a starting point for your own implementation.

2. The template launches in the US East (N. Virginia) Region by default. To launch this solution in adifferent AWS Region, use the Region selector in the console navigation bar.

3. On the Create stack page, verify that the correct template URL shows in the Amazon S3 URL textbox and choose Next.

4. On the Specify stack details page, assign a name to your solution stack. For informationabout naming character limitations, refer to IAM and STS Limits in the AWS Identity and AccessManagement User Guide.

5. Under Parameters, review the parameters for the template and modify them as necessary. Thissolution uses the following default values.

Parameter Default Description

Amazon Kinesis Data Streams configuration

Number of open shards

(ShardCount)

2 The number of shards that thestream uses. The allowed rangeis from 1 to 100 shards.

Data retention period

(RetentionHours)

24 The number of hours thatdata records stored in shardswill remain accessible. Theallowed range is from 24 to8760 hours.

Enable enhanced (shard-level)metrics

(EnableEnhancedMonitoring)

false Choose whether to enableenhanced monitoring forshard-level metrics. Thisfunction is disabled by default.

Amazon Kinesis Data Firehose configuration

Size of the buffer (in MBs)that incoming data is bufferedbefore delivery

(BufferingSize)

5 The size to buffer incomingdata before delivering to S3.The allowed range is from 1 to128.

Length of time (in seconds)that incoming data is bufferedbefore delivery

(BufferingInterval)

300 The amount of time tobuffer incoming data beforedelivering to S3. The allowedrange is from 60 to 900.

Compression format fordelivered data in Amazon S3

(CompressionFormat)

GZIP The format of data once it’sdelivered to S3. Allowed valuesare GZIP, HADOOP_SNAPPY,

23

Page 28: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Option 4

Parameter Default Description

Snappy, UNCOMPRESSED, andZIP.

6. Choose Next.7. On the Configure stack options page, choose Next.8. On the Review page, review and confirm the settings. Check the box acknowledging that the

template will create AWS Identity and Access Management (IAM) resources.9. Choose Create stack to deploy the stack.

You can view the status of the stack in the AWS CloudFormation console in the Status column. Youshould receive a CREATE_COMPLETE status in approximately five minutes.

NoteThis solution includes the solution-helper Lambda function, which runs only during initialconfiguration or when resources are updated or deleted. When you run this solution, you willnotice the Lambda functions in the AWS Management Console. While it may not appear active,do not delete the solution-helper function because it is necessary to manage associatedresources.

Option 4Option 4: Deploy aws-streaming-data-solution-for-kinesis-using-kinesis-data-analytics-and-api-gatewayCloudFormation template

Before you launch this template, review the architecture and other considerations discussed in thisguide. Follow the step-by-step instructions in this section to configure and deploy the solution into youraccount.

Time to deploy: Approximately 10 minutes

Deployment overviewUse the following steps to deploy this solution on AWS. For detailed instructions, follow the links foreach step.

Step 1. Launch the Stack (p. 24)

• Launch the AWS CloudFormation template into your AWS account.• Review the other template parameters, and adjust if needed.

Step 2. Post-configuration steps (p. 27)

Step 1. Launch the StackNoteYou are responsible for the cost of the AWS services used while running this solution. Refer tothe Cost section for more details. For full details, refer to the pricing webpage for each AWSservice used in this solution.

1. Sign in to the AWS Management Console and use the button below to launch the aws-streaming-data-solution-for-kinesis-using-kinesis-data-analytics-and-api-gateway AWSCloudFormation template.

24

Page 29: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Step 1. Launch the Stack

Alternatively, you can download the template as a starting point for your own implementation.

2. The template launches in the US East (N. Virginia) Region by default. To launch this solution in adifferent AWS Region, use the Region selector in the console navigation bar.

NoteThis template uses Amazon Kinesis Data Analytics, which is not currently available in allAWS Regions. You must launch this solution in an AWS Region where Kinesis Data Analyticsis available. For the most current availability by Region, refer to the AWS Service RegionTable.

3. On the Create stack page, verify that the correct template URL shows in the Amazon S3 URL textbox and choose Next.

4. On the Specify stack details page, assign a name to your solution stack. For informationabout naming character limitations, refer to IAM and STS Limits in the AWS Identity and AccessManagement User Guide.

5. Under Parameters, review the parameters for the template and modify them as necessary. Thissolution uses the following default values.

Parameter Default Description

Amazon Kinesis Replay configuration

VPC where the KPL instanceshould be launched

(ProducerVpcId)

<Requires input> VPC where the KPL instance islaunched.

Subnet where the KPLinstance should be launched

(ProducerSubnetId)

<Requires input> Subnet where the KPL instanceis launched (the subnetrequires access to Kinesis DataStreams, either via IGW orNAT).

Amazon Machine Image forthe KPL instance

(ProducerAmId)

/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2

Amazon Machine Image (AMI)ID for the KPL instance.

Amazon Kinesis Data Streams configuration

Number of open shards

(ShardCount)

2 The number of shards that thestream uses. The allowed rangeis from 1 to 100 shards.

Data retention period (hours)

(RetentionHours)

24 The number of hours thatdata records stored in shardswill remain accessible. Theallowed range is from 24 to8760 hours.

Enable enhanced (shard-level)metrics

false Choose whether to enableenhanced monitoring for

25

Page 30: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Step 1. Launch the Stack

Parameter Default Description

(EnableEnhancedMonitoring) shard-level metrics. Thisfunction is disabled by default.

Amazon Kinesis Data Analytics configuration

Monitoring log level

(LogLevel)

INFO The level of detail of theCloudWatch Logs for anapplication. The availableoptions include DEBUG,ERROR, INFO, and WARN. Forinformation about choosing alog level, refer to ApplicationMonitoring Levels in theAmazon Kinesis Data AnalyticsDeveloper Guide.

Monitoring metrics level

(MetricsLevel)

TASK The granularity of theCloudWatch Logs for anapplication. The availableoptions include APPLICATION,OPERATOR, PARALLELISM,and TASK. For informationabout choosing a metrics level,refer to Application MonitoringLevels in the Amazon KinesisData Analytics Developer Guide.

Enable service-triggeredsnapshots

(EnableSnapshots)

true Choose whether snapshots areenabled for an application.Options are true or false.

Enable automatic scaling

(EnableAutoScaling)

true Choose whether the KinesisData Analytics service canincrease the parallelism ofthe application in response toincreased throughput. Optionsare true or false.

Comma-separated listof subnet ids for VPCconnectivity

(ApplicationSubnetIds)

<Optional input> If subnet IDs are provided, thensecurity groups must also beincluded.

Comma-separated list ofsecurity groups ids for VPCconnectivity

(ApplicationSecurityGroupIds)

<Optional input> If security group IDs areprovided, then subnets mustalso be included.

6. Choose Next.

7. On the Configure stack options page, choose Next.

8. On the Review page, review and confirm the settings. Check the box acknowledging that thetemplate will create AWS Identity and Access Management (IAM) resources.

9. Choose Create stack to deploy the stack.

26

Page 31: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Step 2. Post-configuration steps

You can view the status of the stack in the AWS CloudFormation console in the Status column. Youshould receive a CREATE_COMPLETE status in approximately 10 minutes.

NoteThis solution includes the solution-helper Lambda function, which runs only during initialconfiguration or when resources are updated or deleted. When you run this solution, you willnotice the Lambda functions in the AWS Management Console. While it may not appear active,do not delete the solution-helper function because it is necessary to manage associatedresources.

Step 2. Post-configuration stepsBy default, the demo producer and demo consumer applications will not run after the stacks are created.Use the following process to enable them.

1. Sign in to the Amazon Kinesis console and, from the left menu pane, select Analytics applications.2. On the Amazon Kinesis Data Analytics page, select Kda <application-name>.3. Choose Actions then choose Run application.4. Navigate to the AWS Systems Manager console and, from the left menu pane under Instances and

Nodes, select Session Manager.5. On the AWS Systems Manager page, choose Start session.6. On the Start a session page, select the <ec2-instance-id> for the KPL instance and choose Start

session.

Refer to the AWS CloudFormation Outputs tab for the Amazon EC2 instance ID.7. In the console window, run the following command to start the demo producer application.

(Replace <stream-name>, <aws-region>, and <seconds> with your information.

sudo java -jar /tmp/amazon-kinesis-replay-0.1.0.jar -streamName <stream-name> -streamRegion <region> -noWatermark -objectPrefix artifacts/kinesis-analytics-taxi-consumer/taxi-trips-partitioned.json.lz4/dropoff_year=2018/

NoteThe demo application uses an updated schema of the taxi dataset. You must specify acustom objectPrefix when running the preceeding command.

27

Page 32: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

IAM roles

SecurityWhen you build systems on AWS infrastructure, security responsibilities are shared between you andAWS. This shared model can reduce your operational burden as AWS operates, manages, and controls thecomponents from the host operating system and virtualization layer down to the physical security of thefacilities in which the services operate. For more information about security on AWS, refer to AWS CloudSecurity.

IAM rolesAWS Identity and Access Management (IAM) roles enable customers to assign granular accesspolicies and permissions to services and users in the AWS Cloud. This solution creates IAM roles forcommunication between services. For more information, refer to Providing Access to an AWS Service inthe IAM User Guide.

Security groupsThis solution creates a security group for the Amazon Kinesis Producer Library (KPL) instance so that itcan communicate with the Amazon Kinesis endpoint. This security group does not allow any inboundtraffic, and the instance can only be accessed via AWS Systems Manager Session Manager.

AuditingEach AWS service included in this solution is integrated with AWS CloudTrail, which captures all API calls.For more details, refer to the following documentation.

• Logging Kinesis Data Analytics API Calls with AWS CloudTrail• Logging calls to Amazon API Gateway APIs with AWS CloudTrail• Logging AWS Lambda API calls with AWS CloudTrail• Logging Amazon Kinesis Data Streams API Calls with AWS CloudTrail• Logging Amazon Kinesis Data Firehose API Calls with AWS CloudTrail

28

Page 33: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Additional Resources• Amazon API Gateway• Amazon CloudWatch• Amazon Elastic Compute Cloud• Amazon Kinesis Data Analytics• Amazon Kinesis Data Streams

• Amazon Simple Storage Service• AWS CloudFormation• AWS Identity and Access Management• AWS Lambda• Amazon Simple Queue Service• Amazon Kinesis Data Firehose

AWS Documentation

Developing Producers Using the Amazon Kinesis Producer Library

Best practices for monitoring and data protection:

• Viewing Kinesis Data Analytics Metrics and Dimensions• Using CloudWatch Alarms with Amazon Kinesis Data Analytics for Apache Flink• Security in Amazon Kinesis Data Analytics• Security in Amazon Kinesis Data Streams• Using AWS Lambda with Amazon Kinesis• Monitoring Kinesis Data Firehose Using CloudWatch Metrics• Security in Amazon Kinesis Data Firehose

29

Page 34: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Kinesis Producer LibraryThis solution's CloudFormation resources are created from AWS CDK components, and the resourcescontain the template code that you can use to customize this solution. For more information aboutsetting up your project and customizing this solution, refer to the README.md file in GitHub.

The Amazon Kinesis Producer Library (KPL) is used in the AWS CloudFormation template using AmazonElastic Compute Cloud (Amazon EC2), Amazon Kinesis Data Analytics, and Amazon CloudWatch. The KPLis an easy-to-use, highly configurable library that helps you write to Amazon Kinesis Data Streams. It actsas an intermediary between the producer application code and the Kinesis Data Streams API actions. Italso integrates seamlessly with the Kinesis Client Library (KCL) and submits Amazon CloudWatch metricson your behalf to provide visibility into producer performance.

The sample producer application included in this solution uses the following configuration:

Parameter Value Description

MaxConnections 1 The maximum number ofopen connections to thebackend. HTTP requests aresent in parallel over multipleconnections. The allowed rangeis from 1 to 256.

RequestTimeout 60000 The maximum elapsed time(in milliseconds) from when anHTTP request initiates and whenall responses are received. Therequest times out if the elapsedtime is exceeded.

RecordMaxBufferedTime 2000 The maximum amount of buffertime (in milliseconds) that arecord is allowed to be bufferedbefore it is sent.

AggregationEnabled false Choose whether to enableaggregation. With aggregation,multiple user records are packedinto a single KinesisRecord. Ifdisabled, each user record is sentin its own KinesisRecord.

For more information about configuration parameters, refer to the sample properties file on the AmazonKinesis Producer GitHub repository.

30

Page 35: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation GuideUsing the AWS Management Console

Uninstall the solutionYou can uninstall the AWS Streaming Data Solution for Amazon Kinesis using the AWS ManagementConsole or the AWS Command Line Interface (AWS CLI). However, the Amazon Simple Storage Service(Amazon S3) bucket and Amazon CloudWatch Logs created by this solution must be manually deleted.

Using the AWS Management Console1. Sign in to the AWS CloudFormation console.2. On the Stacks page, select the solution stack.3. Choose Delete.

Using AWS Command Line InterfaceDetermine whether AWS Command Line Interface (AWS CLI) is available in your environment. Forinstallation instructions, see What Is the AWS Command Line Interface in the AWS CLI User Guide. Afterconfirming the AWS CLI is available, run the following command.

$ aws cloudformation delete-stack --stack-name <cloudformation-stack-name>

Replace <cloudformation-stack-name> with the name of your CloudFormation stack.

Deleting the Amazon S3 bucketsThis solution is configured to retain the Amazon S3 buckets if you choose to delete the AWSCloudFormation stack to prevent against accidental data loss. After uninstalling the solution, you canmanually delete the S3 buckets if you do not need to retain the data. Use the following procedure todelete the Amazon S3 buckets.

1. Sign in to the Amazon S3 console.2. Choose Buckets from the left navigation pane.3. Locate the <stack-name> S3 buckets.4. Select one of the S3 buckets and choose Delete.

Repeat the steps until you have deleted all the <stack-name> S3 buckets.

Alternatively, you can configure the AWS CloudFormation template to delete the Amazon S3 bucketsautomatically. Before deleting the stack, change the deletion behavior in the AWS CloudFormationDeletionPolicy attribute.

Deleting the CloudWatch LogsThis solution retains the CloudWatch Logs if you decide to delete the AWS CloudFormation stack toprevent against accidental data loss. After uninstalling the solution, you can manually delete the logs ifyou do not need to retain the data. Use the following procedure to delete the CloudWatch Logs.

31

Page 36: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation GuideDeleting the Amazon Cognito User Pool

1. Sign in to the Amazon CloudWatch console.2. Choose Log Groups from the left navigation pane.3. Locate the log groups created by the solution.4. Select one of the log groups.5. Choose Actions and then choose Delete.

Repeat the steps until you have deleted all the solution log groups.

Alternatively, you can configure the AWS CloudFormation template to delete the CloudWatch Logsautomatically. Before deleting the stack, change the deletion behavior in the AWS CloudFormationDeletionPolicy attribute.

Deleting the Amazon Cognito User PoolThis solution retains the Cognito user pool if you decide to delete the AWS CloudFormation stack toprevent against accidental data loss. After uninstalling the solution, you can manually delete the userpool if you do not need to retain the data by using the following procedure:

1. Sign in to the Amazon Cognito console.2. Choose Manage User Pools.3. Locate the user pool created by the solution.4. Choose Delete pool.

Alternatively, you can configure the AWS CloudFormation template to delete the CloudWatch Logsautomatically. Before deleting the stack, change the deletion behavior in the AWS CloudFormationDeletionPolicy attribute.

32

Page 37: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Collection of operational metricsThis solution includes an option to send anonymous operational metrics to AWS. We use this data tobetter understand how customers use this solution and related services and products. When enabled, thefollowing information is collected and sent to AWS:

• Solution ID: The AWS solution identifier• Unique ID (UUID): Randomly generated, unique identifier for each solution deployment• Timestamp: The UTC formatted timestamp of when the event occurred• Data:The Region where the stack launched, request type (whether the stack was created, updated,

or deleted), and details about the option chosen (for example, shard count, whether enhancedmonitoring was enabled, buffering size, etc.). For example:

{'Pattern': 'KdsKdfS3', 'RetentionHours': '24', 'CompressionFormat': 'GZIP', 'BufferingInterval': '300', 'ShardCount': '2', 'EnhancedMonitoring': 'false', 'Version': 'v1.2.0', 'BufferingSize': '5', 'Region': 'us-east-1', 'RequestType': 'Create'}

Note that AWS owns the data gathered through this survey. Data collection is subject to the AWS PrivacyPolicy. To opt out of this feature, modify the AWS CloudFormation template mapping section from:

"AnonymousData": { "SendAnonymousData" : { "Data" : "Yes" }},

to

"AnonymousData": { "SendAnonymousData" : { "Data" : "No" }},

33

Page 38: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Source codeYou can visit our GitHub repository to download the templates and scripts for this solution, and to shareyour customizations with others.

34

Page 39: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

ContributorsThe following individuals contributed to this document:

• Daniel Pinheiro

35

Page 40: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

Document revisionsDate Change

August 2020 Initial release

September 2020 Release version 1.1.0: Refactored the AWS CloudFormationtemplate to use an AWS Solutions Construct; updated command tostart the demo producer application in Step 2 (p. 22) of AutomatedDeployment; for more information about changes for v1.1.0, referto the CHANGELOG.md file in the Github repository.

October 2020 Release version 1.2.0. Added additional template options. For moreinformation about changes for v1.2.0, refer to the CHANGELOG.mdfile in the Github repository.

January 2021 Release version 1.3.0. Added support for Apache Flink 1.11.1.For more information about changes for v1.2.0, refer to theCHANGELOG.md file in the Github repository.

April 2021 Release version 1.4.0. Updated demo applications to usethe Node.js 14x runtime. For more information, refer to theCHANGELOG.md file file in the GitHub repository.

May 2021 Release version 1.4.1 – Added Support for Apache Kafka versions2.8.0 and 2.6.2. For more information, refer to the CHANGELOG.mdfile in the GitHub repository.

July 2021 Release version 1.5.0 – Updated authentication for Option 1 touse Amazon Cognito user pools. For more information, refer to theCHANGELOG.md file in the GitHub repository.

36

Page 41: AWS Streaming Data Solution for Amazon Kinesis ......$0.02 TOTAL: $129.73 / month Option 3 The Option 3 table provides a cost estimate to deploy the aws-streaming-data-solution-for-kinesis-using-kinesis-data-firehose-and-amazon-s3

AWS Streaming Data Solution forAmazon Kinesis Implementation Guide

NoticesCustomers are responsible for making their own independent assessment of the information in thisdocument. This document: (a) is for informational purposes only, (b) represents AWS current productofferings and practices, which are subject to change without notice, and (c) does not create anycommitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or servicesare provided “as is” without warranties, representations, or conditions of any kind, whether express orimplied. AWS responsibilities and liabilities to its customers are controlled by AWS agreements, and thisdocument is not part of, nor does it modify, any agreement between AWS and its customers.

AWS Streaming Data Solution for Amazon Kinesis is licensed under the terms of the of the ApacheLicense Version 2.0 available at The Apache Software Foundation.

37