AWS re:Invent 2016: Fanatics: Deploying Scalable, Self-Service Business Intelligence on AWS (BDA207)

Post on 08-Jan-2017


© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Rahul Bhartia, Amazon Web Services, Principal Solutions Architect

Amit Jain, Fanatics, Sr. Manager - BI Platform and Reporting

December 1, 2016

Deploying Scalable, Self-Service Business Intelligence on AWS

featuring

What to Expect from the Session

• Learn about various Business Intelligence (BI) solutions on AWS

• Hear from Fanatics about their scalable & elastic BI stack on AWS

• What this session is not about

• Various BI solutions and their features

BI on AWS

Amazon QuickSight AWS Big Data Competency partners

Get Started Quickly

• Amazon QuickSight – Get started today!!!

• Managed on AWS

• Tableau Online, Microstrategy Cloud, ChartIO, WingArc1st

• AWS Marketplace

• Tableau Server, Tibco Jaspersoft, Microstrategy, Looker

BI workloads on AWS Cloud

Self-service Scalable


Making BI self-service on AWS

Get started today!

https://quicksight.aws/

Managed offerings

vs.

self-managed

Amazon QuickSight AWS Big Data Competency partners

Managing yourself

• Custom integration

• AWS CloudFormation

• Automate Tableau - https://github.com/tableau/server-install-script-samples

• AWS Marketplace 1-click

• Tibco Jaspersoft

• Looker


Scale – Infrastructure

• Scale-out

• Cluster or Distributed

• Scale-up

• Leverage bigger or more specific instances

• Scale-with

• Scale the underlying data-store

Tableau Online – Scaling on AWS

Tableau online on AWS

Scale – Data

Create an In-Memory aggregation - Extracts or Cubes or Cache

Leverage the underlying cluster - In-database, Live connection, Live Connect
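The two approaches above can be contrasted with a minimal sketch. It uses Python's built-in `sqlite3` module purely as a stand-in for the underlying data store; the `orders` table, its columns, and the function names are hypothetical and do not come from any of the BI products named in this session.

```python
import sqlite3

# Stand-in for the underlying data store (for example a warehouse).
# The table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (league TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("NFL", 3), ("MLB", 2), ("NFL", 5), ("NBA", 1)],
)


def build_extract(connection):
    """Approach 1: aggregate once and keep the result in memory."""
    rows = connection.execute(
        "SELECT league, SUM(units) FROM orders GROUP BY league"
    ).fetchall()
    return dict(rows)


def live_query(connection, league):
    """Approach 2: push each question down to the data store."""
    row = connection.execute(
        "SELECT COALESCE(SUM(units), 0) FROM orders WHERE league = ?",
        (league,),
    ).fetchone()
    return row[0]


extract = build_extract(conn)

# New data arrives after the extract was built.
conn.execute("INSERT INTO orders VALUES ('NFL', 4)")

# The extract answers from memory and reflects the time it was built;
# the live query is computed by the data store and sees the new row.
print("extract:", extract["NFL"])
print("live:", live_query(conn, "NFL"))
```

The trade-off this illustrates is a general one: an in-memory aggregate answers quickly but has to be refreshed, while a live connection is always current and relies on the underlying cluster for speed.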

Amazon QuickSight AWS Big Data Competency partners

Also, remember to

1. Leverage the right AWS Services

   1. Amazon RDS
   2. Amazon Redshift
   3. Amazon EMR
   4. Amazon S3

2. Leverage integrations with AWS Services

   1. Amazon QuickSight – Direct ingestion from Amazon S3
   2. Microstrategy – VLDB Properties for Amazon Redshift
   3. Looker & ChartIO – Amazon EMR (Spark-SQL/Presto)

Largest retailer of officially licensed sports merchandise

All Major US Leagues

If you are a sports fan, you’ve likely had a Fanatics Experience

26,000,000 Minutes of customer contact

250,000,000 Visitors across Fanatics’ platform of sites

31,000,000 Units shipped annually

6,000 Peak season employees (1,700 non-peak)

Major Scale, Advantage

$1B in sales through eCommerce and sport venues

Business Centric Technology Centric

Financials

Inventory

Customer

Support

Marketing

Experimentation

SITE SERVICES

Engineering

Hardware

Site Performance

Click Stream

Personalization

2016 - Data and Analytics everywhere


Infrastructure

Analytical Content

Developers

Scaling our BI environment

Current Fanatics Data Architecture

Data Integration: SSIS, Stone Branch, Spark, Qubole, PIG, Attunity

Data Platform

• Data Warehouse: FanHouse EDW (Redshift), 400 TB

• Relational Data, Legacy Storage: Football (SQL Server), 100 TB

• Unstructured Data, Pattern Detection, Deep Storage: Hadoop clusters, 500 TB

Data Access (Business Centric to Technology Centric)

• Analyze & Report, Discover & Explore

• MS Excel, Tableau, SOA/DAL, SQL, Custom Apps, SSRS, MicroStrategy, R

Evolution timeline on AWS

Timeline markers (in slide order): ’06, ’08, ’18, End ’14, ’15, Nov ’16, ’17

Stages and technologies shown along the timeline:

• Access DB & MS Excel Reports (3 MB)

• SQL Server, SSIS & SSRS (500 GB)

• Redshift, S3, Spark, Presto, Storm/Kafka/Scala

• Real Time Reporting, R Integration, Machine Learning

• Microstrategy, Tableau, Hadoop

Why we chose AWS

Scalability & Agility

Elasticity and Cost

Automation & Self-service

Availability & Disaster Recovery

Why we chose Microstrategy and Tableau

• Microstrategy: Enterprise Business Intelligence

• Tableau: Data Discovery and Prototyping

Our Journey with Microstrategy

• 02-2015: TECH ASSESSMENT, 10 LICENSES {T2.XLARGE (WIN / ACCESS MD)}

• 03-2015: IN PRODUCTION, DISTRIBUTION SERVICES REPORTS {T2.XLARGE, RDS}

• 06-2015: WEB USERS ALPHA {M4.4XLARGE (WIN), RDS}

• 07-2015: WEB USERS PRODUCTION {R3.4XLARGE (LINUX), M4.4XLARGE (WIN), RDS}

• 09-2016: 500 WEB USERS, 7 CUBES (AVG 100 MILLION ROWS), 3-10 SEC CUBE RESPONSE (AWS X1 INSTANCE)

2017 Goal: 1500+ Users

All adhoc users on Microstrategy

DELIVER FAST, GATHER FEEDBACK, IMPROVE

Took just 1 month to be in Production

Current Microstrategy Architecture

500 WEB USERS

AWS X1 INSTANCE*: 1 TB OF RAM, 8 TB OF SSD, CAN BE CLUSTERED

Our Journey with Tableau

• 02-2015: TECH ASSESSMENT, TABLEAU ONLINE (2 DESKTOP, 5 USERS)

• 03-2015: IN PRODUCTION, TABLEAU ONLINE (3 DESKTOP, 30 USERS)

• 06-2015: OWN TABLEAU ENVIRONMENT ON AWS (5 DESKTOP USERS, 75 WEB USERS)

• 02-2016: HARDWARE UPGRADE ON AWS (12 DESKTOP USERS, 125 WEB USERS)


Tableau Capacity Management

Regular Capacity (Single Server)

Peak Capacity (Distributed Workers)

Cost Control (CloudWatch and Tags)

MLB WORLD SERIES FINALS

DR TESTING

TABLEAU 10 UPGRADE TEST

AWS X1 INSTANCE


Self-service for the users

Web Based Command Manager and tabadmin

• Web-service based event triggering & control mechanism

• Triggers both Microstrategy and Tableau events

• No need for client installation

• Has offset (or delay) mechanism

• Saves significant resources and complexity for ETL and Database

http://bitechapi.fanatics.corp:8080/FanBiAutomation/WBCM?triggername=testAmitemail&cmd=tableau&cmddelay=0&projname =
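As an illustration of how such a trigger call might be composed, here is a minimal sketch that only builds the URL and does not send any request. The host and the parameter names (`triggername`, `cmd`, `cmddelay`, `projname`) are copied from the slide; the function name, the defaults, the non-negative check, and the decision to omit `projname` when no project is given are assumptions. The slide does not state the unit of `cmddelay` or which `cmd` values other than `tableau` are accepted.

```python
from urllib.parse import urlencode

# Base endpoint as shown on the slide (an internal Fanatics host).
BASE_URL = "http://bitechapi.fanatics.corp:8080/FanBiAutomation/WBCM"


def build_trigger_url(trigger_name, cmd, delay=0, project=None):
    """Compose a trigger URL from the query parameters shown on the slide.

    This helper is hypothetical; only the parameter names come from the source.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    params = {"triggername": trigger_name, "cmd": cmd, "cmddelay": delay}
    # The project value is cut off in the source, so it is optional here.
    if project is not None:
        params["projname"] = project
    return BASE_URL + "?" + urlencode(params)


url = build_trigger_url("testAmitemail", "tableau")
print(url)
```

Because the call is a plain HTTP GET with query parameters, an ETL job or scheduler could issue it without installing any BI client, which matches the "no need for client installation" point above.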

Microstrategy Systems Manager for Cluster Capacity

• Launch a New Governed AWS Instance

• Automatically Start I-server

• Add to existing Microstrategy Cluster


Run Hot/Cold Stand By Machines

• Disaster Recovery

• Redundant deployment in different Availability Zones

• Cold Stand By with a higher RPO/RTO

• Availability

• During Critical Business events/seasons

• Hot Stand By with instant failover capability

Best Practices for BI on AWS

• Automate

• Use CloudFormation Templates

• AMIs (and Maintain them)

• Distribute the workload

• Managed shared storage (EFS)

• Flexible infrastructure

• Microstrategy and Tableau

• Monitor your cost and budget

• CloudWatch Metrics and Tags
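The "Automate" and "Tags" practices above can be sketched as a small script that generates a CloudFormation template for one tagged EC2 instance launched from a maintained AMI. This is a minimal sketch and not the template Fanatics used: the function name, the logical resource name `BiServer`, the AMI placeholder, and the tag keys and values are all assumptions. Only the template keys (`AWSTemplateFormatVersion`, `Resources`, `AWS::EC2::Instance`, `ImageId`, `InstanceType`, `Tags`) are standard CloudFormation.

```python
import json


def bi_server_template(ami_id, instance_type, tags):
    """Return a minimal CloudFormation template for one tagged EC2 instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch: single BI server launched from a maintained AMI",
        "Resources": {
            "BiServer": {  # hypothetical logical resource name
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": ami_id,
                    "InstanceType": instance_type,
                    # Tags allow cost and usage to be grouped later.
                    "Tags": [
                        {"Key": key, "Value": value}
                        for key, value in sorted(tags.items())
                    ],
                },
            }
        },
    }


template = bi_server_template(
    ami_id="ami-EXAMPLE",  # placeholder, not a real AMI ID
    instance_type="m4.4xlarge",
    tags={"Team": "bi-platform", "Purpose": "peak-capacity"},  # hypothetical tags
)
print(json.dumps(template, indent=2))
```

Generating the template from code keeps the instance type and tags as parameters, so the same definition can describe both a regular single server and additional workers for peak periods.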


Some Business Use Cases Solved

• 3-10 second Data Exploration time for Business Users

• Real time Reporting (consume Elasticsearch web services)

• Distribute 100s of PDF Reports Daily from the same Metadata and Infrastructure, Run 1000s of Jobs per hour

• Site Data services based on the same Metadata

Custom BI Portal and Real time analytics using AWS

Cloud BI Vs On Premises BI (Get the best of both)

• Hybrid Ownership Model – Hardware on AWS / Owned Software

• Always buy “User Based Licenses” – Never CPU Core

• The BI Platform should be scalable and should have enough Automation APIs to mimic cloud functionality

• Experiment with a small number of user licenses to prototype (start with 1 or 2 user licenses)

• Try not to get locked in: if your vendor is only subscription based then you are locked in

Thank you!

Remember to complete your evaluations!

Related Sessions
