AWS re:Invent 2016: Fanatics: Deploying Scalable, Self-Service Business Intelligence on AWS (BDA207)
TRANSCRIPT
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Rahul Bhartia, Amazon Web Services, Principal Solutions Architect
Amit Jain, Fanatics, Sr. Manager - BI Platform and Reporting
December 1, 2016
Deploying Scalable, Self-Service Business Intelligence on AWS, featuring Fanatics
What to Expect from the Session
• Learn about various Business Intelligence (BI) solutions on AWS
• Hear from Fanatics about their scalable & elastic BI stack on AWS
• What this session is not about
• Various BI solutions and their features
BI on AWS
• Amazon QuickSight
• AWS Big Data Competency partners
Get Started Quickly
• Amazon QuickSight – Get started today!!!
• Managed on AWS
• Tableau Online, MicroStrategy Cloud, Chartio, WingArc1st
• AWS Marketplace
• Tableau Server, TIBCO Jaspersoft, MicroStrategy, Looker
BI workloads on AWS Cloud
Self-service Scalable
Making BI self-service on AWS
Get started today!
https://quicksight.aws/
Managed offerings vs. self-managed
Amazon QuickSight AWS Big Data Competency partners
Managing yourself
• Custom integration
• AWS CloudFormation
• Automate Tableau - https://github.com/tableau/server-install-script-samples
• AWS Marketplace 1-click
• TIBCO Jaspersoft
• Looker
Self-service Scalable
BI workloads on AWS Cloud
Scale – Infrastructure
• Scale-out
• Cluster or Distributed
• Scale-up
• Leverage bigger or more specific instances
• Scale-with
• Scale the underlying data-store
Tableau Online – Scaling on AWS
Tableau online on AWS
Scale – Data
• Create an in-memory aggregation - Extracts or Cubes or Cache
• Leverage the underlying cluster - In-database, Live connection, Live Connect
Amazon QuickSight AWS Big Data Competency partners
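The two data-scaling options above can be contrasted with a small sketch. It is only an illustration: SQLite stands in for the underlying data store, and the `sales` table, its columns, and every number in it are made up. It does not use any BI vendor's API.

```python
import sqlite3

# Hypothetical fact table; names and numbers are made up for illustration.
ROWS = [
    ("NFL", "jersey", 120.0),
    ("NFL", "hat", 25.0),
    ("MLB", "jersey", 110.0),
    ("MLB", "hat", 20.0),
    ("NFL", "jersey", 130.0),
]


def make_store():
    """Stand-in for the underlying data store (for example a warehouse cluster)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (league TEXT, product TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", ROWS)
    return conn


def build_extract(conn):
    """Option 1: aggregate once and keep the result in memory (extract/cube/cache)."""
    query = "SELECT league, SUM(amount) FROM sales GROUP BY league"
    return dict(conn.execute(query).fetchall())


def live_total(conn, league):
    """Option 2: push every question down to the data store (live connection)."""
    query = "SELECT SUM(amount) FROM sales WHERE league = ?"
    return conn.execute(query, (league,)).fetchone()[0]


conn = make_store()
extract = build_extract(conn)

# Both paths agree; they differ in where the work happens.
print(extract["NFL"], live_total(conn, "NFL"))
```

The in-memory result answers repeated questions without touching the store but goes stale until it is rebuilt, while the live path always reflects current data at the cost of a query per question.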
Also, remember to
1. Leverage the right AWS services
• Amazon RDS
• Amazon Redshift
• Amazon EMR
• Amazon S3
2. Leverage integrations with AWS services
• Amazon QuickSight – Direct ingestion from Amazon S3
• MicroStrategy – VLDB Properties for Amazon Redshift
• Looker & Chartio – Amazon EMR (Spark SQL/Presto)
Largest retailer of officially licensed sports merchandise
All Major US Leagues
If you are a sports fan, you’ve likely had a Fanatics Experience
Major Scale, Advantage
• 26,000,000 minutes of customer contact
• 250,000,000 visitors across Fanatics’ platform of sites
• 31,000,000 units shipped annually
• 6,000 peak season employees (1,700 non-peak)
• $1B in sales through eCommerce and sport venues
Business Centric Technology Centric
Financials
Inventory
Customer
Support
Marketing
Experimentation
SITE SERVICES
Engineering
Hardware
Site Performance
Click Stream
Personalization
2016 - Data and Analytics everywhere
Infrastructure
Analytical Content
Developers
Scaling our BI environment
Current Fanatics Data Architecture
• Data Integration: SSIS, Stonebranch, Spark, Qubole, Pig, Attunity
• Data Platform
• Data Warehouse: FanHouse EDW (Redshift), 400 TB
• Relational Data, Legacy Storage: Football (SQL Server), 100 TB
• Unstructured Data, Pattern Detection, Deep Storage: Hadoop clusters, 500 TB
• Data Access: SOA/DAL, SQL, Custom Apps, SSRS, MicroStrategy, MS Excel, Tableau, R
• Analyze & Report (Business Centric), Discover & Explore (Technology Centric)
Evolution timeline on AWS
Timeline axis: ’06, ’08, End ’14, ’15, Nov ’16, ’17, ’18
Milestones shown:
• Access DB & MS Excel Reports (3 MB)
• SQL Server, SSIS & SSRS (500 GB)
• Redshift, S3, Hadoop
• MicroStrategy, Tableau
• Spark, Presto
• Storm/Kafka/Scala, Real Time Reporting
• R Integration, Machine Learning
Why we chose AWS
Scalability & Agility
Elasticity and Cost
Automation & Self-service
Availability & Disaster Recovery
Why we chose
• MicroStrategy: Enterprise Business Intelligence
• Tableau: Data Discovery and Prototyping
Our Journey with MicroStrategy
• 02-2015: Tech assessment, 10 licenses {t2.xlarge (Win / Access MD)}
• 03-2015: In production, Distribution Services reports {t2.xlarge, RDS}
• 06-2015: Web users alpha {m4.4xlarge (Win), RDS}
• 07-2015: Web users production {r3.4xlarge (Linux), m4.4xlarge (Win), RDS}
• 09-2016: 500 web users, 7 cubes (avg. 100 million rows), 3-10 sec cube response (AWS X1 instance)
2017 Goal: 1500+ users, all ad hoc users on MicroStrategy
Deliver fast, gather feedback, improve
It took just 1 month to be in production
Current MicroStrategy Architecture
• 500 web users
• AWS X1 instance*: 1 TB of RAM, 8 TB of SSD
• Can be clustered
Our Journey with Tableau
• 02-2015: Tech assessment, Tableau Online (2 desktop, 5 users)
• 03-2015: In production, Tableau Online (3 desktop, 30 users)
• 06-2015: Own Tableau environment on AWS (5 desktop users, 75 web users)
• 02-2016: Hardware upgrade on AWS (12 desktop users, 125 web users)
Tableau Capacity Management
Regular Capacity (Single Server)
Peak Capacity (Distributed Workers)
Cost Control (CloudWatch and Tags)
MLB World Series Finals
DR Testing
Tableau 10 Upgrade Test
AWS X1 Instance
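The split between regular and peak capacity can be pictured as a calendar lookup. The sketch below is one assumed way to express it, not how Fanatics schedules capacity: the peak windows, their dates, and the mode names `single-server` and `distributed-workers` are all placeholders.

```python
from datetime import date

# Hypothetical peak windows; the dates are placeholders, not Fanatics data.
PEAK_WINDOWS = [
    (date(2016, 10, 25), date(2016, 11, 3)),
    (date(2016, 11, 24), date(2016, 12, 24)),
]


def capacity_mode(day, peak_windows=PEAK_WINDOWS):
    """Return the topology to run on a given day.

    Inside a peak window the cluster runs with distributed workers;
    on every other day a single server is assumed to be enough.
    """
    for start, end in peak_windows:
        if start <= day <= end:
            return "distributed-workers"
    return "single-server"


print(capacity_mode(date(2016, 10, 30)))
print(capacity_mode(date(2016, 8, 15)))
```

A scheduler could call a function like this once a day and start or stop the extra worker instances when the answer changes.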
Self-service for the users
Web Based Command Manager and tabadmin
• Web-service based event triggering & control mechanism
• Triggers both MicroStrategy and Tableau events
• No need for client installation
• Has offset (or delay) mechanism
• Saves significant resources and complexity for ETL and Database
http://bitechapi.fanatics.corp:8080/FanBiAutomation/WBCM?triggername=testAmitemail&cmd=tableau&cmddelay=0&projname =
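A caller such as an ETL job could assemble a trigger URL like the one above as follows. This is a sketch under assumptions: the host and the four parameter names are copied from the slide, but the function name `build_trigger_url`, the meaning of each parameter, and the unit of `cmddelay` are guesses. The host is internal to Fanatics, so the example only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Endpoint copied from the slide; it is an internal host and is not reachable
# from outside, so this sketch only constructs the URL.
BASE_URL = "http://bitechapi.fanatics.corp:8080/FanBiAutomation/WBCM"


def build_trigger_url(triggername, cmd, cmddelay=0, projname=""):
    """Build a trigger URL from the query parameters visible on the slide.

    What each parameter accepts is an assumption; cmddelay is taken to be
    the offset (delay) mentioned in the bullets, in unknown units.
    """
    if cmddelay < 0:
        raise ValueError("cmddelay must not be negative")
    params = {
        "triggername": triggername,
        "cmd": cmd,
        "cmddelay": cmddelay,
        "projname": projname,
    }
    # urlencode escapes values, so trigger names with spaces stay valid.
    return BASE_URL + "?" + urlencode(params)


url = build_trigger_url("testAmitemail", "tableau")
print(url)
```

Because the trigger is a plain HTTP request, the calling system needs no BI client installed, which matches the "no need for client installation" point.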
MicroStrategy Systems Manager for Cluster Capacity
• Launch a new governed AWS instance
• Automatically start I-server
• Add to existing MicroStrategy cluster
Run Hot/Cold Stand By Machines
• Disaster Recovery
• Redundant deployment in different Availability Zones
• Cold Stand By with a higher RPO/RTO
• Availability
• During Critical Business events/seasons
• Hot Stand By with instant failover capability
Best Practices for BI on AWS
• Automate
• Use CloudFormation Templates
• AMIs (and Maintain them)
• Distribute the workload
• Managed shared storage (EFS)
• Flexible infrastructure
• MicroStrategy and Tableau
• Monitor your cost and budget
• CloudWatch Metrics and Tags
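Cost monitoring by tag comes down to grouping usage by tag value and comparing each total with a budget. The sketch below shows that arithmetic only. The usage records, hourly rates, and budgets are invented, and in practice the figures would come from tagged billing or CloudWatch data rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical records: (tag value, instance-hours, hourly rate in USD).
USAGE = [
    ("tableau", 720, 0.50),
    ("microstrategy", 720, 2.00),
    ("tableau", 100, 1.00),
]

# Hypothetical monthly budgets per tag value, in USD.
BUDGETS = {"tableau": 500.0, "microstrategy": 1000.0}


def cost_by_tag(records):
    """Sum hours * rate for each tag value."""
    totals = defaultdict(float)
    for tag, hours, rate in records:
        totals[tag] += hours * rate
    return dict(totals)


def over_budget(totals, budgets):
    """Return the tags whose cost exceeds their budget, sorted by name.

    Tags without a budget entry are never reported.
    """
    return sorted(
        tag for tag, cost in totals.items()
        if tag in budgets and cost > budgets[tag]
    )


totals = cost_by_tag(USAGE)
print(totals)
print(over_budget(totals, BUDGETS))
```

Tagging every instance with the BI tool it serves is what makes this grouping possible.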
Some Business Use Cases Solved
• 3-10 second data exploration time for business users
• Real-time reporting (consuming Elasticsearch web services)
• Distribute 100s of PDF reports daily from the same metadata and infrastructure; run 1000s of jobs per hour
• Site data services based on the same metadata
Custom BI Portal and Real-time Analytics using AWS
Cloud BI vs. On-Premises BI (Get the best of both)
• Hybrid ownership model: hardware on AWS, owned software
• Always buy “user-based licenses”, never CPU core
• The BI platform should be scalable and should have enough automation APIs to mimic cloud functionality
• Experiment with a small number of user licenses to prototype (start with 1 or 2 user licenses)
• Try not to get locked in: if your vendor is only subscription-based, then you are locked in
Thank you!