Reporting from the Trenches: Intuit & Cassandra


Upload: datastax

Post on 16-Apr-2017


TRANSCRIPT

Page 1: Reporting from the Trenches: Intuit & Cassandra

Reporting from the Trenches – How Intuit Uses Cassandra Effectively to Improve Customer Experiences

Rekha Joshi, Staff Engineer, Intuit, Inc.

Thank you for joining. We will begin shortly.

Page 2: Reporting from the Trenches: Intuit & Cassandra

Webinar Housekeeping

© 2015 DataStax, All Rights Reserved. 2

All attendees placed on mute

Input questions at any time using the online interface

Page 3: Reporting from the Trenches: Intuit & Cassandra

Speaker Bio

O’Reilly Certified Apache Cassandra Architect

Rekha Joshi, Staff Engineer at Intuit Inc.

Page 4: Reporting from the Trenches: Intuit & Cassandra

1 About Intuit

2 Use Case: Personalized A/B Testing

3 Database Requirements

4 Cassandra: Intuit NoSQL Standard

5 Using Cassandra Effectively

Page 5: Reporting from the Trenches: Intuit & Cassandra

Intuit On Mission


Page 6: Reporting from the Trenches: Intuit & Cassandra

Intuit Data Platforms

50M+ customers to handle

6+ petabytes of data

Manage all of the data

Complex compliance

Public and private cloud

45M+ Customers

Page 7: Reporting from the Trenches: Intuit & Cassandra

Use Case: Personalized A/B Testing


Opinion-vs-Opinion Wars

Huge Investment

Angry Customer

Experiment, experiment, experiment!

Let Data Be The Decision Maker!

No Personalized A/B Testing?

With Personalized A/B Testing!!

Page 8: Reporting from the Trenches: Intuit & Cassandra

Use Case: Personalized A/B Testing

To Continuously Improve User Experience, Data Is Better Than Guesswork!

Page 9: Reporting from the Trenches: Intuit & Cassandra

Personalized A/B Testing Platform


User Assignment

Personalization Service

Segmentation Filters and Sampling

Personalization Engine

Analytics

Set up and administration

Profile Store

User Actions

A/B Testing Service
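The slide names user assignment and sampling as platform components but does not say how they are implemented. A common approach, shown here only as a minimal sketch and not as Intuit's actual method, is deterministic hash bucketing: the same user always lands in the same variant without any stored state. The function name `assign_variant`, the `sample_pct` parameter, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
from typing import Optional, Sequence


def assign_variant(user_id: str, experiment: str,
                   variants: Sequence[str], sample_pct: int = 100) -> Optional[str]:
    """Deterministically map a user to a variant, or None if not sampled.

    Hypothetical helper: hashes (experiment, user) so that assignments are
    stable across calls and independent between experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()

    # Sampling: the first 8 bytes decide whether the user is in the experiment.
    sample_bucket = int.from_bytes(digest[:8], "big") % 100
    if sample_bucket >= sample_pct:
        return None

    # Assignment: the next 8 bytes pick the variant, independently of sampling.
    slot = int.from_bytes(digest[8:16], "big") % len(variants)
    return variants[slot]


if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, assign_variant(uid, "checkout-button", ["A", "B"], sample_pct=50))
```

Because the assignment is a pure function of the experiment name and user ID, it can be recomputed anywhere in the stack; a profile store is then only needed for segmentation attributes and recorded user actions.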

Page 10: Reporting from the Trenches: Intuit & Cassandra

Deployment

Deployment: Amazon Cloud, Jenkins, Coopr Chef, CloudFormation, ECS/Docker

Monitoring: CloudWatch, Splunk, Graphite, Grafana, Logstash, Prometheus, New Relic

Alerting: Sensu, New Relic Alerts, HipChat, PagerDuty

Page 11: Reporting from the Trenches: Intuit & Cassandra

Database Requirements

• High Data Security
• No Data Loss
• No Downtime
• Linear Scalability
• Tunable Consistency
• Performance Under Workloads

Page 12: Reporting from the Trenches: Intuit & Cassandra

All This Data!!!!!


Page 13: Reporting from the Trenches: Intuit & Cassandra

Can I Lift This Alone?


Page 14: Reporting from the Trenches: Intuit & Cassandra

Need for Speed


Page 15: Reporting from the Trenches: Intuit & Cassandra

Cassandra, Who?

Cassandra is a Java-based, linearly scalable, fault-tolerant, distributed, masterless NoSQL time-series database with best-in-class tunable performance.

Page 16: Reporting from the Trenches: Intuit & Cassandra

Cassandra: The Hybrid Kid has the Edge!

Dynamo (Amazon), influencing Cassandra: inherits data distribution

Bigtable (Google), influencing Cassandra: inherits data model

Cassandra: Masterless Architecture, Linear Scalability, Tunable Consistency/Performance

Application

Query Access Patterns

Page 17: Reporting from the Trenches: Intuit & Cassandra

Cassandra and DataStax Enterprise


Advanced Security

Integrated Analytics (Spark)

Advanced Tools

24/7 Support

Page 18: Reporting from the Trenches: Intuit & Cassandra

A Truly Successful Software

• Solves a Real Need
• Is a Building Block for Platforms
• Becomes Open Source
• Gets Commercial Backing
• Tools Ecosystem Builds Around It
• Establishes a Strong User Base
• Companies in Critical Domains Use It!!

Page 19: Reporting from the Trenches: Intuit & Cassandra

Database Options


Page 20: Reporting from the Trenches: Intuit & Cassandra

Intuit and Cassandra


Cassandra = Intuit Technology Standard of Choice for NoSQL Distributed Database

High Data Security

No Data Loss

No Downtime

Linear Scalability

Tunable Consistency

Performance Under Workloads

Other NoSQL variants

Page 21: Reporting from the Trenches: Intuit & Cassandra

Did You Use Cassandra Effectively?


Page 22: Reporting from the Trenches: Intuit & Cassandra

Garbage Collection Issue

New objects are created at a faster rate than they are garbage-collected, which can cause STOP-THE-WORLD GC pauses!

• Configure Heap size, MAX_HEAP_SIZE
• Set up GC logging CASSANDRA_HEAP_DIR
• Configure CMS GC/G1GC
• Automated Heap Dump
• Upgrade System

Cassandra is a Java-based, linearly scalable, fault-tolerant, distributed NoSQL time-series database.
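As a starting point for sizing MAX_HEAP_SIZE, it helps to know what Cassandra picks when the variable is left unset. The sketch below reproduces the rule described in the comments of the stock `cassandra-env.sh` of that era, max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8 GB)). Treat it as an approximation and check the script shipped with your own version; the function name `default_max_heap_mb` is made up for this example.

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Approximate the default max heap Cassandra computes when
    MAX_HEAP_SIZE is not set (rule taken from cassandra-env.sh comments)."""
    half_ram_capped = min(system_memory_mb // 2, 1024)      # 1/2 RAM, capped at 1 GB
    quarter_ram_capped = min(system_memory_mb // 4, 8192)   # 1/4 RAM, capped at 8 GB
    return max(half_ram_capped, quarter_ram_capped)


if __name__ == "__main__":
    for ram_mb in (2048, 16384, 65536):
        print(f"{ram_mb} MB RAM -> {default_max_heap_mb(ram_mb)} MB heap")
```

The 8 GB cap reflects the point of the slide: a larger heap gives the collector more to scan, so raising MAX_HEAP_SIZE without also revisiting the GC configuration can lengthen pauses instead of removing them.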

Page 23: Reporting from the Trenches: Intuit & Cassandra

Clock Issue

When you move setups or perform upgrades, ensure the NTP server is configured correctly.

Cassandra is a linearly scalable, fault-tolerant, distributed NoSQL time-series database.
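Clock synchronization matters because Cassandra reconciles conflicting writes to the same cell by timestamp (last write wins). The following is a simplified sketch of that rule, not Cassandra's code: the `Write` class and `resolve` function are illustrative names, and the real tie-breaking behavior for equal timestamps is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    timestamp_us: int  # timestamp attached by the writer, in microseconds


def resolve(a: Write, b: Write) -> Write:
    """Last-write-wins: keep the write carrying the higher timestamp.
    (Simplification: on a tie this sketch just returns b.)"""
    return a if a.timestamp_us > b.timestamp_us else b


if __name__ == "__main__":
    skew_us = 5_000_000  # hypothetical: one machine's clock runs 5 s fast

    # Real time 100 s: the write is stamped by the fast clock.
    stale = Write("old-value", 100_000_000 + skew_us)
    # Real time 102 s: a later write is stamped by a correct clock.
    fresh = Write("new-value", 102_000_000)

    # The later write loses because its timestamp is lower.
    print(resolve(stale, fresh).value)
```

With clocks in sync the newer value would win; with a few seconds of drift the older value silently survives, which is why the NTP setup deserves a check after every migration or upgrade.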

Page 24: Reporting from the Trenches: Intuit & Cassandra

Understand the Node Ring

Repeat after me: Cassandra is a Java-based, linearly scalable, fault-tolerant, distributed, masterless NoSQL time-series database with best-in-class tunable performance.

nodetool status

nodetool ring

nodetool info

nodetool cfstats

nodetool tpstats
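To read the output of `nodetool ring`, it helps to have a mental model of how keys map to nodes. The sketch below is a deliberately simplified token ring: one token per node, no virtual nodes, SHA-256 as a stand-in for Cassandra's default Murmur3 partitioner, and replica placement by walking clockwise in the style of SimpleStrategy. The class and function names are illustrative.

```python
import bisect
import hashlib
from typing import List, Sequence


def token(key: str) -> int:
    """Hash a key onto the ring (SHA-256 stand-in, not Murmur3)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class Ring:
    def __init__(self, nodes: Sequence[str]) -> None:
        # Each node owns the range ending at its own token.
        self._entries = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def replicas(self, key: str, rf: int) -> List[str]:
        """Return the nodes holding `key` for replication factor `rf`."""
        size = len(self._entries)
        # First node whose token is >= the key's token, wrapping past the end.
        start = bisect.bisect_left(self._tokens, token(key)) % size
        return [self._entries[(start + i) % size][1] for i in range(min(rf, size))]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c", "node-d"])
    print(ring.replicas("user:42", rf=3))
```

Any node can run this calculation, which is what makes the architecture masterless: no coordinator has to be asked where a key lives.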

Page 25: Reporting from the Trenches: Intuit & Cassandra

What If A Node Goes Down?

Replication

Consistency

nodetool repair

nodetool decommission

nodetool snapshot

Cassandra is a linearly scalable, fault-tolerant, distributed, masterless NoSQL time-series database.
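How replication and consistency interact when a node is down can be checked with simple arithmetic. The sketch below encodes two well-known rules: a quorum is a majority of the replicas, and reads are guaranteed to see the latest acknowledged write when the read and write acknowledgement counts together exceed the replication factor. The function names are chosen for this example.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas, as used by the QUORUM consistency level."""
    return replication_factor // 2 + 1


def is_strongly_consistent(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_acks + write_acks > replication_factor


def can_serve(replication_factor: int, required_acks: int, replicas_down: int) -> bool:
    """True when enough replicas remain alive to satisfy the consistency level."""
    return replication_factor - replicas_down >= required_acks


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print("quorum:", q)
    print("QUORUM reads + writes consistent:", is_strongly_consistent(rf, q, q))
    print("QUORUM survives one node down:", can_serve(rf, q, replicas_down=1))
    print("ALL survives one node down:", can_serve(rf, rf, replicas_down=1))
```

With a replication factor of 3, QUORUM keeps both availability and consistency through a single node failure, while ALL does not; the data that the failed node missed is brought back in line afterwards with `nodetool repair`.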

Page 26: Reporting from the Trenches: Intuit & Cassandra

Tuning The Application

Cassandra is a Java-based, linearly scalable, fault-tolerant, distributed, masterless NoSQL time-series database with best-in-class tunable performance.

Refactor data model

Revisit the usage access patterns

Paranoid Monitoring

Page 27: Reporting from the Trenches: Intuit & Cassandra

Tuning For Reads

• Caching Layer – Key Cache/Row Cache
• SSTable Compaction Frequency
• Reading from multiple SSTables is inefficient

Cassandra is a Java-based, linearly scalable, fault-tolerant, distributed NoSQL time-series database with best-in-class tunable performance.
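Why multiple SSTables slow reads down, and why compaction frequency matters, can be seen in a toy model. In the sketch below each SSTable is a plain dictionary mapping a key to a (timestamp, value) pair; a read must consult every table that contains the key and keep the newest version, and compaction merges the tables so later reads touch only one. Bloom filters, caches, and tombstones are left out, and all names are illustrative.

```python
from typing import Dict, List, Optional, Tuple

SSTable = Dict[str, Tuple[int, str]]  # key -> (timestamp, value)


def read(key: str, sstables: List[SSTable]) -> Tuple[Optional[str], int]:
    """Return (newest value, number of SSTables that had to be consulted)."""
    newest: Optional[Tuple[int, str]] = None
    touched = 0
    for table in sstables:
        if key in table:
            touched += 1
            if newest is None or table[key][0] > newest[0]:
                newest = table[key]
    return (newest[1] if newest else None), touched


def compact(sstables: List[SSTable]) -> List[SSTable]:
    """Merge all tables into one, keeping the newest version of each key."""
    merged: SSTable = {}
    for table in sstables:
        for key, versioned in table.items():
            if key not in merged or versioned[0] > merged[key][0]:
                merged[key] = versioned
    return [merged]


if __name__ == "__main__":
    tables = [{"k": (1, "v1")}, {"k": (2, "v2")}, {"k": (3, "v3"), "other": (1, "x")}]
    print(read("k", tables))
    print(read("k", compact(tables)))
```

The value returned is the same before and after compaction; only the amount of work per read changes, which is the trade-off behind tuning compaction for read-heavy tables.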

Page 28: Reporting from the Trenches: Intuit & Cassandra

Tuning For Writes

Cassandra is a Java-based, linearly scalable, fault-tolerant, distributed NoSQL time-series database with best-in-class tunable performance.

• Memtable – Fast Writes
• CommitLog – Separate Dedicated Disk
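The two bullets describe the two halves of the write path: an append to the commit log for durability and an in-memory update to the memtable for speed. The sketch below is a toy model of that path under stated simplifications: the commit log is an in-memory list, not a file, the flush threshold counts keys where Cassandra tracks memory use, and the class name `WritePathSketch` is invented for the example.

```python
from typing import Dict, List, Tuple


class WritePathSketch:
    def __init__(self, flush_threshold: int = 3) -> None:
        self.commit_log: List[Tuple[str, str]] = []   # append-only, sequential writes
        self.memtable: Dict[str, str] = {}            # in-memory, absorbs updates
        self.sstables: List[List[Tuple[str, str]]] = []  # immutable, sorted by key
        self.flush_threshold = flush_threshold

    def write(self, key: str, value: str) -> None:
        # 1. Append to the commit log so the write survives a crash.
        self.commit_log.append((key, value))
        # 2. Update the memtable; no disk seek is needed on this path.
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # Write the memtable out as a sorted, immutable SSTable.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        # Flushed data no longer needs the commit log for recovery.
        self.commit_log.clear()


if __name__ == "__main__":
    node = WritePathSketch(flush_threshold=3)
    for k, v in [("b", "2"), ("a", "1"), ("c", "3"), ("d", "4")]:
        node.write(k, v)
    print(node.sstables, node.memtable, node.commit_log)
```

Commit log appends are sequential while SSTable flushes and compactions generate their own I/O, so placing the commit log on a separate dedicated disk keeps the two workloads from competing for the same device.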

Page 29: Reporting from the Trenches: Intuit & Cassandra

Tuning the System

EXT4 Filesystem

System Memory, CPU, Disk

Paranoid Monitoring

Cassandra is a linearly scalable, fault-tolerant, distributed, masterless NoSQL time-series database.

Page 30: Reporting from the Trenches: Intuit & Cassandra

A Little-Discussed Aspect of the Pareto Principle!

Page 31: Reporting from the Trenches: Intuit & Cassandra

Heavy Lifting? Easy!


Page 32: Reporting from the Trenches: Intuit & Cassandra


Thank you!

Input questions at any time using the online interface

Q & A

https://www.linkedin.com/in/rekhajoshm

https://twitter.com/rekhajoshm