Transcript
Page 1: Acunu and Hailo: a realtime analytics case study on Cassandra

@daiclegg @acunu

Hailo - a case study for Cassandra & Acunu

Dai Clegg, October 2013, JAX London

Page 2: Acunu and Hailo: a realtime analytics case study on Cassandra

What is Hailo?

‣ The world’s highest-rated taxi app – over 11,000 five-star reviews

‣ Over 500,000 registered passengers

‣ A Hailo hail is accepted around the world every 4 seconds

‣ Hailo operates in 15 cities on 3 continents, from Tokyo to Toronto, after nearly 2 years of operation

Page 3: Acunu and Hailo: a realtime analytics case study on Cassandra

The Adoption of Cassandra & Acunu at Hailo

‣ Launched on AWS

‣ Two PHP/MySQL web apps plus a Java backend

‣ Mostly built by a team of 3 or 4 backend engineers

‣ MySQL multi-master for resilience within a single availability zone

‣ Get/create/update entity

‣ Analytics

‣ Text search

Page 4: Acunu and Hailo: a realtime analytics case study on Cassandra

The Adoption of Cassandra & Acunu at Hailo

‣ A desire for greater resilience – “become a utility”

‣ Cassandra is designed for high availability

‣ Plans for international expansion around a single consumer app

‣ Cassandra is good at global replication

‣ Expected growth

‣ Cassandra scales linearly for both reads and writes

‣ Prior experience

‣ successful in-team experience with Cassandra

Page 5: Acunu and Hailo: a realtime analytics case study on Cassandra

The Adoption of Cassandra & Acunu at Hailo

‣ Replacement of key consumer app functionality,

‣ split PHP/MySQL web app into:

‣ a mixture of PHP/Java services

‣ backed by a Cassandra data store

‣ Launched into production in September 2012

‣ originally just powering North American expansion,

‣ gradually switching over Dublin and London

Page 6: Acunu and Hailo: a realtime analytics case study on Cassandra

The Adoption of Cassandra & Acunu at Hailo

‣ Further decompose functionality into Go/Java SOA

‣ Migrating:

‣ Entity databases to Cassandra

‣ Analytics to Acunu

‣ Search into Elasticsearch

Page 7: Acunu and Hailo: a realtime analytics case study on Cassandra

Cassandra

Page 8: Acunu and Hailo: a realtime analytics case study on Cassandra

“Cassandra just works” (Dom W, Senior Engineer, Hailo)

Page 9: Acunu and Hailo: a realtime analytics case study on Cassandra

Some Considerations for Data Modeling

‣ Do not read the entire entity, update one property and then write back a mutation containing every column

‣ Only mutate columns that have been set

‣ This avoids read-before-write race conditions

‣ Choose row key carefully, since this partitions the records

‣ Think about how many records you want in a single row

‣ Denormalise on write into many indexes/views
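The first three points above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the `Row` class, its per-column timestamps with newest-write-wins, and the helper names are hypothetical stand-ins for how a column store resolves writes, not Hailo's code or a Cassandra client API.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    """Hypothetical row: column name -> (value, write timestamp)."""
    columns: dict = field(default_factory=dict)

    def mutate(self, changes, timestamp):
        """Apply a mutation; for each column the newest timestamp wins."""
        for name, value in changes.items():
            current = self.columns.get(name)
            if current is None or timestamp >= current[1]:
                self.columns[name] = (value, timestamp)

    def read(self):
        """Return the current value of every column."""
        return {name: value for name, (value, _) in self.columns.items()}


def full_entity_update(row, snapshot, changes, timestamp):
    """Anti-pattern: write back every column of a previously read snapshot."""
    row.mutate({**snapshot, **changes}, timestamp)


def partial_update(row, changes, timestamp):
    """Preferred: the mutation carries only the columns that were set."""
    row.mutate(changes, timestamp)


def run(update_style):
    """Two clients read the same row, then each changes one property."""
    row = Row()
    row.mutate({"name": "Ann", "status": "free", "city": "London"}, timestamp=1)
    snapshot_a = row.read()  # client A reads the entity
    snapshot_b = row.read()  # client B reads it before A writes
    if update_style == "full":
        full_entity_update(row, snapshot_a, {"status": "busy"}, timestamp=2)
        full_entity_update(row, snapshot_b, {"city": "Dublin"}, timestamp=3)
    else:
        partial_update(row, {"status": "busy"}, timestamp=2)
        partial_update(row, {"city": "Dublin"}, timestamp=3)
    return row.read()


if __name__ == "__main__":
    print("full   :", run("full"))
    print("partial:", run("partial"))
```

With full-entity writes, client B's stale copy of `status` overwrites client A's change. With partial mutations, both changes survive because each write touches only its own column.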

Page 10: Acunu and Hailo: a realtime analytics case study on Cassandra

Some Considerations for Data Modeling

not obvious!

[Chart: average years of experience per team member, MySQL vs Cassandra]

Page 11: Acunu and Hailo: a realtime analytics case study on Cassandra

whoops!

Some Repercussions of Data Modeling

Page 12: Acunu and Hailo: a realtime analytics case study on Cassandra

Some Considerations for Application Development

[Diagram: people who can attempt to query MySQL; people who can attempt to query Cassandra]

Page 13: Acunu and Hailo: a realtime analytics case study on Cassandra

Some Considerations for Application Development

Page 14: Acunu and Hailo: a realtime analytics case study on Cassandra

Acunu Analytics

Page 15: Acunu and Hailo: a realtime analytics case study on Cassandra

Hailo needed to understand system performance/business SLAs

Acunu Analytics

‣ Raw Cassandra lacks analytic primitives

‣ e.g. COUNT, SUM, AVG, GROUP BY

‣ Acunu Analytics provides a real-time platform

‣ for pre-planned query templates

‣ It uses Cassandra as the store

‣ so it is highly available, resilient and globally distributed

‣ Integration is straightforward

Page 16: Acunu and Hailo: a realtime analytics case study on Cassandra

Acunu Analytics: technology

Real-time incremental cubing provides instant answers to Big Data questions

build cube from history

Page 17: Acunu and Hailo: a realtime analytics case study on Cassandra

Acunu Analytics: technology

Apache Cassandra is the repository

build cube from history

Apache Cassandra

Page 18: Acunu and Hailo: a realtime analytics case study on Cassandra

Acunu Analytics: an example

build cube from history

Define aggregate cubes:

CREATE CUBE APPROX TOP(keyword) WHERE browser, time GROUP BY time

New events update cubes

Rich instant queries over cubes:

SELECT TOP(keyword) FROM table WHERE browser = 'chrome' AND time BETWEEN.. GROUP BY d1, d2, ... JOIN ... HAVING .. ORDER BY ..

Drill down to raw events

Populate new cubes from historic data
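The idea of updating a cube as each event arrives, then answering queries from the pre-aggregated cells, can be sketched in a few lines. This is an illustrative model only, not Acunu's implementation or API: the `TopKeywordCube` class, the event field names, the hourly time bucket and the in-memory storage are all assumptions, and the counts here are exact rather than approximate.

```python
from collections import Counter, defaultdict

SECONDS_PER_HOUR = 3600


class TopKeywordCube:
    """Hypothetical cube: keyword counts kept per (browser, hour) cell."""

    def __init__(self):
        self.cells = defaultdict(Counter)

    def add_event(self, event):
        """Incremental update: each new event touches exactly one cell."""
        hour = event["time"] // SECONDS_PER_HOUR
        self.cells[(event["browser"], hour)][event["keyword"]] += 1

    def top(self, browser, start, end, n=1):
        """Top keywords for one browser, merging the hourly cells that
        overlap [start, end]. Raw events are never rescanned."""
        total = Counter()
        first = start // SECONDS_PER_HOUR
        last = end // SECONDS_PER_HOUR
        for hour in range(first, last + 1):
            total.update(self.cells.get((browser, hour), Counter()))
        return total.most_common(n)


def build_cube(historic_events):
    """Populate a new cube by replaying stored raw events."""
    cube = TopKeywordCube()
    for event in historic_events:
        cube.add_event(event)
    return cube


if __name__ == "__main__":
    events = [
        {"browser": "chrome", "time": 10, "keyword": "taxi"},
        {"browser": "chrome", "time": 20, "keyword": "taxi"},
        {"browser": "chrome", "time": 4000, "keyword": "cab"},
        {"browser": "firefox", "time": 30, "keyword": "cab"},
    ]
    cube = build_cube(events)
    print(cube.top("chrome", 0, 7199))
```

The query cost depends on the number of cells in the time range, not on the number of raw events, which is what makes pre-planned query templates fast to answer.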

Page 19: Acunu and Hailo: a realtime analytics case study on Cassandra

Overview of the workflow

Acunu Analytics: summary

define aggregation cubes with DDL or infer from self-service queries

define connector: either from library, toolkit or REST

define pre-processors: programmatic, Java or JavaScript; or AQL query

develop queries in AQL, query builder or self-service data explorer

invoke queries from within applications with JSON query API

populate new cubes from historic data

define event schema with DDL or infer from sample events

fill cube from history

define alerts to be raised on trigger conditions

Page 20: Acunu and Hailo: a realtime analytics case study on Cassandra

some sample screenshots

Acunu Analytics at Hailo

“drill-across” to see breakdown of data

and in-depth analysis

Page 21: Acunu and Hailo: a realtime analytics case study on Cassandra

use cases

Acunu Analytics at Hailo

‣ Infrastructure and Application monitoring

‣ Real-time A/B testing of app layout and incentives

‣ Real-time geo-view of supply/demand for drivers

‣ More in the pipeline

Page 22: Acunu and Hailo: a realtime analytics case study on Cassandra

Conclusions

Page 23: Acunu and Hailo: a realtime analytics case study on Cassandra

Conclusions

‣ Solid Cassandra design

‣ High availability characteristics

‣ Easy multi-data centre setup

‣ Simplicity of operation

‣ With Acunu

‣ SQL-like rich queries

‣ easier data modeling

Choosing the Platform

Page 24: Acunu and Hailo: a realtime analytics case study on Cassandra

Conclusions

‣ Have an advocate

‣ sell the dream

‣ Learn the fundamentals

‣ get the best out of Cassandra

‣ Invest in tools to make life easier

‣ Keep management in the loop

‣ explain the trade offs

Exploiting the platform

Page 25: Acunu and Hailo: a realtime analytics case study on Cassandra

Apache, Apache Cassandra, Cassandra and the eye logo are trademarks of the Apache Software Foundation.

Thank You.

