Introduction to Apache Accumulo

Upload: busbey

Post on 19-Jul-2015

Page 1: Introduction to Apache Accumulo

© 2015 Cloudera, Inc. licensed CC-BY-SA 2.0

How to use this presentation

• Covered topics: Accumulo architecture, operational maintenance, fault handling

• Intended Audience: Developers, supporters, PMs who are conversant in multi-component systems, e.g. those involved in web services.

• Presumes familiarity with RDBMS

• Expected running time: 40 - 60 minutes

• License: CC-BY-SA 2.0

• Please let me know if you find it useful and what it could use: [email protected]

Page 2: Introduction to Apache Accumulo

Introduction to Apache Accumulo

Scaling a web application made easier

Sean Busbey // Software Engineer

Page 3: Introduction to Apache Accumulo

Let’s talk about Apache Accumulo…

Page 4: Introduction to Apache Accumulo

But in the context of a specific use case

• I really like technology that solves a problem.

• Keep in mind that this won’t be exhaustive.

• YMMV; proofs of concept with metrics are better than slides.

Page 5: Introduction to Apache Accumulo

Who am I?

• Apache Accumulo PMC

• Apache HBase committer

• Software Engineer on Cloudera’s storage team

Page 6: Introduction to Apache Accumulo

That is to say, I work for a vendor and no longer have operational scale problems of my own.

Page 7: Introduction to Apache Accumulo

We’ll focus on an application that enables conversations centered on cute cats.

Page 8: Introduction to Apache Accumulo

Page 9: Introduction to Apache Accumulo

Simple sharing model built with privacy controls

• User defines a group that may see their posting

• User posts a picture to a given group

• Members of the group may write short messages

Page 10: Introduction to Apache Accumulo

Straightforward web architecture

Page 11: Introduction to Apache Accumulo

Relational Data Model

• User table: will map user names to identifiers used elsewhere.

• Group table: will track ownership and descriptive name.

• Group membership table: will allow users to add and remove members.

Page 12: Introduction to Apache Accumulo

Relational Data Model

• Topic table: tracks distribution group, owner, and topical image.

• Comment table: individual comments from users.

Page 13: Introduction to Apache Accumulo

First growth: robustness

Page 14: Introduction to Apache Accumulo

First growth: robustness

Page 15: Introduction to Apache Accumulo

Second growth: application scale out

Page 16: Introduction to Apache Accumulo

Scaling reads: what goes into this page?

Page 17: Introduction to Apache Accumulo

Database reads eventually become a bottleneck

Page 18: Introduction to Apache Accumulo

Scale by de-normalizing in favor of reads

Page 19: Introduction to Apache Accumulo

Change to writes - original

Page 20: Introduction to Apache Accumulo

Change to writes – de-normalized

Page 21: Introduction to Apache Accumulo

Generally known as the fan-out pattern.

Page 22: Introduction to Apache Accumulo

The trick is to not get crushed by the writes

• Each poster now does a write for each member of the group a post goes to.

• Removing access is now a much larger delete query.

• Most databases are geared toward few writes and many reads; are we screwed?
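
To make the write amplification concrete, here is a minimal sketch of the fan-out pattern using plain Python dictionaries in place of a database. The names `fan_out_post`, `remove_member`, `feeds`, and `groups` are hypothetical and exist only for illustration; the point is that one post becomes one write per group member, and revoking access becomes a delete over every fanned-out copy.

```python
from collections import defaultdict

def fan_out_post(feeds, groups, group_id, post):
    """Write one copy of `post` into the feed of every member of the group.

    Returns the number of writes performed, which grows with group size.
    """
    writes = 0
    for member in groups[group_id]:
        feeds[member].append((group_id, post))
        writes += 1
    return writes

def remove_member(feeds, groups, group_id, member):
    """Revoke access: delete every fanned-out copy held for that member."""
    groups[group_id].discard(member)
    before = len(feeds[member])
    feeds[member] = [(g, p) for g, p in feeds[member] if g != group_id]
    return before - len(feeds[member])

groups = {"cat-fans": {"alice", "bob", "carol"}}
feeds = defaultdict(list)
writes = fan_out_post(feeds, groups, "cat-fans", "Mr. Mean looks grumpy")
deleted = remove_member(feeds, groups, "cat-fans", "bob")
```

Reads become a single lookup of `feeds[user]`, which is the trade the de-normalized design is making.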

Page 23: Introduction to Apache Accumulo

Recall our access pattern

Page 24: Introduction to Apache Accumulo

Basically one of these consumer boxes.

Page 25: Introduction to Apache Accumulo

Lines up very well with sharding

• Divide the query space into n shards, e.g. by a hash of the user id.

• Store a copy of the table on each shard, holding only the user ids that hash to that shard.

• Reads and writes are spread across instances.
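
A minimal sketch of hash-based sharding, assuming each shard can be modeled as an in-memory dictionary. `shard_for` and `ShardedStore` are hypothetical names, and the choice of MD5 as the stable hash is arbitrary; any hash that gives the same answer on every application server would do.

```python
import hashlib

def shard_for(user_id, num_shards):
    """Map a user id to a shard index with a hash that is stable across processes."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

class ShardedStore:
    """Each shard holds the same 'table', but only for the user ids that hash to it."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def write(self, user_id, record):
        shard = self.shards[shard_for(user_id, len(self.shards))]
        shard.setdefault(user_id, []).append(record)

    def read(self, user_id):
        shard = self.shards[shard_for(user_id, len(self.shards))]
        return shard.get(user_id, [])

store = ShardedStore(4)
for user in ("alice", "bob", "carol"):
    store.write(user, f"feed entry for {user}")
```

Every read and write for a given user touches exactly one shard, which is why the access pattern on the previous slides lines up so well with this layout.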

Page 26: Introduction to Apache Accumulo

Database shards Layout

Page 27: Introduction to Apache Accumulo

What were the nice-to-haves for the RDBMS again?

• No longer leveraging the relational data model.

• Now running, backing up, and failing over as many database instances as there are shards.

• Robustness within each shard has to be managed.

• Sharding is essentially static; adding more resources with growth is still painful.

Page 28: Introduction to Apache Accumulo

Now we have some context for Accumulo.

Our goal is to end up with less operational overhead.

Page 29: Introduction to Apache Accumulo

“The Apache Accumulo™ sorted, distributed key/value store is a robust, scalable, high performance data storage and retrieval system.”

Accumulo PMC via https://accumulo.apache.org/

Page 30: Introduction to Apache Accumulo

Accumulo-based App Layout

Page 31: Introduction to Apache Accumulo

“The Apache Accumulo™ sorted, distributed key/value store is a robust, scalable, high performance data storage and retrieval system.”

Accumulo PMC via https://accumulo.apache.org/

Page 32: Introduction to Apache Accumulo

In Accumulo, you address cells rather than records

Key | Value

Page 33: Introduction to Apache Accumulo

Keys are multi-dimensional

Key (Row, Column, Time) | Value

Page 34: Introduction to Apache Accumulo

Keys are multi-dimensional

Key (Row, Column (Family, Qualifier, Visibility), Time) | Value

Page 35: Introduction to Apache Accumulo

Accumulo doesn’t assume a schema

• All key and value components, except the timestamp, are byte[]

• The application is responsible for serialization

• It is common to use different serialization for the values in different columns.

Page 36: Introduction to Apache Accumulo

Mapping records to cells

• Treat a row as a database record

  • Essentially each column is a record field

• Treat each cell as a database record

  • Need to uniquely identify each record

  • Useful if you generally need the whole row and not a subset of columns

  • Can then treat each row as a shard of database records.
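
The two mappings can be sketched with tuples standing in for keys. This is a model of the idea only, not the Accumulo client API: the function names, the `"field"` and `"record"` family names, and the use of JSON for the packed value are all assumptions made for the example.

```python
import json

def row_as_record(record_id, record):
    """One row per record: every field becomes its own cell under that row."""
    return {
        (record_id, "field", name): str(value).encode("utf-8")
        for name, value in record.items()
    }

def cell_as_record(shard_id, record_id, record):
    """One cell per record: the row acts as a shard and the column identifies the record."""
    key = (shard_id, "record", record_id)
    return {key: json.dumps(record, sort_keys=True).encode("utf-8")}

comment = {"author": "alice", "text": "so cute"}
per_field = row_as_record("comment-1", comment)
packed = cell_as_record("reader-42", "comment-1", comment)
```

The first mapping lets a reader fetch a subset of fields; the second always returns the whole record but keeps many records together under one row.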

Page 37: Introduction to Apache Accumulo

Let’s use a concrete example.

Page 38: Introduction to Apache Accumulo

Already know our reads are within a shard.

Page 39: Introduction to Apache Accumulo

Mapping our data into cells

Key:

• Row: reader id

• Column Family: discussion id

• Column Qualifier: comment order

• Visibility: group id

Value: author, image url, and comment

Page 40: Introduction to Apache Accumulo

We end up with something close to our original.

Page 41: Introduction to Apache Accumulo

Note the use of visibility

Page 42: Introduction to Apache Accumulo

Visibility enforcement

• At scan time, our application will pass in the groups for the current user.

• Accumulo will filter out any cells that don’t match those groups.

• Group removal is a simple update in the group management system again.
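
A simplified model of scan-time filtering, assuming each cell carries a single group id as its visibility label. Real Accumulo visibilities are boolean expressions evaluated against the authorizations the scanner is given, so treat this as a sketch of the behavior rather than of the API; `scan` and the sample cells are hypothetical.

```python
def scan(cells, authorizations):
    """Return only the cells whose visibility label is among the caller's authorizations."""
    return [
        (key, value)
        for key, visibility, value in cells
        if visibility in authorizations
    ]

# (key, visibility, value); the key is (reader id, discussion id, comment order).
cells = [
    (("reader1", "disc7", "0001"), "group-a", "alice: look at this cat"),
    (("reader1", "disc7", "0002"), "group-a", "bob: adorable"),
    (("reader1", "disc9", "0001"), "group-b", "carol: my cat is cuter"),
]

visible = scan(cells, {"group-a"})
after_removal = scan(cells, set())  # the user was removed from every group
```

Nothing is deleted when a user leaves a group: the stored cells stay as they are, and the user simply stops presenting that group at scan time.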

Page 43: Introduction to Apache Accumulo

Sparse column storage

• We are creating lots of columns: per discussion per group member.

• Accumulo only stores columns that exist in a given row.

Page 44: Introduction to Apache Accumulo

“The Apache Accumulo™ sorted, distributed key/value store is a robust, scalable, high performance data storage and retrieval system.”

Accumulo PMC via https://accumulo.apache.org/

Page 45: Introduction to Apache Accumulo

All cells sorted according to key

• Total ordering based on lex-sort of raw byte arrays of key components.

• Time is sorted most-recent-first

• Reads are done on a contiguous range of cells.
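
The ordering and the contiguous read can be sketched with Python `bytes`, which compare lexicographically. This is an illustrative model with hypothetical names; it leaves the visibility component out of the key for brevity and uses integer timestamps.

```python
import bisect

def sort_key(cell_key):
    """Row, family, and qualifier ascending by raw bytes; timestamp most-recent-first."""
    row, family, qualifier, timestamp = cell_key
    return (row, family, qualifier, -timestamp)

def range_scan(sorted_keys, start_row, end_row):
    """Return the contiguous slice of keys with start_row <= row < end_row."""
    rows = [key[0] for key in sorted_keys]
    lo = bisect.bisect_left(rows, start_row)
    hi = bisect.bisect_left(rows, end_row)
    return sorted_keys[lo:hi]

keys = [
    (b"reader2", b"disc1", b"0001", 5),
    (b"reader1", b"disc7", b"0001", 10),
    (b"reader1", b"disc7", b"0001", 20),
    (b"reader1", b"disc3", b"0002", 7),
]
ordered = sorted(keys, key=sort_key)
page = range_scan(ordered, b"reader1", b"reader2")
```

Because the data is already sorted, everything one reader needs sits in a single contiguous run, and the two versions of the same cell come back newest first.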

Page 46: Introduction to Apache Accumulo

When sorted, our data looks like this…

Page 47: Introduction to Apache Accumulo

And the scan for a page is roughly…

Page 48: Introduction to Apache Accumulo

Lexicoders

• Turning different kinds of data into sortable bytes is painful

• Accumulo ships implementations for several common Java types

• Also for e.g. reversing the sort order and building compound keys.
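
To show why this is painful and what such an encoder has to do, here is a hand-rolled sketch for signed 64-bit integers. It is not the code Accumulo ships; `encode_int`, `decode_int`, and `reverse` are hypothetical helpers that only demonstrate the technique of producing bytes whose lexicographic order matches the numeric order.

```python
import struct

SIGN_BIT = 1 << 63

def encode_int(value):
    """Encode a signed 64-bit int as 8 big-endian bytes that sort numerically.

    Adding 2**63 (flipping the sign bit) makes negatives sort before positives.
    """
    return struct.pack(">Q", (value + SIGN_BIT) & 0xFFFFFFFFFFFFFFFF)

def decode_int(data):
    """Invert encode_int."""
    return struct.unpack(">Q", data)[0] - SIGN_BIT

def reverse(encoded):
    """Invert every byte to reverse the sort order (valid for fixed-width input)."""
    return bytes(0xFF - b for b in encoded)

numbers = [5, -3, 0, 1000, -1000]
ascending = [decode_int(b) for b in sorted(encode_int(n) for n in numbers)]
descending = [
    decode_int(reverse(b)) for b in sorted(reverse(encode_int(n)) for n in numbers)
]
```

A naive encoding such as the decimal string fails here: `"10"` sorts before `"9"`, and a leading minus sign sorts in the wrong place relative to digits.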

Page 49: Introduction to Apache Accumulo

Inefficiencies in our data model

Key:

• Row: reader id

• Column Family: discussion id

• Column Qualifier: comment order

• Visibility: group id

Value: author, image url, and comment

Page 50: Introduction to Apache Accumulo

Two categories of data

Image cells:

• Row: reader id

• Column Family: discussion id

• Column Qualifier: image

• Visibility: group id

• Value: author, image url

Comment cells:

• Row: reader id

• Column Family: discussion id

• Column Qualifier: text

• Visibility: group id

• Value: author, comment

Page 51: Introduction to Apache Accumulo

And now our data looks like this

Page 52: Introduction to Apache Accumulo

And the scan for a page covers less data

Page 53: Introduction to Apache Accumulo

“The Apache Accumulo™ sorted, distributed key/value store is a robust, scalable, high performance data storage and retrieval system.”

Accumulo PMC via https://accumulo.apache.org/

Page 54: Introduction to Apache Accumulo

Our simplified diagram

Page 55: Introduction to Apache Accumulo

Slightly less simplified

Page 56: Introduction to Apache Accumulo

Back to the data model

Key (Row, Column (Family, Qualifier, Visibility), Time) | Value

Page 57: Introduction to Apache Accumulo

Back to the data model

Key (Row, Column (Family, Qualifier, Visibility), Time) | Value

Page 58: Introduction to Apache Accumulo

Rows are grouped into Tablets

• Tablet is defined by a start and end row

• All cells for a given row must be in the same Tablet.
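
A small sketch of how a row maps to a tablet, assuming a table's tablets are described by a sorted list of split points and that each tablet's end row is inclusive. `tablet_for` and the sample splits are hypothetical.

```python
import bisect

def tablet_for(split_points, row):
    """Return the index of the tablet whose row range contains `row`.

    Tablet i covers rows after split_points[i - 1] up to and including
    split_points[i]; the last tablet has no upper bound.
    """
    return bisect.bisect_left(split_points, row)

splits = [b"g", b"p"]  # three tablets: (-inf, g], (g, p], (p, +inf)
placements = {row: tablet_for(splits, row) for row in (b"alice", b"g", b"harry", b"zed")}
```

Since the lookup depends only on the row, every cell of a row lands in the same tablet no matter what its columns are.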

Page 59: Introduction to Apache Accumulo

Tablets are assigned to Tablet Servers

• At any given point in time, a Tablet is serviced by a single Tablet Server

Page 60: Introduction to Apache Accumulo

Slightly less simplified

Page 61: Introduction to Apache Accumulo

Tablets are assigned to Tablet Servers

• At any given point in time, a Tablet is serviced by a single Tablet Server

• That server is responsible for client reads and writes to all hosted Tablets

• Finding the proper server is handled by the Accumulo libraries

• Proper key design means I/O load gets spread across multiple machines

Page 62: Introduction to Apache Accumulo

“The Apache Accumulo™ sorted, distributed key/value store is a robust, scalable, high performance data storage and retrieval system.”

Accumulo PMC via https://accumulo.apache.org/

Page 63: Introduction to Apache Accumulo

Tablet assignment is not static

• Assignments tend toward a steady state

• But can move in the event of new resources or failure

Page 64: Introduction to Apache Accumulo

Remember our RDBMS scaling?

Page 65: Introduction to Apache Accumulo

New RDBMS shard

1. Provision hardware for service

2. Rewrite data under new sharding

3. Update application services

• Doing this without an outage is hard work (and well paid if you can get it)
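
Step 2 is where the pain comes from. As a rough illustration, assuming the hash-modulo sharding scheme sketched earlier in the deck's terms (the function names and the MD5 choice are hypothetical), going from 4 to 5 shards changes the home shard of most user ids, so most of the data has to be rewritten:

```python
import hashlib

def shard_for(user_id, num_shards):
    """Stable hash of the user id, reduced modulo the shard count."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def moved_fraction(user_ids, old_shards, new_shards):
    """Fraction of user ids whose shard changes when the shard count changes."""
    moved = sum(
        1 for user in user_ids
        if shard_for(user, old_shards) != shard_for(user, new_shards)
    )
    return moved / len(user_ids)

users = [f"user{i}" for i in range(10000)]
fraction = moved_fraction(users, 4, 5)
```

With a uniform hash, a user id keeps its shard only when its hash has the same remainder modulo 4 and modulo 5, which happens for about one id in five; the remaining four in five must move.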

Page 66: Introduction to Apache Accumulo

New Accumulo Tablet Server

1. Provision hardware for service

2. Add server to cluster

3. Tablets automatically migrate from busier nodes to new node

• No outage from client perspective.
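
Step 3 can be pictured with a toy balancer. This is not Accumulo's balancing logic; it assumes, purely for illustration, that load is measured by tablet count and that tablets move one at a time from the busiest server to the least busy one. `rebalance` and the server names are hypothetical.

```python
def rebalance(assignments):
    """Move tablets from the busiest to the least busy server until counts differ by at most one.

    `assignments` maps server name -> list of tablet names and is updated in place.
    Returns the list of (tablet, from_server, to_server) moves.
    """
    moves = []
    while True:
        busiest = max(assignments, key=lambda s: len(assignments[s]))
        idlest = min(assignments, key=lambda s: len(assignments[s]))
        if len(assignments[busiest]) - len(assignments[idlest]) <= 1:
            return moves
        tablet = assignments[busiest].pop()
        assignments[idlest].append(tablet)
        moves.append((tablet, busiest, idlest))

cluster = {
    "ts1": ["t1", "t2", "t3", "t4"],
    "ts2": ["t5", "t6", "t7", "t8"],
}
cluster["ts3"] = []  # a new, empty tablet server joins the cluster
moves = rebalance(cluster)
```

Only the tablets that move are affected, and only their assignment changes; no data is re-keyed, in contrast to the re-sharding case.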

Page 67: Introduction to Apache Accumulo

“The Apache Accumulo™ sorted, distributed key/value store is a robust, scalable, high performance data storage and retrieval system.”

Accumulo PMC via https://accumulo.apache.org/

Page 68: Introduction to Apache Accumulo

All distributed systems have communication failures

In the face of such a failure you can either

• remain available on remaining nodes to all clients

• provide a consistent view of updates to a subset of clients

Page 69: Introduction to Apache Accumulo

Now you know the basics of CAP

Remember that you can’t give up partition tolerance

Page 70: Introduction to Apache Accumulo

Remember our RDBMS robustness?

Page 71: Introduction to Apache Accumulo

Accumulo is a CP system

• Tablet Servers ensure that updates have been written to a distributed write-ahead log before acknowledging

• Tablet Server failures are automatically detected

• Newly assigned hosts for recovered Tablets then replay edits up to the last ack before serving new requests
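
The log-then-ack-then-replay sequence can be sketched in a few lines. This is a toy model: a Python list stands in for the distributed write-ahead log, and `ToyTabletServer` is a hypothetical class, not part of Accumulo.

```python
class ToyTabletServer:
    """Toy model of a server that logs every update before acknowledging it."""

    def __init__(self, wal):
        self.wal = wal      # stands in for the distributed write-ahead log
        self.memory = {}    # in-memory state, lost if the server dies

    def write(self, key, value):
        self.wal.append((key, value))  # make the edit durable first
        self.memory[key] = value
        return "ack"                   # only then acknowledge the client

    @classmethod
    def recover(cls, wal):
        """A newly assigned server replays the log before serving requests."""
        server = cls(wal)
        for key, value in wal:
            server.memory[key] = value
        return server

wal = []
original = ToyTabletServer(wal)
acks = [original.write("k1", "v1"), original.write("k1", "v2"), original.write("k2", "v3")]
del original  # simulate the server failing; only the log survives
replacement = ToyTabletServer.recover(wal)
```

Every write the client saw acknowledged is in the log, so the replacement server ends up with the same state the failed one had at its last ack.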

Page 72: Introduction to Apache Accumulo

Page 73: Introduction to Apache Accumulo

Client write

Page 74: Introduction to Apache Accumulo

Write goals

• Low latency ack

• Don’t lose acked writes in the face of node failure

Page 75: Introduction to Apache Accumulo

Client write (diagram, step 1)

Page 76: Introduction to Apache Accumulo

Client write (diagram, steps 1 to 2)

Page 77: Introduction to Apache Accumulo

Client write (diagram, steps 1 to 3)

Page 78: Introduction to Apache Accumulo

Page 79: Introduction to Apache Accumulo

Page 80: Introduction to Apache Accumulo

Page 81: Introduction to Apache Accumulo

Page 82: Introduction to Apache Accumulo

Page 83: Introduction to Apache Accumulo

Recovery timing

• Time to detection is tunable; detecting failures faster increases network load

• Size of the outstanding write-ahead logs that must be replayed

Page 84: Introduction to Apache Accumulo

Client write (diagram, steps 1 to 4)

Page 85: Introduction to Apache Accumulo

Accumulo-based App Layout

Page 86: Introduction to Apache Accumulo

What’s the catch?

Page 87: Introduction to Apache Accumulo

Gaps

• Still requires application updates to use API – no interactive SQL bindings*

• No Disaster Recovery – coming in next minor release

Page 88: Introduction to Apache Accumulo

Thank you.

Mr. Mean photo from mockup is © 2004 Flickr user aznewbeginning; cc-by-sa 2.0 https://flic.kr/p/4uzdRc