Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]


Page 1: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Building Scalable

Aggregation Systems

Accumulo Summit

April 28th, 2015

Gadalia O’Bryan and Bill Slacum

[email protected]

[email protected]

Page 2: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Outline

• The Value of Aggregations

• Abstractions

• Systems

• Details

• Demo

• References/Additional Information

Page 3: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]
Page 4: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Aggregation provides a means of turning billions of pieces of raw data into condensed, human-consumable information.

Page 5: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

• Aggregation of Aggregations

• Time Series

• Set Size/Cardinality

• Top-K

• Quantiles

• Density/Heatmap

[Slide graphics: a “16.3k Unique Users” cardinality callout, a histogram (G1), and a heatmap (G2).]

Page 6: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Abstractions

Page 7: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

1 + 2 + 3 + 4 = 10

Concept from [P1]

Page 8: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

(1 + 2) = 3    (3 + 4) = 7    3 + 7 = 10

We can parallelize integer addition.

Page 9: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Associative + Commutative Operations

• Associative: 1 + (2 + 3) = (1 + 2) + 3

• Commutative: 1 + 2 = 2 + 1

• Allows us to parallelize our reduce (for instance, locally in combiners); see the sketch below

• Applies to many operations, not just integer addition.

• Spoiler: key to incremental aggregations
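As a minimal illustration (our own snippet, not from the deck), the same data reduces to the same value whether we fold it sequentially or in parallel, precisely because + is associative and commutative (.par is built into the Scala 2.11-era standard library this deck would have used):

  // Sequential and parallel reduction agree because integer addition is
  // associative and commutative: grouping and order don't matter.
  val data = Seq(1, 2, 3, 4)

  val sequential = data.reduce(_ + _)     // ((1 + 2) + 3) + 4 = 10
  val parallel   = data.par.reduce(_ + _) // e.g. (1 + 2) + (3 + 4) = 10

  assert(sequential == parallel)          // both are 10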

Page 10: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

({a, b} + {b, c}) = {a, b, c}    ({a} + {a, c}) = {a, c}    {a, b, c} + {a, c} = {a, b, c}

We can also parallelize the “addition” of other types, like Sets, as Set Union is associative.

Page 11: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Monoid Interface

• Abstract Algebra provides a formal foundation for what we can casually observe.

• Don’t be thrown off by the name; just think of it as another trait/interface.

• Monoids provide a critical abstraction to treat aggregations of different types in the same way (see the sketch below).
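A minimal Scala sketch of that trait and two instances, using our own illustrative names rather than any library's exact API:

  // A monoid: an identity element plus an associative combine operation.
  trait Monoid[T] {
    def zero: T
    def plus(a: T, b: T): T
  }

  // Integer addition and set union both satisfy the monoid laws.
  val intAddition: Monoid[Int] = new Monoid[Int] {
    def zero = 0
    def plus(a: Int, b: Int) = a + b
  }

  def setUnion[A]: Monoid[Set[A]] = new Monoid[Set[A]] {
    def zero = Set.empty[A]
    def plus(a: Set[A], b: Set[A]) = a union b
  }

  // One generic reduce works for every aggregation type.
  def reduceAll[T](xs: Iterable[T], m: Monoid[T]): T =
    xs.foldLeft(m.zero)(m.plus)

The payoff is the last definition: the reduce logic is written once and reused for counts, sets, sketches, and so on.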

Page 12: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Many Monoid Implementations Already Exist

• https://github.com/twitter/algebird/

• Long, String, Set, Seq, Map, etc.

• HyperLogLog – Cardinality Estimates

• QTree – Quantile Estimates

• SpaceSaver/HeavyHitters – Approx Top-K

• Also easy to add your own with libraries like stream-lib [C3]; see the example below
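For instance, Algebird ships implicit Monoid instances for common types, so combining values is a single call. A sketch based on Algebird's documented Monoid.plus usage; treat the exact imports and versions as an assumption:

  import com.twitter.algebird.Monoid

  // Numbers add
  val total = Monoid.plus(3, 7)                          // 10

  // Maps combine value-wise (handy for keyed counts)
  val counts = Monoid.plus(Map("BWI" -> 4L, "DIA" -> 1L),
                           Map("BWI" -> 2L))             // Map(BWI -> 6, DIA -> 1)

  // Sets combine by union
  val users = Monoid.plus(Set("a", "b"), Set("b", "c"))  // Set(a, b, c)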

Page 13: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Serialization

• One additional trait we need our “aggregatable” types to have is that we can serialize/deserialize them.

[Slide diagram: the 1 + 2 + 3 + 4 = 10 example again, with partial sums 3 and 7 and the call sequence 1) zero() 2) plus() 3) plus() 4) serialize() 5) zero() 6) deserialize() 7) plus() 8) deserialize() 9) plus().]
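To make that call sequence concrete, here is a toy serializable aggregator in Scala; the trait name and the byte format are ours and purely illustrative:

  // A monoid whose values can round-trip through bytes - this is what lets
  // partial aggregates live inside a key-value store between reduce steps.
  trait SerializableMonoid[T] {
    def zero: T
    def plus(a: T, b: T): T
    def serialize(t: T): Array[Byte]
    def deserialize(bytes: Array[Byte]): T
  }

  object LongSum extends SerializableMonoid[Long] {
    def zero = 0L
    def plus(a: Long, b: Long) = a + b
    def serialize(t: Long) = t.toString.getBytes("UTF-8")
    def deserialize(bytes: Array[Byte]) = new String(bytes, "UTF-8").toLong
  }

  // zero -> plus -> plus -> serialize ... deserialize -> plus, as in the diagram
  val partial = LongSum.plus(LongSum.plus(LongSum.zero, 1L), 2L) // 3
  val bytes   = LongSum.serialize(partial)
  val merged  = LongSum.plus(LongSum.deserialize(bytes), 7L)     // 10

A real implementation would use a more compact encoding than UTF-8 strings, but the zero/plus/serialize/deserialize shape is the same.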

Page 14: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

These abstractions enable a small library of reusable code to aggregate data in many parts of your system.

Page 15: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Systems

Page 16: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

SQL on Hadoop

• Impala, Hive, SparkSQL

[Slide diagram: each dimension drawn as a spectrum, with a marker showing where SQL-on-Hadoop systems fall.]

  Query Latency: milliseconds / seconds / minutes
  # of Users:    large / many / few
  Freshness:     seconds / minutes / hours
  Data Size:     billions / millions / thousands

Page 17: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Online Incremental Systems

• Twitter’s Summingbird [PA1, C4], Google’s Mesa [PA2], Koverse’s Aggregation Framework

[Slide diagram: the same spectra as the previous slide, with markers labelled “SM” and “K” showing where these systems fall.]

  Query Latency: milliseconds / seconds / minutes
  # of Users:    large / many / few
  Freshness:     seconds / minutes / hours
  Data Size:     billions / millions / thousands

Page 18: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Online Incremental Systems: Common Components

• Aggregations are computed/reduced incrementally via associative operations

• Results are mostly pre-computed, so queries are inexpensive

• Aggregations, keyed by dimensions, are stored in a low-latency, scalable key-value store (a toy sketch of this pattern follows)
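A toy sketch of that incremental pattern in plain Scala (our own code, not from the deck): partial aggregates are keyed by dimension and merged with an associative plus, so a new batch simply folds into what is already stored:

  // Partial aggregates keyed by dimension; merging two snapshots is an
  // element-wise, associative plus, so batches can arrive in any order.
  type Aggregates = Map[String, Long]

  def merge(a: Aggregates, b: Aggregates): Aggregates =
    (a.keySet ++ b.keySet).map { k =>
      k -> (a.getOrElse(k, 0L) + b.getOrElse(k, 0L))
    }.toMap

  val stored   = Map("client:iPhone" -> 6L, "client:Android" -> 1L)
  val newBatch = Map("client:Android" -> 5L, "client:Web" -> 3L)

  merge(stored, newBatch)
  // Map(client:iPhone -> 6, client:Android -> 6, client:Web -> 3)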

Page 19: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Summingbird

[Slide diagram: a Summingbird program feeds incoming data into both queues and HDFS; a Storm topology and a Hadoop job each perform the reduce, writing to an online KV store and a batch KV store respectively; a client library merges the two stores for the client.]

Page 20: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Mesa

[Slide diagram: batches of data land in Colossus as singleton deltas (e.g. versions 61, 62, ..., 91, 92); a compaction worker reduces them into cumulative deltas (61-70, 61-80, 61-90) on top of a base (0-60); a query server reduces the relevant deltas to answer client queries.]

Page 21: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Koverse

[Slide diagram: data is ingested into Apache Accumulo as Records; a Hadoop job reduces Records into Aggregates; the same reduce runs again in the minor/major compaction iterator and in the scan iterator; the Koverse Server serves client queries.]

Page 22: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Details

Page 23: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Ingest (1/2)

• We bulk import RFiles rather than writing via a BatchWriter (a sketch follows)

• The failure case is simpler: we can retry the whole batch if an aggregation job or a bulk import fails

• BatchWriters can be used, but code needs to be written to handle Mutations that are uncommitted, and there is no rollback for successful commits
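For reference, a bulk import through the Accumulo 1.x client API looks roughly like this; the paths and table name are placeholders, and the four-argument importDirectory form is the 1.x signature:

  import org.apache.accumulo.core.client.Connector

  // Bulk-import RFiles produced by the aggregation job instead of using a
  // BatchWriter; if the job or the import fails, the whole batch is retried.
  def bulkImport(conn: Connector, table: String): Unit =
    conn.tableOperations().importDirectory(
      table,
      "/tmp/agg-job-output/files",    // RFiles written by the MR job
      "/tmp/agg-job-output/failures", // must exist and be empty
      false)                          // setTime: keep the timestamps in the RFiles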

Page 24: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Ingest (2/2)

• As a consequence of importing (usually small) RFiles, we will be compacting more

• In testing (20 nodes, 200+ jobs/day), we have not had to tweak compaction thresholds or strategies

• This can possibly be attributed to the relatively small amounts of data being held at any given time due to reduction

Page 25: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Accumulo Iterator

• Combiner Iterator: a SortedKeyValueIterator that combines the Values for different versions (timestamps) of a Key within a row into a single Value. The Combiner will replace one or more versions of a Key and their Values with the most recent Key and a Value which is the result of the reduce method.

Page 26: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Our Combiner

• We can re-use Accumulo's Combiner type here:

  override def reduce(key: Key, values: java.util.Iterator[Value]): Value = {
    // values is a java.util.Iterator; asScala comes from scala.collection.JavaConverters._
    val sum = agg.reduceAll(values.asScala.map(v => agg.deserialize(v)))
    agg.serialize(sum) // re-serialize the combined aggregate into a single Value
  }

• Our function has to be commutative because major compactions will often pick smaller files to combine, which means we only see discrete subsets of data in an iterator invocation.
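Wiring a combiner onto the table is done with an IteratorSetting. A hedged sketch, using the stock SummingCombiner as a stand-in for the custom monoid combiner above so the snippet stays self-contained:

  import org.apache.accumulo.core.client.{Connector, IteratorSetting}
  import org.apache.accumulo.core.iterators.{Combiner, LongCombiner}
  import org.apache.accumulo.core.iterators.user.SummingCombiner

  // Attach the combiner so values are reduced at scan, minor-compaction and
  // major-compaction time (attachIterator defaults to all three scopes).
  def attachCombiner(conn: Connector, table: String): Unit = {
    val setting = new IteratorSetting(10, "aggCombiner", classOf[SummingCombiner])
    Combiner.setCombineAllColumns(setting, true)
    LongCombiner.setEncodingType(setting, LongCombiner.Type.STRING)
    conn.tableOperations().attachIterator(table, setting)
  }

In the real system the class passed in would be the custom combiner with the reduce method shown above.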

Page 27: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Accumulo Table Structure

row:         field1Name\x1Ffield1Value\x1Ffield2Name\x1Ffield2Value...
colf:        Aggregation Type
colq:        relation
visibility:  visibility
timestamp:   timestamp
value:       Serialized aggregation results

Examples (row, colf, colq, visibility, value):
  origin\x1FBWI                       count:                 [U]  6074
  origin\x1FBWI                       topk:   destination    [U]  {“DIA”: 1}
  origin\x1FBWI\x1Fdate\x1F20150427   count:                 [U]  104
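To make that key layout concrete, a small sketch that builds the row and columns for the examples above; the rowFor helper is ours, purely illustrative:

  import org.apache.accumulo.core.data.Key
  import org.apache.hadoop.io.Text

  val SEP = "\u001F" // the \x1F separator between field names and values

  // row = alternating field names and values,
  // e.g. origin\x1FBWI\x1Fdate\x1F20150427
  def rowFor(dimensions: Seq[(String, String)]): String =
    dimensions.flatMap { case (n, v) => Seq(n, v) }.mkString(SEP)

  // colf = aggregation type, colq = relation (empty for a plain count),
  // visibility as in the [U] examples above
  val countKey = new Key(
    new Text(rowFor(Seq("origin" -> "BWI", "date" -> "20150427"))),
    new Text("count:"),
    new Text(""),
    new Text("U"))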

Page 28: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Visibilities (1/2)

• Easy to store, a bit tougher to query

• Data can be stored at separate visibilities

• Combiner logic has no concept of visibility; it only loops over a given PartialKey.ROW_COLFAM_COLQUAL

• We know how to combine values (Longs, CountMinSketches), but how do we combine visibilities?

Page 29: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Visibilities (2/2)

• Say we have some data on Facebook photo albums:
  – facebook\x1falbum_size count: [public] 800
  – facebook\x1falbum_size count: [private] 100

• The combined value would be 900

• But what should we return for the visibility of public + private? We need more context to properly interpret this value.

• Alternatively, we can just drop it

Page 30: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Queries

• This schema is geared towards point queries (a sketch follows).

• Order of fields matters.

• GOOD: “What are the top-k destinations from BWI?”

• NOT GOOD: “What are all the dimensions and aggregations I have for BWI?”
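A point query like the GOOD example is a single exact-range scan; a hedged sketch, with the table name and authorizations as placeholders:

  import org.apache.accumulo.core.client.Connector
  import org.apache.accumulo.core.data.Range
  import org.apache.accumulo.core.security.Authorizations
  import scala.collection.JavaConverters._

  // "What are the top-k destinations from BWI?" is one exact-key lookup:
  // row = origin\x1FBWI, colf = topk:, colq = destination
  def topKDestinations(conn: Connector, table: String): Option[Array[Byte]] = {
    val scanner = conn.createScanner(table, new Authorizations("U"))
    scanner.setRange(Range.exact("origin\u001FBWI", "topk:", "destination"))
    scanner.asScala.headOption.map(_.getValue.get()) // still-serialized aggregate
  }

The returned bytes are the serialized aggregation value and would be deserialized with the same monoid code used everywhere else.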

Page 31: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Demo

Page 32: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

References

Presentations

P1. Algebra for Analytics - https://speakerdeck.com/johnynek/algebra-for-analytics

Code

C1. Algebird - https://github.com/twitter/algebird

C2. Simmer - https://github.com/avibryant/simmer

C3. stream-lib - https://github.com/addthis/stream-lib

C4. Summingbird - https://github.com/twitter/summingbird

Papers

PA1. Summingbird: A Framework for Integrating Batch and Online MapReduce Computations - http://www.vldb.org/pvldb/vol7/p1441-boykin.pdf

PA2. Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing - http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42851.pdf

PA3. Monoidify! Monoids as a Design Principle for Efficient MapReduce Algorithms - http://arxiv.org/abs/1304.7544

Video

V1. Intro To Summingbird - https://engineering.twitter.com/university/videos/introduction-to-summingbird

Graphics

G1. Histogram Graphic - http://www.statmethods.net/graphs/density.html

G2. Heatmap Graphic - https://www.mapbox.com/blog/twitter-map-every-tweet/

G3. The Matrix Background - http://wall.alphacoders.com/by_sub_category.php?id=198802

Page 33: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Backup Slides

Page 34: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Monoid Examples

Page 35: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Monoid Examples

Page 36: Accumulo Summit 2015: Building Aggregation Systems on Accumulo [Leveraging Accumulo]

Aggregation Flow

[Slide diagram: the end-to-end aggregation flow from the kv_records table, through a periodic MapReduce job and bulk-imported RFiles, through compaction and scan iterators, to user queries against the kv_aggregates table.]

New Records from Import Jobs (kv_records):
  client: iPhone   timestamp: 1408935773 ...
  client: Android  timestamp: 1408935871 ...
  client: Web      timestamp: 1408935792 ...

Example Aggregate KVs (kv_aggregates):
  RowId: hour:2014_08_24_09|client:Web   CF: Count  CQ:  Value: 3
  RowId: client:Android                  CF: Count  CQ:  Value: 1
  RowId: client:Android                  CF: Count  CQ:  Value: 5
  RowId: client:iPhone                   CF: Count  CQ:  Value: 6

Periodic, incremental MapReduce jobs (like the current Stats Job) read Records and emit Aggregate KVs based on the Aggregate configuration for the Collection:

  Aggregate( onKey("client", "hour", "client") produce(Count) prepare(("timestamp", "hour", BinByHour())) )

The Aggregate Configuration is a type-safe Scala object. Code is sent to the server as a String, where it is compiled (not executed). The serialized object is passed to the MR job to generate KVs from Records. It contains the dimensions (onKeys), the aggregation operation (produce), and optional projections (prepare), which can be built-in functions or custom Scala closures. We envision a UI building these objects in the future.

Map / Combine: emit KVs where Key = dimension + operation and Value = serialized Monoid Aggregator; Aggregation Reduction runs in the combiner.

Reduce: Aggregation Reduction; the output is written as RFiles, e.g.
  RowId: client:iPhone                      CF: Count  CQ:  Value: 3
  RowId: client:Android                     CF: Count  CQ:  Value: 5
  RowId: hour:2014_08_24_09|client:Android  CF: Count  CQ:  Value: 2

MinC / MajC: Aggregation Reduction in the minor- and major-compaction iterators.

User Query / Scan Iterator: Aggregation Reduction at scan time, e.g.
  query:  { key: “client:iPhone”, produce: Count }
  result: { key: “client:iPhone”, produce: Count, value: 9 }

Aggregation Reduction is the same common code in all 5 places. For Aggregates with the same Key, the Values are reduced based on the operation (Sum, Set, Cardinality Est., etc.). The Values are always serialized objects that implement the MonoidAggregator interface. Adding a new aggregation operation will impact a single class only - no new Iterators or MR code.

  RowId: hour:2014_08_24_09|client:Web   CF: Count  CQ:  Value: 8

Aggregate Queries are simple point queries for a single KV. If the user wants something like an enumeration of “client” values, they will use a Set or Top-K operation, and the single value will contain the answer with no range scans required. The API may support batching multiple keys per request to efficiently support queries that build timeseries (e.g., counts for each hour in the day); a sketch of such a batched lookup follows.
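That batched timeseries lookup maps naturally onto a BatchScanner. A hedged sketch in which the row format follows the examples above and everything else (table name, auths, thread count) is a placeholder:

  import org.apache.accumulo.core.client.Connector
  import org.apache.accumulo.core.data.Range
  import org.apache.accumulo.core.security.Authorizations
  import scala.collection.JavaConverters._

  // Fetch the per-hour Count aggregates for one client in a single request.
  def hourlyCounts(conn: Connector, table: String, day: String): Map[String, Array[Byte]] = {
    val scanner = conn.createBatchScanner(table, new Authorizations(), 4)
    val ranges = (0 until 24).map { h =>
      Range.exact(f"hour:${day}_$h%02d|client:Web", "Count")
    }
    scanner.setRanges(ranges.asJava)
    val results = scanner.asScala
      .map(e => e.getKey.getRow.toString -> e.getValue.get())
      .toMap
    scanner.close()
    results
  }

The last argument to createBatchScanner is the number of query threads; the values returned are still serialized aggregates, reduced one final time by the scan iterator before they reach the client.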