efficient columnar storage with apache parquet · efficient columnar storage with apache parquet...

34
EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET Apache: Big Data North America 2017 Ranganathan Balashanmugam, Aconex

Upload: others

Post on 20-May-2020

24 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

EFF I C I EN T C OL U M N AR STORAG E W ITH

APACH E PARQU ET

Apache: Big Data North America 2017

Ranganathan Balashanmugam, Aconex

Page 2: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

“The Tables Have Turned.”

Apache: Big Data North America 2017

Page 3: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

“The Tables Have Turned.” - 90°

Apache: Big Data North America 2017

Page 4: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

ANALY T I CAL OV ER TRAN SAC T I ON AL

Context

Page 5: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

S IM PLE STRUCTURE

A B C

a1 b1 c1

a2 b2 c2

a3 b3 c3

a1 b1 c1 a2 b2 c2 a3 b3 c3

row

a1 a2 a3 b1 b2 b3 c1 c2 c3

columnar

Page 6: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

“Optimizing the disk seeks.”

Page 7: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

“The Tables Have Turned.” - 90°

Apache: Big Data North America 2017

Efficient writesEfficient reads

Page 8: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

S IM PLE STRUCTURE

A B C

a1 b1 c1

a2 b2 c2

a3 b3 c3

a1 b1 c1 a2 b2 c2 a3 b3 c3

row

a1 a2 a3 b1 b2 b3 c1 c2 c3

columnar

Page 9: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

message Document {

required int64 DocId;

optional group Links {

repeated int64 Backward;

repeated int64 Forward;

}

repeated group Name {

repeated group Language {

required string Code;

optional string Country;

}

optional string Url;

}

}

Apache: Big Data North America 2017

NESTED AND REPEATED STRUCTURESHow to preserve in column store?

DocId: 10

Links:

Forward: 20

Forward: 40

Forward: 60

Name:

Language:

Code: ‘en-us’

Country: ‘us’

Language:

Code: ‘en’

Url: ‘http://a'

Name:

Url: ‘http://b'

Name:

Language:

Code: ‘en-gb’

Country: ‘gb’

Page 10: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

Colum

nar

Page 11: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

“Get these performance benefits for nested structures into Hadoop ecosystem.”

prob

lem

stat

emen

t

Page 12: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

MOT IVAT I ON

* Allow complex nested data structures

* Very efficient compression and encoding schemes

* Support many frameworks

Page 13: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

“Columnar storage format available to any project in the Hadoop ecosystem, regardless of the choice of data processing framework, data model or

programming language.”

Page 14: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

DES IGN GOALS

* Interoperability

* Space efficiency

* Query efficiency

Page 15: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

parquet

avro thrift protobuf pig hive ….

avro thrift protobuf pig hive ….

object model

converters

storage format

object models

parquet binary file

src: https://techmagie.wordpress.com/2016/07/15/data-storage-and-modelling-in-hadoop/

column readers

scalding

Page 16: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

Page 1

Page 2

… Page n

parquet file

header - Magic number (4 bytes) : “PAR1”

row group 0

column a column b column c

row group 1

…row group n

footer

Page 1

Page 2

… Page n

Page 1

Page 2

… Page n

src: https://github.com/Parquet/parquet-format

* good enough for compression efficiency * 8KB - 1 MB * good enough to read

* group of rows * max size buffered while writing * 50 MB - 1GB

* data of one column in row group * can be read independently

Page 17: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

footer

src: https://github.com/Parquet/parquet-format

file metadata (ThriftCompactProtocol)- version - schema

row group 0 metadata

footer length (4 bytes)

column 0

- total byte size - total rows

* type/path/encodings/codec * num of values * compressed/uncompressed size * data page offset * index page offset

column 1

* …

Magic number (4 bytes): “PAR1”

row group …

column 0

column 1

* …

* …

Page 18: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

page

src: https://techmagie.wordpress.com/2016/07/15/data-storage-and-modelling-in-hadoop/

repetition levels

definition levels

values

page header (ThriftCompactProtocol)

enco

ded

and co

mpr

esse

d

Page 19: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

message Document {

required int64 DocId;

optional group Links {

repeated int64 Backward;

repeated int64 Forward;

}

repeated group Name {

repeated group Language {

required string Code;

optional string Country;

}

optional string Url;

}

}

Schema

Page 20: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

Document 1

DocId: 10

Links:

Forward: 20

Forward: 40

Forward: 60

Name:

Language:

Code: ‘en-us’

Country: ‘us’

Language:

Code: ‘en’

Url: ‘http://a'

Name:

Url: ‘http://b'

Name:

Language:

Code: ‘en-gb’

Country: ‘gb’

Document 2

DocId: 20

Links:

Backward: 10

Backward: 30

Forward: 80

Name:

Url: ‘http://c'

message Document {

required int64 DocId;

optional group Links {

repeated int64 Backward;

repeated int64 Forward;

}

repeated group Name {

repeated group Language {

required string Code;

optional string Country;

}

optional string Url;

}

}

Documents

Page 21: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

“Can we represent it in columnar former efficiently and read them back to their original nested data structure?”

prob

lem

stat

emen

t

Page 22: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

“Can we represent it in columnar former efficiently and read them back to their original nested data structure?”

Dremel encoding

Page 23: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

Document 1

DocId: 10

Links:

Forward: 20

Forward: 40

Forward: 60

<Backward>: NULL

Name:

Language:

Code: ‘en-us’

Country: ‘us’

Language:

Code: ‘en’

<Country>: NULL

Url: ‘http://a'

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://b'

Name:

Language:

Code: ‘en-gb’

Country: ‘gb’

<Url>: NULL

Document 2

DocId: 20

Links:

Backward: 10

Backward: 30

Forward: 80

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://c'

message Document {

required int64 DocId;

optional group Links {

repeated int64 Backward;

repeated int64 Forward;

}

repeated group Name {

repeated group Language {

required string Code;

optional string Country;

}

optional string Url;

}

}

fill all the nulls

Page 24: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

Document 1

DocId: 10

Links:

Forward: 20

Forward: 40

Forward: 60

<Backward>: NULL

Name:

Language:

Code: ‘en-us’

Country: ‘us’

Language:

Code: ‘en’

<Country>: NULL

Url: ‘http://a'

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://b'

Name:

Language:

Code: ‘en-gb’

Country: ‘gb’

<Url>: NULL

Document 2

DocId: 20

Links:

Backward: 10

Backward: 30

Forward: 80

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://c'

message Document {

required int64 DocId;

optional group Links {

repeated int64 Backward;

repeated int64 Forward;

}

repeated group Name {

repeated group Language {

required string Code;

optional string Country;

}

optional string Url;

}

}

COLUMNS R D VALUE

Links.Forward 0 2 20

Links.Forward[1] 1 2 40

Links.Forward[2] 1 2 60

Links.Forward 0 2 80

In the path to the field, what is the last repeated field ?

In the path to the field, how

many defined fields ?

Page 25: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

Document 1

DocId: 10

Links:

Forward: 20

Forward: 40

Forward: 60

<Backward>: NULL

Name:

Language:

Code: ‘en-us’

Country: ‘us’

Language:

Code: ‘en’

<Country>: NULL

Url: ‘http://a'

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://b'

Name:

Language:

Code: ‘en-gb’

Country: ‘gb’

<Url>: NULL

Document 2

DocId: 20

Links:

Backward: 10

Backward: 30

Forward: 80

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://c'

message Document {

required int64 DocId;

optional group Links {

repeated int64 Backward;

repeated int64 Forward;

}

repeated group Name {

repeated group Language {

required string Code;

optional string Country;

}

optional string Url;

}

}

In the path to the field, what is the last repeated field ?

In the path to the field, how

many defined fields ?

COLUMNS R D VALUE

Name.Language.Code en-us

Name.Language[1].Code en

Name[1].[Language].[Code] null

Name[2].Language.Code en-gb

Name.Language.Code null

Page 26: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

Document 1

DocId: 10

Links:

Forward: 20

Forward: 40

Forward: 60

<Backward>: NULL

Name:

Language:

Code: ‘en-us’

Country: ‘us’

Language:

Code: ‘en’

<Country>: NULL

Url: ‘http://a'

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://b'

Name:

Language:

Code: ‘en-gb’

Country: ‘gb’

<Url>: NULL

Document 2

DocId: 20

Links:

Backward: 10

Backward: 30

Forward: 80

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://c'

message Document {

required int64 DocId;

optional group Links {

repeated int64 Backward;

repeated int64 Forward;

}

repeated group Name {

repeated group Language {

required string Code;

optional string Country;

}

optional string Url;

}

}

In the path to the field, what is the last repeated field ?

In the path to the field, how

many defined fields ?

COLUMNS R D VALUE

Name.Language.Code 0 2 en-us

Name.Language[1].Code en

Name[1].[Language].[Code] null

Name[2].Language.Code en-gb

Name.Language.Code null

Page 27: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

Document 1

DocId: 10

Links:

Forward: 20

Forward: 40

Forward: 60

<Backward>: NULL

Name:

Language:

Code: ‘en-us’

Country: ‘us’

Language:

Code: ‘en’

<Country>: NULL

Url: ‘http://a'

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://b'

Name:

Language:

Code: ‘en-gb’

Country: ‘gb’

<Url>: NULL

Document 2

DocId: 20

Links:

Backward: 10

Backward: 30

Forward: 80

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://c'

message Document {

required int64 DocId;

optional group Links {

repeated int64 Backward;

repeated int64 Forward;

}

repeated group Name {

repeated group Language {

required string Code;

optional string Country;

}

optional string Url;

}

}

In the path to the field, what is the last repeated field ?

In the path to the field, how

many defined fields ?

COLUMNS R D VALUE

Name.Language.Code 0 2 en-us

Name.Language[1].Code 2 2 en

Name[1].[Language].[Code] null

Name[2].Language.Code en-gb

Name.Language.Code null

Page 28: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

Document 1

DocId: 10

Links:

Forward: 20

Forward: 40

Forward: 60

<Backward>: NULL

Name:

Language:

Code: ‘en-us’

Country: ‘us’

Language:

Code: ‘en’

<Country>: NULL

Url: ‘http://a'

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://b'

Name:

Language:

Code: ‘en-gb’

Country: ‘gb’

<Url>: NULL

Document 2

DocId: 20

Links:

Backward: 10

Backward: 30

Forward: 80

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://c'

message Document {

required int64 DocId;

optional group Links {

repeated int64 Backward;

repeated int64 Forward;

}

repeated group Name {

repeated group Language {

required string Code;

optional string Country;

}

optional string Url;

}

}

In the path to the field, what is the last repeated field ?

In the path to the field, how

many defined fields ?

COLUMNS R D VALUE

Name.Language.Code 0 2 en-us

Name.Language[1].Code 2 2 en

Name[1].[Language].[Code] 1 1 null

Name[2].Language.Code en-gb

Name.Language.Code null

Page 29: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

Document 1

DocId: 10

Links:

Forward: 20

Forward: 40

Forward: 60

<Backward>: NULL

Name:

Language:

Code: ‘en-us’

Country: ‘us’

Language:

Code: ‘en’

<Country>: NULL

Url: ‘http://a'

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://b'

Name:

Language:

Code: ‘en-gb’

Country: ‘gb’

<Url>: NULL

Document 2

DocId: 20

Links:

Backward: 10

Backward: 30

Forward: 80

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://c'

message Document {

required int64 DocId;

optional group Links {

repeated int64 Backward;

repeated int64 Forward;

}

repeated group Name {

repeated group Language {

required string Code;

optional string Country;

}

optional string Url;

}

}

In the path to the field, what is the last repeated field ?

In the path to the field, how

many defined fields ?

COLUMNS R D VALUE

Name.Language.Code 0 2 en-us

Name.Language[1].Code 2 2 en

Name[1].[Language].[Code] 1 1 null

Name[2].Language.Code 1 2 en-gb

Name.Language.Code 0 1 null

Page 30: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

Document 1

DocId: 10

Links:

Forward: 20

Forward: 40

Forward: 60

<Backward>: NULL

Name:

Language:

Code: ‘en-us’

Country: ‘us’

Language:

Code: ‘en’

<Country>: NULL

Url: ‘http://a'

Name:

<Language>:

<Code>: NULL

<Country>: NULL

Url: ‘http://b'

Name:

Language:

Code: ‘en-gb’

Country: ‘gb’

<Url>: NULL

message Document {

required int64 DocId;

optional group Links {

repeated int64 Backward;

repeated int64 Forward;

}

repeated group Name {

repeated group Language {

required string Code;

optional string Country;

}

optional string Url;

}

}

COLUMNS R D VALUE

DocId 0 0 10

Links.Forward 0 2 20

Links.Forward[1] 1 2 40

Links.Forward[2] 1 2 60

Links.[Backward] 0 1 null

Name.Language.Code 0 2 en-us

Name.Language.Country 0 3 us

Name.Language[1].Code 2 2 en

Name.Language[1].[Country] 2 2 null

Name.Url 0 2 http://a

Name[1].[Language].[Code] 1 1 null

Name[1].[Language].[Country] 1 1 null

Name[1].Url 1 2 http://b

Name[2].Language.Code 1 2 en-gb

Name[2].Language.Country 1 3 gb

Name[2].[Url] 1 1 null

Page 31: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

A B C

a1 b1 c1

a2 b2 c2

a3 b3 c3

a4 b4 c4

a5 b5 c5

A B C

a1 b1 c1

a2 b2 c2

a3 b3 c3

a4 b4 c4

a5 b5 c5

A B C

a1 b1 c1

a2 b2 c2

a3 b3 c3

a4 b4 c4

a5 b5 c5

OPT I M I Z AT I ONS

projection push down predicate push down read required data

fetch required columns filter records while reading

Page 32: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

EN C OD I N G* Plain

* Dictionary Encoding

* Run Length Encoding/ Bit-Packing Hybrid

* Delta Encoding

* Delta-length byte array

* Delta Strings

Page 33: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Apache: Big Data North America 2017

COMPRESS ION TRADE OFF

* Storage

* IO

* Bandwidth

* CPU time

Page 34: EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET · EFFICIENT COLUMNAR STORAGE WITH APACHE PARQUET ... parquet avro thrift protobuf pig hive …. avro thrift protobuf pig hive …

Thank you

Apache: Big Data North America 2017

Ran.ga.na.than B@ran_than