deep dive into the native multi model database arangodb

80
Deep dive into the native multi-model database ArangoDB Frank Celler Percona Live 2016, Santa Clara, 20 April 2016 www.arangodb.com

Upload: arangodb

Post on 09-Jan-2017

1.407 views

Category:

Technology


1 download

TRANSCRIPT

Page 1: Deep dive into the native multi model database ArangoDB

Deep dive into the nativemulti-model database ArangoDB

Frank Celler

Percona Live 2016, Santa Clara, 20 April 2016

www.arangodb.com

Page 2: Deep dive into the native multi model database ArangoDB

Overview

Page 3: Deep dive into the native multi model database ArangoDB

is amulti-model DatabaseFeatures

is a document store, a key/value store and a graph database,offers convenient queries (via HTTP/REST and AQL),including joins between different collections,and graph queries,with configurable consistency guarantees using transactions.

=⇒ Allows polyglot persistence with multiple instances of a single technology.

Page 4: Deep dive into the native multi model database ArangoDB

is amulti-model DatabaseFeatures

is a document store, a key/value store and a graph database,offers convenient queries (via HTTP/REST and AQL),including joins between different collections,and graph queries,with configurable consistency guarantees using transactions.

=⇒ Allows polyglot persistence with multiple instances of a single technology.

Page 5: Deep dive into the native multi model database ArangoDB

is extensible by JavaScript Code

The Foxx Microservice FrameworkAllows you to extend the HTTP/REST API by your own routes, which youimplement in JavaScript running on the database server, with direct accessto the C++ DB engine.

Unprecedented possibilities for data centric services:custom-made complex queries or authorizationsschema-validationpush feeds, etc.

Page 6: Deep dive into the native multi model database ArangoDB

is extensible by JavaScript Code

The Foxx Microservice FrameworkAllows you to extend the HTTP/REST API by your own routes, which youimplement in JavaScript running on the database server, with direct accessto the C++ DB engine.

Unprecedented possibilities for data centric services:custom-made complex queries or authorizationsschema-validationpush feeds, etc.

Page 7: Deep dive into the native multi model database ArangoDB

is a Data Center Operating System App

These days, computing clusters run Data Center Operating Systems.

IdeaDistributed applications can be deployed as easily as one installs a mobileapp on a phone.Cluster resource management is automatic.This leads to significantly better resource utilization.Fault tolerance, self-healing and automatic failover is guaranteed.

Page 8: Deep dive into the native multi model database ArangoDB

is a Data Center Operating System App

These days, computing clusters run Data Center Operating Systems.IdeaDistributed applications can be deployed as easily as one installs a mobileapp on a phone.

Cluster resource management is automatic.This leads to significantly better resource utilization.Fault tolerance, self-healing and automatic failover is guaranteed.

Page 9: Deep dive into the native multi model database ArangoDB

is a Data Center Operating System App

These days, computing clusters run Data Center Operating Systems.IdeaDistributed applications can be deployed as easily as one installs a mobileapp on a phone.Cluster resource management is automatic.This leads to significantly better resource utilization.Fault tolerance, self-healing and automatic failover is guaranteed.

Page 10: Deep dive into the native multi model database ArangoDB

Details

Page 11: Deep dive into the native multi model database ArangoDB

The Multi-Model Approach

Multi-model databaseA multi-model database combines a document store with a graphdatabase and is at the same time a key/value store,with a common query language for all three data models.

Important:is able to compete with specialised products on their turfallows for polyglot persistence using a single database technologyIn a microservice architecture, there will be several different deployments.

Page 12: Deep dive into the native multi model database ArangoDB

The Multi-Model Approach

Multi-model databaseA multi-model database combines a document store with a graphdatabase and is at the same time a key/value store,with a common query language for all three data models.

Important:is able to compete with specialised products on their turfallows for polyglot persistence using a single database technologyIn a microservice architecture, there will be several different deployments.

Page 13: Deep dive into the native multi model database ArangoDB

performance

https://www.arangodb.com/2015/10/benchmark-postgresql-mongodb-arangodb/

Page 14: Deep dive into the native multi model database ArangoDB

Why is multi-model possible at all?

Document stores and key/value storesDocument stores: have primary key, are key/value stores.Without using secondary indexes, performance is nearly as good as withopaque data instead of JSON.Good horizontal scalability can be achieved for key lookups.

Page 15: Deep dive into the native multi model database ArangoDB

horizontal scalabilityExperiment: Single document writes (1kB / doc) on cluster of sizes 8 to 80 machi-nes (64 to 640 vCPUs), another 4 to 40 load servers, running on AWS.

https://mesosphere.com/blog/2015/11/30/arangodb-benchmark-dcos/

Page 16: Deep dive into the native multi model database ArangoDB

Why is multi-model possible at all?Document stores and graph databasesGraph database: would like to associate arbitrary data with vertices andedges, so JSON documents are a good choice.

A good edge index, giving fast access to neighbours.This can be a secondary index.Graph support in the query language.Implementations of graph algorithms in the DB engine.

https://www.arangodb.com/2016/04/index-free-adjacency-hybrid-indexes-graph-databases/

Page 17: Deep dive into the native multi model database ArangoDB

Replication and Sharding

ArangoDB provides (Version 2.8, January 2016)Sharding with automatic data distribution,easy setup of (asynchronous) replication (cluster and single),fault tolerance by automatic failover,full integration with Apache Mesos and Mesosphere DC/OS.

Work in progress (Version 3.0, RC in April 2016):synchronous replication in cluster mode,zero administration by a self-repairing and self-balancing cluster architecture.

Page 18: Deep dive into the native multi model database ArangoDB

Replication and Sharding

ArangoDB provides (Version 2.8, January 2016)Sharding with automatic data distribution,easy setup of (asynchronous) replication (cluster and single),fault tolerance by automatic failover,full integration with Apache Mesos and Mesosphere DC/OS.

Work in progress (Version 3.0, RC in April 2016):synchronous replication in cluster mode,zero administration by a self-repairing and self-balancing cluster architecture.

Page 19: Deep dive into the native multi model database ArangoDB

Data-Center Operating Systems

Resource ManagementInstallation should be as easy as possibleintegration into the resource management of data-centergives better resource utilisation,full integration with Apache Mesos and Mesosphere DC/OS

Work in progressMesosphere DC/OS a very mature, Open-Source solutionlater this year integration also for Kubernetes, Docker-Swarm

Page 20: Deep dive into the native multi model database ArangoDB

Data-Center Operating Systems

Resource ManagementInstallation should be as easy as possibleintegration into the resource management of data-centergives better resource utilisation,full integration with Apache Mesos and Mesosphere DC/OS

Work in progressMesosphere DC/OS a very mature, Open-Source solutionlater this year integration also for Kubernetes, Docker-Swarm

Page 21: Deep dive into the native multi model database ArangoDB

About Mesosphere’s DC/OS

https://dcos.io

Page 22: Deep dive into the native multi model database ArangoDB

Installing Mesosphere’s DC/OS

https://dcos.io

Page 23: Deep dive into the native multi model database ArangoDB

Installing Mesosphere’s DC/OS

https://dcos.io

Page 24: Deep dive into the native multi model database ArangoDB

Powerful query language

AQLThe built in Arango Query Language allows

complex, powerful and convenient queries,with transaction semantics,allowing to do joins,AQL is independent of the driver used andoffers protection against injections by design.

Page 25: Deep dive into the native multi model database ArangoDB

Extensible through JavaScript

The Foxx Microservice FrameworkAllows you to extend the HTTP/REST API by your own routes, which youimplement in JavaScript running on the database server, with direct accessto the C++ DB engine.

Unprecedented possibilities for data centric services:complex queries or authorizations, schema-validation, push feeds, etc.easy deployment via web interface or REST API,automatic API description through Swagger =⇒ discoverability of services.

Page 26: Deep dive into the native multi model database ArangoDB

Extensible through JavaScript

The Foxx Microservice FrameworkAllows you to extend the HTTP/REST API by your own routes, which youimplement in JavaScript running on the database server, with direct accessto the C++ DB engine.

Unprecedented possibilities for data centric services:complex queries or authorizations, schema-validation, push feeds, etc.easy deployment via web interface or REST API,automatic API description through Swagger =⇒ discoverability of services.

Page 27: Deep dive into the native multi model database ArangoDB

Use Cases

Page 28: Deep dive into the native multi model database ArangoDB

Use case: Aircraft fleet management

Page 29: Deep dive into the native multi model database ArangoDB

Use case: Aircraft fleet managementOne of our customers uses ArangoDB to

store each part, component, unit or aircraft as a documentmodel containment as a graphthus can easily find all parts of some componentkeep track of maintenance intervalsperform queries orthogonal to the graph structurethereby getting good efficiency for all needed queries

http://radar.oreilly.com/2015/07/data-modeling-with-multi-model-databases.html

Page 30: Deep dive into the native multi model database ArangoDB

Use case: rights management

Page 31: Deep dive into the native multi model database ArangoDB

Use case: rights managementRight managements in relational model is hard:

looks like a forest at firstthen exceptions pop-upone company sub-contracts another for a special stationan engineer works for two companiessome-one needs special permissions when being a proxymuch easier expressed as graph structure

Page 32: Deep dive into the native multi model database ArangoDB

Use case: e-commerce

Page 33: Deep dive into the native multi model database ArangoDB

Use case: e-commerceAboutYou uses ArangoDB to

create channels showing new productsallow recommendation to friendscelebrities presenting new fashionblog about fashion productsnightly business analysisnews stream

https://www.arangodb.com/case-studies/aboutyou-data-driven-personalization-with-arangodb/

Page 34: Deep dive into the native multi model database ArangoDB

Action

Page 35: Deep dive into the native multi model database ArangoDB

First deployment: a simple key/value store

A key/value storeOne collection “data”, indexes on “value” (sorted) and “name” (hash).

Single document requestsIndexes possibleRange queries possible

Page 36: Deep dive into the native multi model database ArangoDB

Second deployment: a Microservice as a Foxx app

A Foxx MicroserviceSimple TODO app, deployed from app store with web UI.

REST/JSON API availableSwagger generates API description automatically

Page 37: Deep dive into the native multi model database ArangoDB

Third deployment: a single server graph database

A Graph DatabaseGraph “worldCountry” with vertex collection “worldVertex” and edgecollection “worldEdges”, links from cities to countries to continents toworld.

Show some graph traversals.Show graph viewer.

Page 38: Deep dive into the native multi model database ArangoDB

Fourth deployment: a multi-model application

A multi-model databaseSome data from a web shop.

Show some queries.

Page 39: Deep dive into the native multi model database ArangoDB

AQLInternals

Page 40: Deep dive into the native multi model database ArangoDB

Life of a queryText and query parameters come from user

Parse text, produce abstract syntax tree (AST)Substitute query parametersFirst optimisation: constant expressions, etc.Translate AST into an execution plan (EXP)Optimise one EXP, producemany, potentially better EXPsReason about distribution in clusterOptimise distributed EXPsEstimate costs for all EXPs, and sort by ascending costInstanciate “cheapest” plan, i.e. set up execution engineDistribute and link up engines on different serversExecute plan, provide cursor API

Page 41: Deep dive into the native multi model database ArangoDB

Life of a queryText and query parameters come from userParse text, produce abstract syntax tree (AST)

Substitute query parametersFirst optimisation: constant expressions, etc.Translate AST into an execution plan (EXP)Optimise one EXP, producemany, potentially better EXPsReason about distribution in clusterOptimise distributed EXPsEstimate costs for all EXPs, and sort by ascending costInstanciate “cheapest” plan, i.e. set up execution engineDistribute and link up engines on different serversExecute plan, provide cursor API

Page 42: Deep dive into the native multi model database ArangoDB

Life of a queryText and query parameters come from userParse text, produce abstract syntax tree (AST)Substitute query parameters

First optimisation: constant expressions, etc.Translate AST into an execution plan (EXP)Optimise one EXP, producemany, potentially better EXPsReason about distribution in clusterOptimise distributed EXPsEstimate costs for all EXPs, and sort by ascending costInstanciate “cheapest” plan, i.e. set up execution engineDistribute and link up engines on different serversExecute plan, provide cursor API

Page 43: Deep dive into the native multi model database ArangoDB

Life of a queryText and query parameters come from userParse text, produce abstract syntax tree (AST)Substitute query parametersFirst optimisation: constant expressions, etc.

Translate AST into an execution plan (EXP)Optimise one EXP, producemany, potentially better EXPsReason about distribution in clusterOptimise distributed EXPsEstimate costs for all EXPs, and sort by ascending costInstanciate “cheapest” plan, i.e. set up execution engineDistribute and link up engines on different serversExecute plan, provide cursor API

Page 44: Deep dive into the native multi model database ArangoDB

Life of a queryText and query parameters come from userParse text, produce abstract syntax tree (AST)Substitute query parametersFirst optimisation: constant expressions, etc.Translate AST into an execution plan (EXP)

Optimise one EXP, producemany, potentially better EXPsReason about distribution in clusterOptimise distributed EXPsEstimate costs for all EXPs, and sort by ascending costInstanciate “cheapest” plan, i.e. set up execution engineDistribute and link up engines on different serversExecute plan, provide cursor API

Page 45: Deep dive into the native multi model database ArangoDB

Life of a queryText and query parameters come from userParse text, produce abstract syntax tree (AST)Substitute query parametersFirst optimisation: constant expressions, etc.Translate AST into an execution plan (EXP)Optimise one EXP, producemany, potentially better EXPs

Reason about distribution in clusterOptimise distributed EXPsEstimate costs for all EXPs, and sort by ascending costInstanciate “cheapest” plan, i.e. set up execution engineDistribute and link up engines on different serversExecute plan, provide cursor API

Page 46: Deep dive into the native multi model database ArangoDB

Life of a queryText and query parameters come from userParse text, produce abstract syntax tree (AST)Substitute query parametersFirst optimisation: constant expressions, etc.Translate AST into an execution plan (EXP)Optimise one EXP, producemany, potentially better EXPsReason about distribution in cluster

Optimise distributed EXPsEstimate costs for all EXPs, and sort by ascending costInstanciate “cheapest” plan, i.e. set up execution engineDistribute and link up engines on different serversExecute plan, provide cursor API

Page 47: Deep dive into the native multi model database ArangoDB

Life of a queryText and query parameters come from userParse text, produce abstract syntax tree (AST)Substitute query parametersFirst optimisation: constant expressions, etc.Translate AST into an execution plan (EXP)Optimise one EXP, producemany, potentially better EXPsReason about distribution in clusterOptimise distributed EXPs

Estimate costs for all EXPs, and sort by ascending costInstanciate “cheapest” plan, i.e. set up execution engineDistribute and link up engines on different serversExecute plan, provide cursor API

Page 48: Deep dive into the native multi model database ArangoDB

Life of a queryText and query parameters come from userParse text, produce abstract syntax tree (AST)Substitute query parametersFirst optimisation: constant expressions, etc.Translate AST into an execution plan (EXP)Optimise one EXP, producemany, potentially better EXPsReason about distribution in clusterOptimise distributed EXPsEstimate costs for all EXPs, and sort by ascending cost

Instanciate “cheapest” plan, i.e. set up execution engineDistribute and link up engines on different serversExecute plan, provide cursor API

Page 49: Deep dive into the native multi model database ArangoDB

Life of a queryText and query parameters come from userParse text, produce abstract syntax tree (AST)Substitute query parametersFirst optimisation: constant expressions, etc.Translate AST into an execution plan (EXP)Optimise one EXP, producemany, potentially better EXPsReason about distribution in clusterOptimise distributed EXPsEstimate costs for all EXPs, and sort by ascending costInstanciate “cheapest” plan, i.e. set up execution engine

Distribute and link up engines on different serversExecute plan, provide cursor API

Page 50: Deep dive into the native multi model database ArangoDB

Life of a queryText and query parameters come from userParse text, produce abstract syntax tree (AST)Substitute query parametersFirst optimisation: constant expressions, etc.Translate AST into an execution plan (EXP)Optimise one EXP, producemany, potentially better EXPsReason about distribution in clusterOptimise distributed EXPsEstimate costs for all EXPs, and sort by ascending costInstanciate “cheapest” plan, i.e. set up execution engineDistribute and link up engines on different servers

Execute plan, provide cursor API

Page 51: Deep dive into the native multi model database ArangoDB

Life of a queryText and query parameters come from userParse text, produce abstract syntax tree (AST)Substitute query parametersFirst optimisation: constant expressions, etc.Translate AST into an execution plan (EXP)Optimise one EXP, producemany, potentially better EXPsReason about distribution in clusterOptimise distributed EXPsEstimate costs for all EXPs, and sort by ascending costInstanciate “cheapest” plan, i.e. set up execution engineDistribute and link up engines on different serversExecute plan, provide cursor API

Page 52: Deep dive into the native multi model database ArangoDB

Execution plans

FOR a IN collA

RETURN {x: a.x, z: b.z}

EnumerateCollection a

EnumerateCollection b

Calculation xx == b.y

Filter xx == b.y

Singleton

Calculation xx

Return {x: a.x, z: b.z}

Calc {x: a.x, z: b.z}

FILTER xx == b.y

FOR b IN collB

LET xx = a.x

Query→ EXP

Black arrows aredependenciesThink of a pipelineEach node providesa cursor APIBlocks of “Items”travel through thepipelineWhat is an “item”???

Page 53: Deep dive into the native multi model database ArangoDB

Execution plans

FOR a IN collA

RETURN {x: a.x, z: b.z}

EnumerateCollection a

EnumerateCollection b

Calculation xx == b.y

Filter xx == b.y

Singleton

Calculation xx

Return {x: a.x, z: b.z}

Calc {x: a.x, z: b.z}

FILTER xx == b.y

FOR b IN collB

LET xx = a.x

Query→ EXPBlack arrows aredependencies

Think of a pipelineEach node providesa cursor APIBlocks of “Items”travel through thepipelineWhat is an “item”???

Page 54: Deep dive into the native multi model database ArangoDB

Execution plans

FOR a IN collA

RETURN {x: a.x, z: b.z}

EnumerateCollection a

EnumerateCollection b

Calculation xx == b.y

Filter xx == b.y

Singleton

Calculation xx

Return {x: a.x, z: b.z}

Calc {x: a.x, z: b.z}

FILTER xx == b.y

FOR b IN collB

LET xx = a.x

Query→ EXPBlack arrows aredependenciesThink of a pipeline

Each node providesa cursor APIBlocks of “Items”travel through thepipelineWhat is an “item”???

Page 55: Deep dive into the native multi model database ArangoDB

Execution plans

FOR a IN collA

RETURN {x: a.x, z: b.z}

EnumerateCollection a

EnumerateCollection b

Calculation xx == b.y

Filter xx == b.y

Singleton

Calculation xx

Return {x: a.x, z: b.z}

Calc {x: a.x, z: b.z}

FILTER xx == b.y

FOR b IN collB

LET xx = a.x

Query→ EXPBlack arrows aredependenciesThink of a pipelineEach node providesa cursor API

Blocks of “Items”travel through thepipelineWhat is an “item”???

Page 56: Deep dive into the native multi model database ArangoDB

Execution plans

FOR a IN collA

RETURN {x: a.x, z: b.z}

EnumerateCollection a

EnumerateCollection b

Calculation xx == b.y

Filter xx == b.y

Singleton

Calculation xx

Return {x: a.x, z: b.z}

Calc {x: a.x, z: b.z}

FILTER xx == b.y

FOR b IN collB

LET xx = a.x

Query→ EXPBlack arrows aredependenciesThink of a pipelineEach node providesa cursor APIBlocks of “Items”travel through thepipeline

What is an “item”???

Page 57: Deep dive into the native multi model database ArangoDB

Execution plans

FOR a IN collA

RETURN {x: a.x, z: b.z}

EnumerateCollection a

EnumerateCollection b

Calculation xx == b.y

Filter xx == b.y

Singleton

Calculation xx

Return {x: a.x, z: b.z}

Calc {x: a.x, z: b.z}

FILTER xx == b.y

FOR b IN collB

LET xx = a.x

Query→ EXPBlack arrows aredependenciesThink of a pipelineEach node providesa cursor APIBlocks of “Items”travel through thepipelineWhat is an “item”???

Page 58: Deep dive into the native multi model database ArangoDB

Pipeline and items

FOR a IN collA EnumerateCollection a

EnumerateCollection b

Singleton

Calculation xx

FOR b IN collB

LET xx = a.x Items have vars a, xx

Items have no vars

Items are the thingies traveling through the pipeline.

An item holds values of those variables in the current frameThus: Items look differently in different parts of the planWe always deal with blocks of items for performance reasons

Page 59: Deep dive into the native multi model database ArangoDB

Pipeline and items

FOR a IN collA EnumerateCollection a

EnumerateCollection b

Singleton

Calculation xx

FOR b IN collB

LET xx = a.x Items have vars a, xx

Items have no vars

Items are the thingies traveling through the pipeline.An item holds values of those variables in the current frame

Thus: Items look differently in different parts of the planWe always deal with blocks of items for performance reasons

Page 60: Deep dive into the native multi model database ArangoDB

Pipeline and items

FOR a IN collA EnumerateCollection a

EnumerateCollection b

Singleton

Calculation xx

FOR b IN collB

LET xx = a.x Items have vars a, xx

Items have no vars

Items are the thingies traveling through the pipeline.An item holds values of those variables in the current frameThus: Items look differently in different parts of the plan

We always deal with blocks of items for performance reasons

Page 61: Deep dive into the native multi model database ArangoDB

Pipeline and items

FOR a IN collA EnumerateCollection a

EnumerateCollection b

Singleton

Calculation xx

FOR b IN collB

LET xx = a.x Items have vars a, xx

Items have no vars

Items are the thingies traveling through the pipeline.An item holds values of those variables in the current frameThus: Items look differently in different parts of the planWe always deal with blocks of items for performance reasons

Page 62: Deep dive into the native multi model database ArangoDB

Execution plans

FOR a IN collA

RETURN {x: a.x, z: b.z}

EnumerateCollection a

EnumerateCollection b

Calculation xx == b.y

Filter xx == b.y

Singleton

Calculation xx

Return {x: a.x, z: b.z}

Calc {x: a.x, z: b.z}

FILTER xx == b.y

FOR b IN collB

LET xx = a.x

Page 63: Deep dive into the native multi model database ArangoDB

Move filters upFOR a IN collA

FOR b IN collBFILTER a.x == 10FILTER a.u == b.vRETURN {u:a.u,w:b.w}

The result and behaviour does notchange, if the first FILTER is pulledout of the inner FOR.However, the number of items trave-ling in the pipeline is decreased.Note that the two FOR statementscould be interchanged!

Singleton

EnumColl a

EnumColl b

Calc a.x == 10

Return {u:a.u,w:b.w}

Filter a.u == b.v

Calc a.u == b.v

Filter a.x == 10

Page 64: Deep dive into the native multi model database ArangoDB

Move filters upFOR a IN collA

FOR b IN collBFILTER a.x == 10FILTER a.u == b.vRETURN {u:a.u,w:b.w}

The result and behaviour does notchange, if the first FILTER is pulledout of the inner FOR.

However, the number of items trave-ling in the pipeline is decreased.Note that the two FOR statementscould be interchanged!

Singleton

EnumColl a

EnumColl b

Calc a.x == 10

Return {u:a.u,w:b.w}

Filter a.u == b.v

Calc a.u == b.v

Filter a.x == 10

Page 65: Deep dive into the native multi model database ArangoDB

Move filters upFOR a IN collA

FILTER a.x < 10FOR b IN collB

FILTER a.u == b.vRETURN {u:a.u,w:b.w}

The result and behaviour does notchange, if the first FILTER is pulledout of the inner FOR.However, the number of items trave-ling in the pipeline is decreased.

Note that the two FOR statementscould be interchanged!

Singleton

EnumColl a

Return {u:a.u,w:b.w}

Filter a.u == b.v

Calc a.u == b.v

Calc a.x == 10

EnumColl b

Filter a.x == 10

Page 66: Deep dive into the native multi model database ArangoDB

Move filters upFOR a IN collA

FILTER a.x < 10FOR b IN collB

FILTER a.u == b.vRETURN {u:a.u,w:b.w}

The result and behaviour does notchange, if the first FILTER is pulledout of the inner FOR.However, the number of items trave-ling in the pipeline is decreased.Note that the two FOR statementscould be interchanged!

Singleton

EnumColl a

Return {u:a.u,w:b.w}

Filter a.u == b.v

Calc a.u == b.v

Calc a.x == 10

EnumColl b

Filter a.x == 10

Page 67: Deep dive into the native multi model database ArangoDB

Remove unnecessary calculationsFOR a IN collA

LET L = LENGTH(a.hobbies)FOR b IN collB

FILTER a.u == b.vRETURN {h:a.hobbies,w:b.w}

The Calculation of L is unnecessary!(since it cannot throw an exception).Therefore we can just leave it out.

Singleton

EnumColl a

Calc L = ...

EnumColl b

Calc a.u == b.v

Filter a.u == b.v

Return {...}

Page 68: Deep dive into the native multi model database ArangoDB

Remove unnecessary calculationsFOR a IN collA

LET L = LENGTH(a.hobbies)FOR b IN collB

FILTER a.u == b.vRETURN {h:a.hobbies,w:b.w}

The Calculation of L is unnecessary!

(since it cannot throw an exception).Therefore we can just leave it out.

Singleton

EnumColl a

Calc L = ...

EnumColl b

Calc a.u == b.v

Filter a.u == b.v

Return {...}

Page 69: Deep dive into the native multi model database ArangoDB

Remove unnecessary calculationsFOR a IN collA

FOR b IN collBFILTER a.u == b.vRETURN {h:a.hobbies,w:b.w}

The Calculation of L is unnecessary!(since it cannot throw an exception).

Therefore we can just leave it out.

Singleton

EnumColl a

EnumColl b

Calc a.u == b.v

Filter a.u == b.v

Return {...}

Page 70: Deep dive into the native multi model database ArangoDB

Remove unnecessary calculationsFOR a IN collA

FOR b IN collBFILTER a.u == b.vRETURN {h:a.hobbies,w:b.w}

The Calculation of L is unnecessary!(since it cannot throw an exception).Therefore we can just leave it out.

Singleton

EnumColl a

EnumColl b

Calc a.u == b.v

Filter a.u == b.v

Return {...}

Page 71: Deep dive into the native multi model database ArangoDB

Use index for FILTER and SORTFOR a IN collA

FILTER a.x > 17 &&a.x <= 23 &&a.y == 10

SORT a.y, a.xRETURN a

Assume collA has a skiplist index on “y”and “x” (in this order), then we can readoff the half-open interval between{ y: 10, x: 17 } and{ y: 10, x: 23 }from the skiplist index.

The result will automatically be sorted byy and then by x.

Singleton

EnumColl a

Filter ...

Calc ...

Sort a.y, a.x

Return a

Page 72: Deep dive into the native multi model database ArangoDB

Use index for FILTER and SORTFOR a IN collA

FILTER a.x > 17 &&a.x <= 23 &&a.y == 10

SORT a.y, a.xRETURN a

Assume collA has a skiplist index on “y”and “x” (in this order),

then we can readoff the half-open interval between{ y: 10, x: 17 } and{ y: 10, x: 23 }from the skiplist index.

The result will automatically be sorted byy and then by x.

Singleton

EnumColl a

Filter ...

Calc ...

Sort a.y, a.x

Return a

Page 73: Deep dive into the native multi model database ArangoDB

Use index for FILTER and SORTFOR a IN collA

FILTER a.x > 17 &&a.x <= 23 &&a.y == 10

SORT a.y, a.xRETURN a

Assume collA has a skiplist index on “y”and “x” (in this order), then we can readoff the half-open interval between{ y: 10, x: 17 } and{ y: 10, x: 23 }from the skiplist index.

The result will automatically be sorted byy and then by x.

Singleton

Sort a.y, a.x

Return a

IndexRange a

Page 74: Deep dive into the native multi model database ArangoDB

Use index for FILTER and SORTFOR a IN collA

FILTER a.x > 17 &&a.x <= 23 &&a.y == 10

SORT a.y, a.xRETURN a

Assume collA has a skiplist index on “y”and “x” (in this order), then we can readoff the half-open interval between{ y: 10, x: 17 } and{ y: 10, x: 23 }from the skiplist index.

The result will automatically be sorted byy and then by x.

Singleton

Return a

IndexRange a

Page 75: Deep dive into the native multi model database ArangoDB

Data distribution in a cluster

Requests

DBserver DBserver DBserver

CoordinatorCoordinator

4 2 5 3 11

The shards of a collection are distributed across the DB servers.

The coordinators receive queries and organise their execution

Page 76: Deep dive into the native multi model database ArangoDB

Data distribution in a cluster

Requests

DBserver DBserver DBserver

CoordinatorCoordinator

4 2 5 3 11

The shards of a collection are distributed across the DB servers.The coordinators receive queries and organise their execution

Page 77: Deep dive into the native multi model database ArangoDB

Scatter/gather

EnumerateCollection

Page 78: Deep dive into the native multi model database ArangoDB

Scatter/gather

Remote

EnumShard

Remote Remote

EnumShard

Remote

Concat/Merge

Remote

EnumShard

Remote

Scatter

Page 79: Deep dive into the native multi model database ArangoDB

Scatter/gather

Remote

EnumShard

Remote Remote

EnumShard

Remote

Concat/Merge

Remote

EnumShard

Remote

Scatter

Page 80: Deep dive into the native multi model database ArangoDB

Linkshttps://www.arangodb.com

https://docs.arangodb.com/cookbook/index.html

https://github.com/ArangoDB/guesser

http://mesos.apache.org/

https://mesosphere.com/

https://mesosphere.github.io/marathon/

https://dcos.io