deconstructing the database - qcon san francisco 2018 · deconstructing the database ... magnitude,...

Page 1: Deconstructing the Database - QCon San Francisco 2018 · Deconstructing the Database ... magnitude, quantity, number... or immutable composite thereof Identity ... •Pluggable comparators

Deconstructing the Database

Rich Hickey

Most programs are outside the bounds of any single process we write in any single FP language

What is Datomic?

• A new database

• A sound model of information, with time

• Provides database as a value to applications

• Bring declarative programming to applications

• Focus on reducing complexity

Not going to do a full rationalization or overview

Focus on information important for Datomic, not necessarily for all uses of dbs/stores

DB Complexity

• Stateful

• Same query, different results

• no basis

• Over there

• ‘Update’ poorly defined

• Places

Inherent complexity in state

Update

• What does update mean?

• Does the new replace the old?

• Granularity? new ___ replace the old ___

• Visibility?

Granularity and replacement the big issues

Manifestations

• Wrong programs

• Scaling problems

• Round-trip fears

• Fear of overloading server

• Coupling, e.g. questions with reporting

read committed vs repeatable, serializable

Consistency and Scale

• What’s possible?

• Distributed redundancy and consistency?

• Elasticity

• Inconsistency huge source of complexity

Dynamo and BigTable

Information and Time

• Old-school memory and records

• The kind you remember

... and keep

• Auditing and more

Perception and Reaction

• No polling

• Consistent

Coming to Terms

Value

• An immutable magnitude, quantity, number... or immutable composite thereof

Identity

• A putative entity we associate with a series of causally related values (states) over time

State

• Value of an identity at a moment in time

Time

• Relative before/after ordering of causal values

Epochal Time Model

[Figure: v1 → F → v2 → F → v3 → F → v4. Labels: process events (pure functions); states (immutable values); identity (succession of states); observers/perception/memory.]
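The epochal time model can be sketched in a few lines of Python. This is only an illustration: the `Identity` class, its `advance` method, and the account example are hypothetical names chosen here, not anything from Datomic.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

State = Tuple[str, int]  # any immutable value would do


@dataclass
class Identity:
    """Hypothetical holder for a succession of immutable states."""
    states: List[State] = field(default_factory=list)

    def current(self) -> State:
        # Observers read the latest state; the value itself never changes.
        return self.states[-1]

    def advance(self, f: Callable[[State], State]) -> State:
        # A process event: apply a pure function to the current state
        # and record the result as the next state.
        new_state = f(self.current())
        self.states.append(new_state)
        return new_state


account = Identity(states=[("balance", 100)])
v1 = account.current()
v2 = account.advance(lambda s: (s[0], s[1] + 50))
```

An observer still holding `v1` keeps seeing the old state; only the identity moves on to `v2`.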

Implementing Values

• Persistent data structures

• Trees

• Structural sharing

Structural Sharing

[Figure labels: Past, Next]
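A minimal sketch of structural sharing, assuming a plain (unbalanced) binary search tree with path copying. The `Node` type and `insert` function are illustrative names; real persistent indexes use wider, balanced trees.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node: Optional[Node], key: int) -> Node:
    """Return a new tree containing key; the input tree is never modified."""
    if node is None:
        return Node(key)
    if key < node.key:
        # Copy only the nodes on the path to the insertion point.
        return node._replace(left=insert(node.left, key))
    if key > node.key:
        return node._replace(right=insert(node.right, key))
    return node  # key already present: reuse the tree as-is


past = None
for k in (50, 30, 70, 20, 40):
    past = insert(past, k)

# "Next" is a new root; everything off the changed path is shared with "Past".
next_ = insert(past, 80)
```

Here `next_.left` is the very same object as `past.left`, while `past` itself still has no key 80.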

Place Model

[Figure labels: The Database Place; DB Connection; Transactions; Queries; process events (pure functions) F F F; identity (succession of states); observers/perception/memory]

We should recognize - same problems as OO

Conflates identity and value, collapses time

Epochal Time Model

[Figure: v1 → F → v2 → F → v3 → F → v4. Labels: DB Connection; Transactions; DB Values; Queries; process events (pure functions); states (immutable values); identity (succession of states); observers/perception/memory.]

This is what we want - transactions are functions of db values

queries as well

Traditional Database

[Figure labels: App Process (App, ORM?, Caching policy?); Server (Indexing, Transactions, Query, I/O, cache); Disk; Strings, DDL + DML; Result Sets; Serialized; ???]

What makes up a database?

Cache over DB access, disk locality was important

The Choices

• Coordination

• how much, and where?

• process requires it

• perception shouldn’t

• Immutability

• sine qua non

coordinate up front or later

immutability advantages with eventually consistent systems

Approach

• Move to information model

• Split process and perception

• Immutable basis in storage

• Novelty in memory

Information

• Inform

• ‘to convey knowledge via facts’

• ‘give shape to (the mind)’

• Information

• the facts

Facts

• Fact - ‘an event or thing known to have happened or existed’

• From: factum - ‘something done’

• Must include time

• Remove structure (a la RDF)

• Atomic Datom

• Entity/Attribute/Value/Transaction(time)

don’t get more flexibility by trading tables for documents

factum - “something done”
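As a rough illustration of the Entity/Attribute/Value/Transaction shape, a datom can be modelled as an immutable tuple. The `Datom` class, the field names, and the transaction numbers below are assumptions made for this sketch, not Datomic's own types.

```python
from typing import Any, NamedTuple


class Datom(NamedTuple):
    e: Any    # entity
    a: str    # attribute
    v: Any    # value
    tx: int   # transaction, which stands in for time


# Facts have no enclosing row or document structure; each stands alone.
facts = [
    Datom("fred", "likes", "Pizza", 1000),
    Datom("sally", "likes", "Ice cream", 1000),
    Datom("fred", "likes", "Sushi", 1001),
]

# Because every fact carries its transaction, "when" is always answerable.
fred_likes = [(d.v, d.tx) for d in facts if d.e == "fred" and d.a == "likes"]
```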

Database State

• The database as an expanding value

• An accretion of facts

• The past doesn’t change - immutable

• Process requires new space

• Fundamental move away from places

What is the state - snapshot of a referential model?

Accretion

• Root per transaction doesn’t work

• Latest values include past as well

• The past is sub-range

• Important for information model

can we just do shared structure from before on disk?

Process

• Reified

• Primitive representation of novelty

• Assertions and retractions of facts

• Minimal

• Other transformations expand into those

Important to accretion that novelty representation is minimal
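A small sketch of process as accretion, assuming a db value is just a tuple of facts in transaction order and that each fact carries an added/retracted flag. The names `transact`, `current_facts`, and `replace_value` are hypothetical; the point is that a higher-level "update" expands into primitive retractions and assertions, and that a transaction is a function from one db value to the next.

```python
from typing import Iterable, List, Set, Tuple

Fact = Tuple[str, str, str, int, bool]  # (entity, attribute, value, tx, added)
Op = Tuple[str, str, str, str]          # ("add" | "retract", entity, attribute, value)
Db = Tuple[Fact, ...]


def transact(db: Db, tx: int, ops: Iterable[Op]) -> Db:
    """Return a new, larger db value; the input value is left untouched."""
    novelty = tuple((e, a, v, tx, op == "add") for op, e, a, v in ops)
    return db + novelty


def current_facts(db: Db) -> Set[Tuple[str, str, str]]:
    """Fold assertions and retractions, in order, into the facts true now."""
    live: Set[Tuple[str, str, str]] = set()
    for e, a, v, _tx, added in db:
        if added:
            live.add((e, a, v))
        else:
            live.discard((e, a, v))
    return live


def replace_value(db: Db, e: str, a: str, new_v: str) -> List[Op]:
    """Expand an 'update' into primitive retractions plus one assertion."""
    ops: List[Op] = [("retract", fe, fa, fv)
                     for fe, fa, fv in sorted(current_facts(db))
                     if fe == e and fa == a]
    ops.append(("add", e, a, new_v))
    return ops


db0: Db = ()
db1 = transact(db0, 1, [("add", "fred", "likes", "Pizza")])
db2 = transact(db1, 2, replace_value(db1, "fred", "likes", "Sushi"))
```

`db2` contains everything in `db1` plus the new retraction and assertion, so the past remains a prefix of the present.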

Deconstruction

• Process

• Transactions

• Indexing

• O

• Perception/Reaction

• Query

• Indexes

• I

[Figure labels: Server (Indexing, Transactions, Query, I/O); Disk]

State

• Must be organized to support query

• Sorted set of facts

• Maintaining sort live in storage - bad

• BigTable - mem + storage merge

• occasional merge into storage

• persistent trees

Databases are about leverage

Indexing

• Maintaining sort live in storage - bad

• BigTable et al:

• Accumulate novelty in memory

• Current view: mem + storage merge

• Occasionally integrate mem into storage

Releases memory
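The indexing scheme above can be sketched as follows, assuming datoms are sortable tuples. The `Index` class and its `add`, `view`, and `integrate` methods are illustrative names only; real storage is a tree of segments rather than one flat tuple.

```python
import heapq
from bisect import insort
from typing import List, Tuple

Datom = Tuple[str, str, str, int]  # (entity, attribute, value, tx)


class Index:
    def __init__(self) -> None:
        self.storage: Tuple[Datom, ...] = ()  # immutable, sorted, "on disk"
        self.novelty: List[Datom] = []        # sorted, in memory

    def add(self, datom: Datom) -> None:
        # New facts only ever touch the in-memory part.
        insort(self.novelty, datom)

    def view(self) -> List[Datom]:
        # Current view: merge of the two sorted sequences.
        return list(heapq.merge(self.storage, self.novelty))

    def integrate(self) -> None:
        # Occasional step: write a new sorted storage value, release memory.
        self.storage = tuple(heapq.merge(self.storage, self.novelty))
        self.novelty = []


idx = Index()
for d in [("sally", "likes", "Ice cream", 1000), ("fred", "likes", "Pizza", 1000)]:
    idx.add(d)
idx.integrate()
idx.add(("ethel", "likes", "Pizza", 1001))
before = idx.view()
idx.integrate()
after = idx.view()
```

Readers see the same sorted view before and after integration; only where the data lives has changed.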

Transactions and Indexing

[Figure labels: Transactions; Index Merging; Novelty; Live Index; Storage; Log Data Segments; Index Data Segments]

Perception

[Figure labels: Live Index; Novelty; Storage; Index Data Segments]

Just a merge join. Any coordination required? Contention?

How does live index get updated?

Live vs periodic now separate decision

Components

• Transactor

• Peers

• Your app servers, analytics machines etc

• Redundant storage service

Physical architecture

Datomic Architecture

[Figure labels: App Server Process (App; Peer Lib: Query, Cache, Live Index, Comm); Transactor (Indexing, Transactions); standby; Storage Service (redundant segment storage, Data Segments); memcached cluster (optional)]

But, storage now remote, slow? No - cache everywhere

Transactor

• Accepts transactions

• Expands, applies, logs, broadcasts

• Periodic indexing, in background

• Indexing creates garbage

• Storage GC

Peer Servers

• Peers directly access storage service

• Have own query engine

• Have live mem index and merging

• Two-tier cache

• Datoms w/object values (on heap)

• Segments (memcached)

Consistency and Scale

• Process/writes go through transactor

• traditional server scaling/availability

• Immutability supports consistent reads

• without transactions

• Query scales with peers

• Elastic/dynamic e.g. auto-scaling

consistency as in db satisfies consistency predicate

loopholes 7 and 10?

Memory Index

• Persistent sorted set

• Large internal nodes

• Pluggable comparators

• 2 sorts always maintained

• EAVT, AEVT

• plus AVET, VAET
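To illustrate how the same facts serve different questions under different sort orders, here is a sketch that treats each index as the datom list sorted with its own key. The `ORDERS` table, `build_index`, and the sample data are assumptions for illustration; a real implementation would use a persistent sorted set with a comparator per index.

```python
from operator import itemgetter

# (entity, attribute, value, tx)
datoms = [
    (2, "likes", "Ice cream", 1000),
    (1, "likes", "Pizza", 1000),
    (1, "name", "Fred", 1000),
    (2, "name", "Sally", 1000),
]

# One "comparator" per index, expressed as a sort key over tuple positions.
ORDERS = {
    "eavt": itemgetter(0, 1, 2, 3),
    "aevt": itemgetter(1, 0, 2, 3),
    "avet": itemgetter(1, 2, 0, 3),
    "vaet": itemgetter(2, 1, 0, 3),
}


def build_index(facts, order):
    """Return the facts sorted in the named order."""
    return sorted(facts, key=ORDERS[order])


eavt = build_index(datoms, "eavt")  # everything about one entity is adjacent
aevt = build_index(datoms, "aevt")  # everything for one attribute is adjacent
avet = build_index(datoms, "avet")  # values of one attribute appear in order
```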

Storage

• Log of tx asserts/retracts (in tree)

• Various covering indexes (trees)

• Storage service/server requirements

• Data segment values (K->V)

• atoms (consistent read)

• pods (conditional put)
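The storage requirements listed above are small: immutable key->value segments, plus a few mutable references that support a consistent read and a conditional put. A hedged in-memory sketch follows; the `Storage` class, its method names, and the use of a SHA-256 hash as the segment key are all assumptions of this example, not the actual storage protocol.

```python
import hashlib
from typing import Dict, Optional


class Storage:
    def __init__(self) -> None:
        self._segments: Dict[str, bytes] = {}  # immutable values
        self._refs: Dict[str, str] = {}        # the only mutable cells

    def put_segment(self, data: bytes) -> str:
        # Assumption: key derived from content, so a key never changes meaning.
        key = hashlib.sha256(data).hexdigest()
        self._segments.setdefault(key, data)
        return key

    def get_segment(self, key: str) -> bytes:
        return self._segments[key]

    def read_ref(self, name: str) -> Optional[str]:
        return self._refs.get(name)

    def cas_ref(self, name: str, expected: Optional[str], new: str) -> bool:
        # Conditional put: swap the pointer only if nobody else moved it.
        if self._refs.get(name) != expected:
            return False
        self._refs[name] = new
        return True


store = Storage()
k1 = store.put_segment(b"index root v1")
first_swap = store.cas_ref("index-root", None, k1)
k2 = store.put_segment(b"index root v2")
stale_swap = store.cas_ref("index-root", None, k2)   # stale expectation
second_swap = store.cas_ref("index-root", k1, k2)
```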

Structural Sharing

[Figure labels: Past, Next]

What’s in a DB Value?

[Figure labels: Identity; Value; db atom; db value; Roots; Hierarchical Cache; Memory index (live window); Storage-backed index; indexes EAVT, AEVT, AVET, VAET; Lucene index; live Lucene; history; index; live; Storage; nextT; asOfT; sinceT]

Value can be lazily loaded, since source immutable

Index in Storage

[Figure labels: Identity; Value; Index ref; Index Root of key->dir; T42; indexes EAVT, AEVT, AVET, VAET, Lucene; dirs; segs; Sorted Datoms]

Datomic on Riak + ZooKeeper

• Riak

redundant, distributed, highly available

durable

eventually consistent

• ZooKeeper

redundant, durable,

consistent (ordered ops + CAS)

Datomic on Riak + ZooKeeper

[Figure labels: Riak; ZooKeeper; Index ref; Log ref; Catalog ref; Coord ref; Identities; Values]

pointer swap mentioned by Eric Brewer this morning

Riak Usage

• Everything put into Riak is immutable

• N=3, W=2, DW=2

• R=1, not-found-ok = false

‘first found’ semantics

• There or not

no vector clocks, siblings etc

• No speculative lookup

What notion of consistency?

Application-level predicate
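A toy model of the ‘first found’ read, assuming each replica is a plain dictionary. The `first_found` function and the replica contents are hypothetical and do not use any Riak client API; the sketch only shows why immutability makes a single found copy sufficient.

```python
from typing import Dict, List, Optional

Replica = Dict[str, bytes]


def first_found(replicas: List[Replica], key: str) -> Optional[bytes]:
    """Return the value from the first replica that has the key."""
    for replica in replicas:
        if key in replica:
            # Values are never overwritten, so any copy that exists is correct.
            return replica[key]
    return None  # no replica had it: the key is simply not there


# N=3 replicas; the write has reached only two of them so far (W=2).
replicas: List[Replica] = [{}, {"seg-1": b"datoms"}, {"seg-1": b"datoms"}]
found = first_found(replicas, "seg-1")
missing = first_found(replicas, "seg-2")
```

A replica that has not yet received the write does not count as an answer, which is the role not-found-ok = false plays on the slide; there are no conflicting versions to reconcile.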

Full Datomic Stack

[Figure labels: Peer App Server Process (App; Peer Lib: Query, Cache, Live Index, Comm); Peer REST Server / Peer Service (REST Server, Query, Cache, Live Index, Comm); Client App Process, any language (App, REST Client); HTTP + Server-Sent Events; Transactor (Indexing, Transactions); Standby; Storage Server/Service (redundant segment storage, Data Segments); memcached cluster (optional)]

Stable Bases

• Same query, same results

• db permalinks!

• communicable, recoverable

• Multiple conversations about same value

// Peer
Database db = connection.db().asOf(1000);
Peer.q(aQuery, db);

// Client
GET /data/mem/test/1000/datoms?index=aevt

basis

Value of values

DB Values

• Time travel

• db.asOf - past

• db.since - windowed

• db.with(tx) - speculative

• dbs are arguments to query, not implicit

• mock with datom-shaped data:

[[:fred :likes "Pizza"] [:sally :likes "Ice cream"]]
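The idea that the db is an explicit argument can be sketched with plain lists of tuples. The helpers `as_of`, `since`, `with_`, and `who_likes` are simplified stand-ins written for this example (they ignore retractions and are not the Datomic API), and the transaction numbers are made up.

```python
from typing import Iterable, List, Tuple

Datom = Tuple[str, str, str, int]  # (entity, attribute, value, tx)


def as_of(db: Iterable[Datom], t: int) -> List[Datom]:
    """The past: only facts recorded at or before t."""
    return [d for d in db if d[3] <= t]


def since(db: Iterable[Datom], t: int) -> List[Datom]:
    """A window: only facts recorded after t."""
    return [d for d in db if d[3] > t]


def with_(db: Iterable[Datom], tx_data: Iterable[Datom]) -> List[Datom]:
    """Speculative: a new value as if tx_data had been transacted."""
    return list(db) + list(tx_data)


def who_likes(db, food: str) -> List[str]:
    # The query takes the db as an argument; any datom-shaped data will do.
    return sorted(e for e, a, v, *_ in db if a == "likes" and v == food)


db = [
    ("fred", "likes", "Pizza", 1000),
    ("sally", "likes", "Ice cream", 1000),
    ("ethel", "likes", "Pizza", 1001),
]
speculative = with_(db, [("sally", "likes", "Pizza", 1002)])
mock = [("fred", "likes", "Pizza"), ("sally", "likes", "Ice cream")]
```

The same `who_likes` query runs unchanged against the past, a window, a speculative value, or the mock list, because nothing about the database is implicit.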

DB Simplicity Benefits

• Epochal state

• Coordination only for process

• Transactions well defined

• Functional accretion

• Freedom to relocate/scale storage, query

• Extensive caching

• Process events

The Database as a Value

• Dramatically less complex

• More powerful

• More scalable

• Better information model

Thanks for Listening!