Hybrid Memory Aerospike @ PayPal · Aerospike User Summit 2018

Saibabu Devabhaktuni, Sr. Director of Database, Systems, and Storage, PayPal; Athreya Gopalakrishna, Sr. MTS Engineer, Database Engineering, PayPal


Post on 20-Oct-2020


TRANSCRIPT

  • AEROSPIKE USER SUMMIT 2018

    Hybrid Memory Aerospike @ PayPal

    Saibabu Devabhaktuni

    Sr. Director of Database, Systems, and Storage, PayPal

    Athreya Gopalakrishna

    Sr. MTS Engineer, Database Engineering, PayPal

  • PayPal Fraud Detection System

    • Analytical system

    • Built on relational and KV system

    • Requires Low Latency and High throughput

    • 1-200KB avg. object sizes

    • 1-2 ms latency at the 99th percentile

    • Millions of transactions/sec

    • Trillions of keys

    • 100s of Terabytes of Storage

    ©2018 PayPal Inc. Confidential and proprietary.

  • Happy with the fast machine

    Fly Speed = 1300+mph

    Cost = US$200M

    Passengers = 100/flight

    (Analogy: In-Memory NoSQL DB)

  • STORAGE GROWTH TREND – FRAUD SYSTEMS

    [Chart: Data Growth (TB), In-Memory NoSQL. 2011: 0.5, 2012: 3, 2013: 12, 2014: 18]

    : )

  • KEY SPACE GROWTH TREND

    [Chart: Keys (Billions), In-Memory NoSQL. 2011: 0.1, 2012: 4, 2013: 16, 2014: 40]

    Fly Speed = 1300+mph

    Cost = US$200M

    Passengers = 100/flight

    : )

  • Growth Estimates

    [Chart: Keys (Billions), 2011 to 2018. Data labels as extracted, in order: 0.1, 4, 16, 40, 20, 240, 1200, 2400]

    : (

  • In search of a new machine (NoSQL DB)

    Fly Speed = 650 mph

    Cost = US$400M

    Passengers = 850/flight

  • Back to the drawing board (in-memory vs. memory-first vs. hybrid-memory)

  • Read Path – Latency and Consistency

    In-Memory

    • Read path: Client to Database (Memory)

    • Latency – Low

    • Throughput – Consistent

    Memory-First

    • Read path with cache hit: Client to Database (Memory)

    • Read path with cache miss: Client to Database (Memory), then Database (Disk)

    • Latency – Low and High

    • Throughput – Inconsistent

    Hybrid-Memory

    • Read path: Client to Database (Memory) and Database (Disk)

    • Latency – Low

    • Throughput – Consistent
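
The contrast between these read paths can be made concrete with a small model. The sketch below is illustrative only: the latency figures and hit ratio are made-up assumptions, not numbers from the talk, and `read_latency_profile` is a hypothetical helper. It treats a cache-fronted read as a two-point distribution (memory on a hit, disk on a miss) and reports the mean and 99th-percentile latency.

```python
def read_latency_profile(hit_ratio, memory_us, disk_us):
    """Return (mean, p99) latency in microseconds for a cache-fronted read path.

    Every read costs memory_us on a cache hit and disk_us on a miss.
    All inputs are illustrative assumptions, not measurements.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    mean = hit_ratio * memory_us + (1.0 - hit_ratio) * disk_us
    # With only two possible latencies, the 99th percentile is the fast one
    # only when at least 99% of reads are hits.
    p99 = memory_us if hit_ratio >= 0.99 else disk_us
    return mean, p99


# Hypothetical numbers: 100 us from memory, 1000 us from disk.
in_memory = read_latency_profile(1.0, 100, 1000)      # every read is a hit
memory_first = read_latency_profile(0.95, 100, 1000)  # 5% of reads miss
print("in-memory   :", in_memory)
print("memory-first:", memory_first)
```

Under these assumptions the mean moves only modestly, but the tail jumps to disk latency as soon as more than 1% of reads miss, which matches the "low and high" latency the slide attributes to memory-first designs.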

  • E.g., designing a 50TB A/A (Active/Active) database

    DC1, RF=2: clusters C1 (50TB) and C2 (50TB)

    DC2, RF=2: clusters C3 (50TB) and C4 (50TB)

    DC3, RF=2: clusters C5 (50TB) and C6 (50TB)

    X-DC replication between DC1, DC2, and DC3 (A/A)

  • In-Memory Database for 50TB (Predictable performance)

    Servers ~1024

    Racks ~18

    Price ~$12M

    DC1, DC2, DC3

  • Memory-First Database for 50TB, in-memory (Predictable performance)

    Servers ~1024

    Racks ~18

    Price ~$15M

    DC1, DC2, DC3

    [Diagram labels: 3.2TB, 3.2TB]

  • Memory-First Database for 50TB, memory first + disk (Unpredictable performance)

    Servers ~120

    Racks ~3

    Price ~$1.8M

    DC1, DC2, DC3

    [Diagram labels: 3.2TB, 3.2TB]

  • Hybrid Memory Database for 50TB (Predictable performance)

    Servers ~120

    Racks ~3

    Price ~$1.8M

    DC1, DC2, DC3

    [Diagram labels: 3.2TB, 3.2TB]

  • Aerospike as NoSQL Database

    • Written in C

    • Simple KV database

    • Distributed shared nothing architecture

    • Operates In-Memory or Hybrid-Memory Modes

    • Low write amplification

    • SSD optimized for consistent performance

    • High storage density

    • Low CPU utilization

    • UDF for server side computations

  • Aerospike Architecture

    • Distributed K/V

    • Shared nothing

    • Auto-Failover

    • Auto-Rebalancing

    • Cross DC Replication

    • UDF

    [Diagram: cluster of nodes storing Key, Value records]

    Key Differentiators in NoSQL space

    • Designed from the ground up for SSDs (achieves even wear and tear on the device)

    • Proprietary file system (achieves consistent device latency; follows device throughput)

    • Hybrid storage with predictable capacity (enables huge storage on SSD)

  • Hybrid Memory

    System configuration

    • 40 cores

    • 1.92TB x 1 SATA RI

    Load

    • 250M, 1KB Value Size

    • Raw Data = 250GB

    • Max Write = 100K TPS

    System Efficiency

    • CPU = 5%

    • Mem = 15GB

    • Disk = 268GB

    [Gauge labels as extracted: Used = 15GB, 5%; Used = 268GB, 5%; Util 100%; Used = 384GB]
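
The memory and raw-data figures on this slide can be roughly reproduced with back-of-the-envelope arithmetic. The sketch below assumes that "Load 250M" means 250 million records and that each record costs 64 bytes of primary index in memory, which is the size Aerospike documents for a primary index entry; neither assumption is stated on the slide. Per-record storage overhead on disk is not modelled, so the 268GB disk figure is not derived here.

```python
PRIMARY_INDEX_BYTES_PER_RECORD = 64  # assumption: documented Aerospike index entry size
GB = 10**9  # decimal gigabytes


def hybrid_memory_footprint(records, value_bytes):
    """Return (index_memory_gb, raw_data_gb) for a hybrid-memory namespace.

    Only the primary index is counted as memory; values are assumed
    to live on SSD.
    """
    index_memory_gb = records * PRIMARY_INDEX_BYTES_PER_RECORD / GB
    raw_data_gb = records * value_bytes / GB
    return index_memory_gb, raw_data_gb


# Assumed reading of the slide: 250M records with 1 KB (1000-byte) values.
index_gb, raw_gb = hybrid_memory_footprint(250_000_000, 1000)
print(f"index memory ~ {index_gb:.0f} GB, raw data ~ {raw_gb:.0f} GB")
```

Sixteen decimal gigabytes is about 14.9 GiB, which is in line with the slide's "Mem = 15GB", and the raw data comes out at the slide's 250GB.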

  • In-Memory NoSQL (50TB) vs. Aerospike (Hybrid Memory NoSQL) (50TB)

    Before: In-Memory NoSQL (50TB)

    • 18 Racks

    • # of servers = 1024

    • Total Cost = $12.5m

    • 99.5 ATB

    • Inconsistent Performance

    • Unstable

    • Average Throughput – 200K TPS

    • Low latency (~1ms)

    After: Aerospike (Hybrid Memory NoSQL) (50TB)

    • 3 Racks

    • # of servers = 120

    • Total Cost = $3.5m

    • 99.99+ ATB

    • Consistent Performance

    • Stable

    • Average Throughput – 1M TPS

    • Ultra low latency (~200us)

    Improvement: Performance Cost 3x, Space/Power 8x, Availability 10x
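
For reference, the raw ratios behind this before/after comparison can be computed directly from the figures on the slide. This is a minimal sketch: the dictionary keys are made-up names, and the code does not attempt to reproduce how the speakers arrived at their rounded 3x, 8x, and 10x labels.

```python
# Figures as stated on the slide; key names are hypothetical.
before = {"servers": 1024, "racks": 18, "cost_musd": 12.5, "avg_tps": 200_000}
after = {"servers": 120, "racks": 3, "cost_musd": 3.5, "avg_tps": 1_000_000}


def reduction(metric):
    """How many times smaller the 'after' value is than the 'before' value."""
    return before[metric] / after[metric]


server_ratio = reduction("servers")
rack_ratio = reduction("racks")
cost_ratio = reduction("cost_musd")
# Throughput should go up, so the ratio is taken the other way round.
throughput_gain = after["avg_tps"] / before["avg_tps"]

print(f"servers: {server_ratio:.2f}x fewer")
print(f"racks: {rack_ratio:.2f}x fewer")
print(f"cost: {cost_ratio:.2f}x lower")
print(f"throughput: {throughput_gain:.2f}x higher")
```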

  • Fraud Detection systems with Aerospike

    [Chart: Data Growth (TB). 2011: 0.5, 2012: 3, 2013: 12, 2014: 18, 2015: 30, 2016: 90, 2017: 210, 2018: 450. Series labels: In-Memory NoSQL; Hybrid-Memory with Aerospike]

  • MISSION POSSIBLE

    Concorde + Airbus 380

    AEROSPIKE USER SUMMIT | Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc

  • Operations@Scale

  • Cluster Size

    Data Unavailable = { m (m-1) } / { n (n - 1) }

    m = Total nodes unavailable.

    n = Total nodes in the cluster.

    • MAX = 20 Nodes

    • RF=2

    • Mode=AP

    • 256GB RAM, 6.4TB SSD/Node
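
The unavailability formula on this slide is easy to evaluate for the stated 20-node maximum. The sketch below is a direct transcription of the formula; the function name is made up, and the reading that it applies to RF=2 with data spread evenly across nodes is an assumption based on the bullets above.

```python
def data_unavailable_fraction(m, n):
    """Fraction of data unavailable when m of n nodes are down.

    Implements m(m - 1) / (n(n - 1)) from the slide.
    """
    if n < 2:
        raise ValueError("the cluster needs at least 2 nodes")
    if not 0 <= m <= n:
        raise ValueError("m must be between 0 and n")
    return (m * (m - 1)) / (n * (n - 1))


# Evaluate the formula for the 20-node cluster size quoted on the slide.
for down in (1, 2, 3):
    fraction = data_unavailable_fraction(down, 20)
    print(f"{down} node(s) down: {fraction:.4%} of data unavailable")
```

With one node down the formula gives zero, since the second copy is still reachable. With two nodes down in a 20-node cluster it gives 2/380, or roughly half a percent, and the fraction grows as the cluster shrinks.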

  • Deployment Topologies (Active/Active or Active/Standby)

    [Diagrams: two topologies, each spanning DC1, DC2, and DC3]

  • Monitoring at Scale

  • Database Lifecycle Automation

    1000+ servers, 48 clusters, 3 Datacenters

    Ansible Automation API(s) – Programmatic or Human interface

    - prepare_new_node

    - wipe_out_server

    - create_cluster

    - reconfigure_database

    - add_node

    - change_password

    - reset_cluster_name

    - apply_os_patch_rolling

    - backup

    - restore

    - prepare_tools_node

    - remove_node

    - validate_cluster

    - switch_paxos_protocol

    - turn_off_clear_port

    - apply_os_patch_single_node
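
One way a programmatic interface over operations like these could look is a thin wrapper that maps an operation name to an `ansible-playbook` invocation. The sketch below is purely hypothetical: the slides do not show PayPal's implementation, and the one-playbook-per-operation file naming, the inventory path, and the `cluster_name` variable are all invented for illustration. Only the `-i` (inventory) and `-e` (extra vars) options are standard `ansible-playbook` flags.

```python
# Operation names taken from the slide; everything else is an assumption.
OPERATIONS = frozenset({
    "prepare_new_node", "wipe_out_server", "create_cluster",
    "reconfigure_database", "add_node", "change_password",
    "reset_cluster_name", "apply_os_patch_rolling", "backup", "restore",
    "prepare_tools_node", "remove_node", "validate_cluster",
    "switch_paxos_protocol", "turn_off_clear_port",
    "apply_os_patch_single_node",
})


def build_playbook_command(operation, inventory, extra_vars=None):
    """Build the argument list for running one lifecycle operation.

    Assumes a hypothetical layout with one playbook file per operation,
    named "<operation>.yml". The command is returned, not executed.
    """
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    command = ["ansible-playbook", "-i", inventory, f"{operation}.yml"]
    # Sort the variables so the generated command is deterministic.
    for key, value in sorted((extra_vars or {}).items()):
        command += ["-e", f"{key}={value}"]
    return command


example = build_playbook_command(
    "add_node", "inventory/dc1.ini", {"cluster_name": "fraud01"}
)
print(" ".join(example))
```

Returning the argument list instead of running it keeps the sketch side-effect free; a caller could pass the list to `subprocess.run` or log it for a human operator to review.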

  • Database Reporting

    Capacity Report

    XDC/Latency Report

    Inventory Report

  • Security

  • Next

    Cloud

    • 1 Click cluster provisioning

    • 3-5 Node cluster

    • Dev/QA

    • NVMf

    • DB on Containers

    • Strong consistency evaluation, possibly larger clusters

  • Thank You

  • Q & A