apexmeetup geode - talk1 2016-03-17

Apache Geode (incubating), and Pivotal's leadership role in open sourcing (GemFire)

Nitin Lamba

Upload: apache-apex-organizer

Post on 23-Feb-2017

Category: Technology

TRANSCRIPT

Page 1: ApexMeetup Geode - Talk1 2016-03-17

Apache Geode (incubating), and Pivotal's leadership role in open sourcing (GemFire)

Nitin Lamba

Page 2: ApexMeetup Geode - Talk1 2016-03-17

Agenda

• Pivotal's Open Source strategy
• What is Apache Geode?
• History
• Differentiators
• Basic Concepts
• Resources
• Q & A

Page 3: ApexMeetup Geode - Talk1 2016-03-17

Page 4: ApexMeetup Geode - Talk1 2016-03-17

In 2015, Pivotal contributed the components of its Big Data Suite to open source:

• 6 million lines of code
• 4 new open source communities

Page 5: ApexMeetup Geode - Talk1 2016-03-17

[Timeline figure: four open source milestones dated May 2015, Sept 2015, Sept 2015 and Oct 2015; project names not recoverable]

Page 6: ApexMeetup Geode - Talk1 2016-03-17

From GEMFIRE to GEODE…

Page 7: ApexMeetup Geode - Talk1 2016-03-17

What is GEODE?

A distributed, memory-based data management platform for data-oriented apps that need:

• high performance, scalability, resiliency and continuous availability
• fast access to critical data sets
• location-aware distributed data processing
• event-driven data architecture

Page 8: ApexMeetup Geode - Talk1 2016-03-17

Incubating but ROCK solid…

• 1000+ systems in production (real customers)
• Cutting edge use cases

[Timeline figure, axis: <2000, 2004, 2008, 2012, 2016]

Early drivers
• Data volumes
• Margins/transactions
• IT maintenance costs
• Elasticity needs

Real-time needs
• Real-time response
• Time to market needs
• Flexible data models
• Persistent + in-memory

Global Data
• Visibility across DC
• Fast ingest
• Device to enterprise
• Uptime (always on)

Open Source!
• Apache Incubation
• GemFire > Geode
• Geode M1 release
• 1st Geode Summit

Use cases shown along the timeline: Financial Services, US DoD, Trade Clearing, Travel Portal, Online Gambling, Telcos, Manufacturing, Auto, Insurance, Payroll processing, Rail systems

Page 9: ApexMeetup Geode - Talk1 2016-03-17

…with both SCALE and SPEED, …

• 40K transactions per second
• 3TB data in-memory
• 17B records in-memory
• 120K concurrent users

Page 10: ApexMeetup Geode - Talk1 2016-03-17

… and impacting a LOT of people!

• China Railway Corporation
• Indian Railways

17% and 19%, together 36% of the world population

Page 11: ApexMeetup Geode - Talk1 2016-03-17

High-level Architecture

A Peer-2-Peer in-memory Distributed System

Powerful app development kit
• APIs: Java & REST
• Adapters: Redis, Lucene*, Spark*, …

Multiple persistence options
• Filesystem, RDBMS or HDFS*
• Sync: read-through, write-through
• Async: write-behind

Durable <K,V> cache/store
• Data replicated or partitioned
• Redundant storage in-memory/disk
• Flexible data retention policies

[Diagram: one Locator and four Servers (the last marked "+"), with a REST interface]

* Experimental and waiting community feedback
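
The sync and async persistence options on this slide differ mainly in when the backing store gets written. The following is a minimal plain-Python sketch of that difference, not Geode's API: the class `WriteBehindCache`, its method names, and the dict standing in for a filesystem or RDBMS are all assumptions made for illustration.

```python
from collections import deque


class WriteBehindCache:
    """Toy in-memory cache in front of a slower backing store (a dict here)."""

    def __init__(self, backing_store):
        self.backing_store = backing_store
        self.memory = {}
        self.pending = deque()  # writes accepted in memory but not yet persisted

    def get(self, key):
        # Read-through: on a miss, load from the backing store and keep a copy.
        if key not in self.memory and key in self.backing_store:
            self.memory[key] = self.backing_store[key]
        return self.memory.get(key)

    def put_write_through(self, key, value):
        # Sync: the backing store is updated before the call returns.
        self.memory[key] = value
        self.backing_store[key] = value

    def put_write_behind(self, key, value):
        # Async: update memory now and queue the write for a later flush.
        self.memory[key] = value
        self.pending.append((key, value))

    def flush(self):
        # Apply queued writes in arrival order; return how many were applied.
        applied = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.backing_store[key] = value
            applied += 1
        return applied
```

In this model a write-behind `put` returns without touching the store, so callers trade durability of the most recent writes for lower write latency until `flush` runs.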

Page 12: ApexMeetup Geode - Talk1 2016-03-17

What makes it go FAST?

• Minimize copying
• Minimize contention points
• Run user code in-process
• Partitioning & parallelism
• Avoid disk seeks
• Automated benchmarks

Page 13: ApexMeetup Geode - Talk1 2016-03-17

Let’s talk about a few BASIC CONCEPTS…

• Cache
• Region
• Member
• Client Cache
• Persistence
• Functions

Page 14: ApexMeetup Geode - Talk1 2016-03-17

What is a CACHE?

• In-memory storage and management for your data
• Configurable through XML, Java API or CLI
• Collection of Regions

Page 15: ApexMeetup Geode - Talk1 2016-03-17

What is a REGION?

• Distributed java.util.Map on steroids (Key/Value)
• Consistent API regardless of where or how data is stored
• Observable (reactive)
• Highly available, redundant on cache Member(s)
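
To make the "observable map" idea on this slide concrete, here is a small single-process sketch in plain Python. It is not Geode's Region API: the class `ObservableRegion`, the event names `create`, `update` and `destroy`, and the listener signature are hypothetical, and distribution and redundancy are left out entirely.

```python
class ObservableRegion:
    """Toy key/value map that notifies listeners about every change."""

    def __init__(self, name):
        self.name = name
        self._data = {}
        self._listeners = []

    def add_listener(self, callback):
        # callback(event, key, old_value, new_value) is called after each change.
        self._listeners.append(callback)

    def _notify(self, event, key, old_value, new_value):
        for listener in self._listeners:
            listener(event, key, old_value, new_value)

    def put(self, key, value):
        event = "update" if key in self._data else "create"
        old_value = self._data.get(key)
        self._data[key] = value
        self._notify(event, key, old_value, value)

    def get(self, key):
        return self._data.get(key)

    def remove(self, key):
        # Removing a missing key is a no-op and fires no event.
        if key in self._data:
            old_value = self._data.pop(key)
            self._notify("destroy", key, old_value, None)
```

A caller registers a listener once and then uses `put`, `get` and `remove` like an ordinary map; the listener sees each change without polling.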

Page 16: ApexMeetup Geode - Talk1 2016-03-17

Region: Types & Options

• Local, Replicated or Partitioned
• In-memory or persistent
• Redundant
• LRU
• Overflow

LOCAL
LOCAL_HEAP_LRU
LOCAL_OVERFLOW
LOCAL_PERSISTENT
LOCAL_PERSISTENT_OVERFLOW
PARTITION
PARTITION_HEAP_LRU
PARTITION_OVERFLOW
PARTITION_PERSISTENT
PARTITION_PERSISTENT_OVERFLOW
PARTITION_PROXY
PARTITION_PROXY_REDUNDANT
PARTITION_REDUNDANT
PARTITION_REDUNDANT_HEAP_LRU
PARTITION_REDUNDANT_OVERFLOW
PARTITION_REDUNDANT_PERSISTENT
PARTITION_REDUNDANT_PERSISTENT_OVERFLOW
REPLICATE
REPLICATE_HEAP_LRU
REPLICATE_OVERFLOW
REPLICATE_PERSISTENT
REPLICATE_PERSISTENT_OVERFLOW
REPLICATE_PROXY
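
The LRU and Overflow options named on this slide can be illustrated with a toy model. The sketch below is plain Python and does not use Geode; the class `LruOverflowRegion`, the entry-count threshold `max_in_memory`, and the dict standing in for disk are assumptions chosen to keep the example small. The slide itself does not say how the eviction threshold is defined.

```python
from collections import OrderedDict


class LruOverflowRegion:
    """Toy region: keeps the most recently used entries in memory and
    moves the least recently used ones to an overflow area."""

    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory
        self.memory = OrderedDict()  # least recently used entry comes first
        self.overflow = {}           # stands in for entries written to disk

    def put(self, key, value):
        self.overflow.pop(key, None)  # a fresh value replaces any overflowed copy
        self.memory[key] = value
        self.memory.move_to_end(key)  # mark as most recently used
        self._evict()

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)
            return self.memory[key]
        if key in self.overflow:
            # Fault the entry back into memory, possibly evicting another one.
            value = self.overflow.pop(key)
            self.memory[key] = value
            self._evict()
            return value
        return None

    def _evict(self):
        # Move least recently used entries to overflow until under the limit.
        while len(self.memory) > self.max_in_memory:
            old_key, old_value = self.memory.popitem(last=False)
            self.overflow[old_key] = old_value
```

Because evicted entries are moved rather than dropped, a later `get` still finds them; a destroy-on-evict policy would instead discard them in `_evict`.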

Page 17: ApexMeetup Geode - Talk1 2016-03-17

Persistent Regions

• Durability
• WAL (write-ahead log) for efficient writing
• Consistent recovery
• Compaction

[Diagram: Server 1 … Server N]

Page 18: ApexMeetup Geode - Talk1 2016-03-17

What is a MEMBER?

• A process that has a connection to the system
• A process that has created a cache
• Embeddable within your application

[Diagram: Client, Locator, Server]

Page 19: ApexMeetup Geode - Talk1 2016-03-17

What is a CLIENT CACHE?

• A process connected to the Geode server(s)
• Can have a local copy of the data
• Run OQL queries on local data
• Can be notified about events on the servers
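
Two of the points on this slide, the local copy and the server event notifications, work together: notifications are what keep the local copy fresh. Here is a minimal in-process sketch of that interaction in plain Python. The classes `Server` and `ClientCache` and the counter `server_fetches` are hypothetical names for illustration and do not correspond to Geode's client API; OQL queries are not modeled.

```python
class Server:
    """Toy server holding the authoritative data."""

    def __init__(self):
        self.data = {}
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def put(self, key, value):
        self.data[key] = value
        for notify in self.subscribers:
            notify(key, value)


class ClientCache:
    """Toy client that keeps a local copy and refreshes it from server events."""

    def __init__(self, server):
        self.server = server
        self.local = {}
        self.server_fetches = 0  # counts round trips to the server
        server.subscribe(self._on_server_event)

    def get(self, key):
        if key in self.local:
            return self.local[key]  # served locally, no round trip
        if key not in self.server.data:
            return None
        self.server_fetches += 1
        self.local[key] = self.server.data[key]
        return self.local[key]

    def _on_server_event(self, key, value):
        # Only refresh keys this client already holds locally.
        if key in self.local:
            self.local[key] = value
```

After the first `get`, repeated reads of the same key stay local, and a later server-side update reaches the client through the subscription rather than through another fetch.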

Page 20: ApexMeetup Geode - Talk1 2016-03-17

Persistence - Shared Nothing

[Diagram: Server 1, Server 2, Server 3]

Page 21: ApexMeetup Geode - Talk1 2016-03-17

Persistence - Shared Nothing

[Diagram: buckets B1, B2 and B3, each with a Primary and a Secondary copy, spread across Server 1, Server 2 and Server 3]

Page 22: ApexMeetup Geode - Talk1 2016-03-17

Persistence - Shared Nothing

[Diagram: buckets B1, B2 and B3, each with a Primary and a Secondary copy, spread across Server 1, Server 2 and Server 3]

Page 23: ApexMeetup Geode - Talk1 2016-03-17

Persistence - Shared Nothing

[Diagram: buckets B1, B2 and B3, each with a Primary and a Secondary copy, spread across Server 1, Server 2 and Server 3]

Page 24: ApexMeetup Geode - Talk1 2016-03-17

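Persistence - Shared Nothing

[Diagram: buckets B1, B2 and B3, each with a Primary and a Secondary copy, spread across Server 1, Server 2 and Server 3; additional B3 and B2 labels appear in this step]

Server 1 waits for others when it starts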

Page 25: ApexMeetup Geode - Talk1 2016-03-17

Persistence - Shared Nothing

[Diagram: buckets B1, B2 and B3, each with a Primary and a Secondary copy, spread across Server 1, Server 2 and Server 3]

Fetches missed operations on restart
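
The bucket layout in these shared-nothing slides can be approximated with a short sketch. This is plain Python, not Geode code, and the placement rule is an assumption: the slides only show buckets B1 to B3 with primary and secondary copies across three servers, so the round-robin assignment, the CRC32 key hashing, and the function names below are illustrative choices.

```python
import zlib


def bucket_for(key, num_buckets):
    # Deterministic hash so a key always maps to the same bucket.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def assign_buckets(servers, num_buckets):
    """Return {bucket: (primary, secondary)} with the two copies on
    different servers. Requires at least two servers."""
    if len(servers) < 2:
        raise ValueError("need at least two servers for a secondary copy")
    placement = {}
    for bucket in range(num_buckets):
        primary = servers[bucket % len(servers)]
        secondary = servers[(bucket + 1) % len(servers)]
        placement[bucket] = (primary, secondary)
    return placement


def surviving_copy(placement, bucket, failed_server):
    # If the primary's server is down, the secondary can serve the bucket.
    primary, secondary = placement[bucket]
    return secondary if primary == failed_server else primary
```

With three servers and three buckets, every server in this model holds one primary and one secondary, so losing any single server leaves a copy of every bucket available.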

Page 26: ApexMeetup Geode - Talk1 2016-03-17

Persistence - Operational Logs

[Diagram: Member 1 receives "Put k6->v6" and appends it to the operation log; log files Oplog1.crf and Oplog2.crf]

Log entries shown, in order:
• Create k1->v1
• Create k2->v2
• Modify k1->v3
• Create k4->v4
• Modify k1->v5
• Create k6->v6

Page 27: ApexMeetup Geode - Talk1 2016-03-17

Persistence - Operational Logs: Compaction

[Diagram: Member 1 receives "Put k6->v6" and appends it to the operation log; log files Oplog1.crf and Oplog2.crf; compaction copies live data forward]

Log entries shown, in order:
• Create k1->v1
• Create k2->v2
• Modify k1->v3
• Create k4->v4
• Modify k1->v5
• Create k6->v6
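
The append-only log and the "copy live data forward" step can be walked through with the entries from the slide. The sketch below is a plain-Python model under stated assumptions, not Geode's on-disk format: the class `OplogStore`, the rollover limit of three entries per log, and the split of the six entries into two logs are all hypothetical. Destroy operations are not modeled because the slide does not show any.

```python
class OplogStore:
    """Toy append-only operation log with rollover and compaction."""

    def __init__(self, max_entries_per_log):
        self.max_entries_per_log = max_entries_per_log
        self.logs = [[]]  # oldest log first; each entry is (op, key, value)

    def append(self, op, key, value):
        # Never modify existing entries; roll to a new log when the current one is full.
        if len(self.logs[-1]) >= self.max_entries_per_log:
            self.logs.append([])
        self.logs[-1].append((op, key, value))

    def recover(self):
        # Replay all logs in order; the last entry for each key wins.
        state = {}
        for log in self.logs:
            for _op, key, value in log:
                state[key] = value
        return state

    def compact_oldest(self):
        """Copy still-live entries of the oldest log forward, then drop that log.
        Returns the number of entries copied."""
        if len(self.logs) < 2:
            return 0
        oldest = self.logs.pop(0)
        superseded = {key for log in self.logs for _op, key, _value in log}
        live = {}
        for _op, key, value in oldest:
            if key not in superseded:
                live[key] = value  # later entries in the same log overwrite earlier ones
        for key, value in live.items():
            self.append("Create", key, value)
        return len(live)
```

With the slide's six operations and three entries per log, only `k2->v2` in the first log is still live, since `k1` is modified again later. Compaction therefore copies a single entry forward, and the recovered state is unchanged.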

Page 28: ApexMeetup Geode - Talk1 2016-03-17

Functions

• Used for distributed concurrent processing (Map/Reduce, stored procedure)
• Highly available
• Data oriented
• Member oriented
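
A data-oriented function in the Map/Reduce sense sends the computation to where the data lives and then combines the partial results. The following is a minimal single-process sketch in plain Python, with threads standing in for servers. The function `execute_on_region` and its parameters are hypothetical and are not Geode's function execution API.

```python
from concurrent.futures import ThreadPoolExecutor


def execute_on_region(partitions, function, reducer, filter_keys=None):
    """Run `function` on each partition concurrently and combine the results.

    partitions:  list of dicts, one per (simulated) server
    function:    called once per targeted partition, returns a partial result
    reducer:     called with the list of partial results
    filter_keys: if given, only partitions holding one of these keys are targeted
    """
    if filter_keys is not None:
        targets = [p for p in partitions if any(k in p for k in filter_keys)]
    else:
        targets = list(partitions)
    if not targets:
        return reducer([])
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        partial_results = list(pool.map(function, targets))
    return reducer(partial_results)
```

For example, summing all values calls `function` once per partition and `reducer` once overall, while passing `filter_keys` routes the call only to the partitions that hold the named keys.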

Page 29: ApexMeetup Geode - Talk1 2016-03-17

Functions

Page 30: ApexMeetup Geode - Talk1 2016-03-17

Join the Community!

• Check out: http://geode.incubator.apache.org
• Subscribe: [email protected]
• Download: http://geode.incubator.apache.org/releases/

Page 31: ApexMeetup Geode - Talk1 2016-03-17

Thank you!

Page 32: ApexMeetup Geode - Talk1 2016-03-17

Additional Slides

Page 33: ApexMeetup Geode - Talk1 2016-03-17

Built for PERFORMANCE…

[Chart: YCSB Workloads, Cassandra vs. Geode. Y axis: operations per second, 0 to 1,000,000. X axis: A Reads, A Updates, B Reads, B Updates, C Reads, D Inserts, D Reads, F Reads, F Updates. Bar values not recoverable.]

Page 34: ApexMeetup Geode - Talk1 2016-03-17

…and horizontal, consistent SCALABILITY!

Horizontal scaling for reads, consistent latency and CPU

[Chart: speedup, latency (ms) and CPU % versus number of server hosts (2, 4, 6, 8, 10). Speedup axis: 0 to 6.25. Second axis: 0 to 18. Plotted values not recoverable.]

• Scaled from 256 clients and 2 servers to 1280 clients and 10 servers
• Partitioned region with redundancy and 1K data size

Page 35: ApexMeetup Geode - Talk1 2016-03-17

High Availability
