Slides for the Apache Geode hands-on meetup and hackathon announcement

Hands-on Introduction & Hackathon Kickoff Ashvin Agrawal William Markito @william_markito @aasoj Powered by Pivotal Open Source Hub (POSH) (incubating)

Upload: pivotal

Post on 04-Aug-2015


TRANSCRIPT

Hands-on Introduction & Hackathon Kickoff

Ashvin Agrawal (@aasoj), William Markito (@william_markito)

Powered by Pivotal Open Source Hub (POSH)

Apache Geode (incubating)

Agenda

• Hackathon Details
• Apache Geode Introduction
  • History
  • Key features and components
  • Roadmap
• Hands-on lab
  • Build & run
  • Starting a cluster
  • Using docker for clustering
  • Your first app
• Q&A

Hackathon details

Powered by Pivotal Open Source Hub (POSH)

http://ambitious-apps.challengepost.com/

Introduction

A distributed, memory-based data management platform for data oriented apps that need:

• high performance, scalability, resiliency and continuous availability
• fast access to critical data sets
• location aware distributed data processing
• event driven data architecture

One size fits all?

Cost of sorting is n log(n)

• Data quality and quantity differences
• Eventual consistency
• Response time expectation
• Scalability challenges: disk, memory, network and external systems

Incubating… but rock solid

• 1000+ systems in production (real customers)
• Cutting edge use cases

Timeline: 2004, 2008, 2014

• Massive increase in data volumes
• Falling margins per transaction
• Increasing cost of IT maintenance
• Need for elasticity in systems

• Financial Services Providers (every major Wall Street bank)
• Department of Defense

• Real Time response needs
• Time to market constraints
• Need for flexible data models across enterprise
• Distributed development
• Persistence + In-memory

• Global data visibility needs
• Fast Ingest needs for data
• Need to allow devices to hook into enterprise data
• Always on

• Largest travel Portal
• Airlines
• Trade clearing
• Online gambling

• Largest Telcos
• Large manufacturers
• Largest Payroll processor
• Auto insurance giants
• Largest rail systems on earth

• 17 billion records in memory
  • GE Power & Water's Remote Monitoring & Diagnostics Center
• 3 TB operational data in-memory, 400 TB archived
  • China Railways
• 4.6 Million transactions a day / 40K transactions a second
  • China Railways

Designed for High Performance

• Performance optimized persistence
• Configurable consistency
• Elastic capacity
• Latency minimizing distribution
• Heterogeneous deployment

Approximate latencies: L2 cache ~10 ns, memory ~100 ns, network <1 ms, disk ~10 ms

Concepts

• Cache
• Region
• Member
• Client Cache
• Functions
• Listeners

Concepts

• Cache
  • In-memory storage and management for your data
  • Configurable through XML, Spring, Java API or CLI
  • Collection of Regions

[Diagram: a JVM hosting a Cache that contains several Regions]

Concepts

• Region
  • Distributed java.util.Map on steroids (Key/Value)
  • Consistent API regardless of where or how data is stored
  • Observable (reactive)
  • Highly available, redundant on cache Member(s)

[Diagram: a Region inside a Cache in a JVM, shown as a java.util.Map with entries K01 → May and K02 → Tim]
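Geode's Region interface extends java.util.concurrent.ConcurrentMap, so code written against the map interface can be handed a region unchanged. The following is a minimal sketch of that idea, assuming a local ConcurrentHashMap as a stand-in for a region so that it runs without a cluster. The class and method names are hypothetical and do not come from the slides.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper: written against ConcurrentMap only, so the same
// code could be given a Geode Region instead of a local map.
class RegionAsMapSketch {

    // Stores a name under a key unless the key is already taken.
    // Returns true when this call created the entry.
    static boolean register(ConcurrentMap<String, String> region, String key, String name) {
        return region.putIfAbsent(key, name) == null;
    }

    public static void main(String[] args) {
        // Local stand-in for a region; entries mirror the K01/K02 example on the slide.
        ConcurrentMap<String, String> region = new ConcurrentHashMap<>();
        register(region, "K01", "May");
        register(region, "K02", "Tim");
        System.out.println(region.get("K01") + ", " + region.get("K02"));
    }
}
```

Keeping application code on the map interface is one way to benefit from the "consistent API regardless of where or how data is stored" point: the storage policy can change without touching callers.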

Concepts

• Region
  • Local, Replicated or Partitioned
  • In-memory or persistent
  • Redundant
  • LRU
  • Overflow

[Diagram: two JVMs, each hosting a Cache with a Region shown as a java.util.Map with entries K01 → May and K02 → Tim]

Region shortcuts:

LOCAL, LOCAL_HEAP_LRU, LOCAL_OVERFLOW, LOCAL_PERSISTENT, LOCAL_PERSISTENT_OVERFLOW

PARTITION, PARTITION_HEAP_LRU, PARTITION_OVERFLOW, PARTITION_PERSISTENT, PARTITION_PERSISTENT_OVERFLOW, PARTITION_PROXY, PARTITION_PROXY_REDUNDANT, PARTITION_REDUNDANT, PARTITION_REDUNDANT_HEAP_LRU, PARTITION_REDUNDANT_OVERFLOW, PARTITION_REDUNDANT_PERSISTENT, PARTITION_REDUNDANT_PERSISTENT_OVERFLOW

REPLICATE, REPLICATE_HEAP_LRU, REPLICATE_OVERFLOW, REPLICATE_PERSISTENT, REPLICATE_PERSISTENT_OVERFLOW, REPLICATE_PROXY
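To make the LRU and overflow options more concrete, here is a rough plain-JDK sketch of the general technique: a bounded in-memory map that pushes its least recently used entry to a second store and faults it back in on access. It is not Geode code. The class name, the entry-count limit and the HashMap standing in for a disk store are all assumptions for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-JDK illustration of LRU eviction with overflow; not Geode code.
class LruOverflowSketch {
    private final int maxInMemory;
    // Stand-in for the disk store that receives overflowed entries.
    private final Map<String, String> overflow = new HashMap<>();
    // accessOrder=true keeps the least recently used entry first.
    private final LinkedHashMap<String, String> memory;

    LruOverflowSketch(int maxInMemory) {
        this.maxInMemory = maxInMemory;
        this.memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() > LruOverflowSketch.this.maxInMemory) {
                    // Over the limit: move the eldest entry to the overflow store.
                    overflow.put(eldest.getKey(), eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    void put(String key, String value) {
        overflow.remove(key);   // the newest value supersedes any overflowed copy
        memory.put(key, value);
    }

    String get(String key) {
        String value = memory.get(key);
        if (value == null && overflow.containsKey(key)) {
            value = overflow.remove(key);   // fault the entry back into memory
            memory.put(key, value);         // may push another entry to overflow
        }
        return value;
    }

    boolean inMemory(String key) {
        return memory.containsKey(key);
    }
}
```

Callers see the same get and put behaviour whether an entry is in memory or overflowed, which is the property the *_OVERFLOW shortcuts are meant to provide.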

Concepts

• Persistent Regions
  • Durability
  • WAL (write-ahead log) for efficient writing
  • Consistent recovery
  • Compaction

[Diagram: Member 1 with two operation logs. Oplog2.crf lists Modify k1->v5, Create k6->v6, Create k2->v2 and Create k4->v4; Oplog3.crf lists Modify k4->v7, written by Put k4->v7]
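The append-only log, recovery and compaction ideas above can be sketched in a few lines. This is a simplified in-memory model rather than Geode's oplog format: the OplogSketch class and its methods are hypothetical, and deletes are left out to keep it short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of an append-only operation log; not Geode's oplog format.
class OplogSketch {
    // One logged write: the key and the value it was set to.
    static final class Op {
        final String key;
        final String value;
        Op(String key, String value) { this.key = key; this.value = value; }
    }

    private final List<Op> log = new ArrayList<>();

    // Writes only append, so no existing record is rewritten in place.
    void append(String key, String value) {
        log.add(new Op(key, value));
    }

    // Recovery: replay the log in order; later records win.
    Map<String, String> recover() {
        Map<String, String> state = new LinkedHashMap<>();
        for (Op op : log) {
            state.put(op.key, op.value);
        }
        return state;
    }

    // Compaction: keep one record per key (its latest value) and drop the rest.
    void compact() {
        Map<String, String> latest = recover();
        log.clear();
        for (Map.Entry<String, String> e : latest.entrySet()) {
            append(e.getKey(), e.getValue());
        }
    }

    int size() {
        return log.size();
    }
}
```

With the diagram's example, appending Create k4->v4 and later Modify k4->v7 leaves two records for k4. Recovery yields v7, and compaction drops the superseded v4 record without changing the recovered state.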

Concepts

• Member
  • A process that has a connection to the system
  • A process that has created a cache
  • Embeddable within your application

[Diagram: Server 1 through Server N, each a JVM hosting a Cache with a Region shown as a java.util.Map]

[Diagram labels: Client, Locator, Server]

Concepts

• Client cache
  • A process connected to the Geode server(s)
  • Can have a local copy of the data
  • Can be notified about events on the servers

[Diagram: an Application with a Client Cache connected to a GemFire Server hosting several Regions]

Concepts

• Functions
  • Used for distributed concurrent processing (Map/Reduce, stored procedure)
  • Highly available
  • Data oriented
  • Member oriented

[Diagram: a client submits function f1; functions f1, f2, … fn are executed on the servers]
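As an illustration of "data oriented" execution, the sketch below models a partitioned data set in plain Java and runs a function only on the members that own the requested keys, collecting one partial result per member. It is not the Geode API. The class name, the hash-based routing rule and the use of one HashMap per member are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Plain-JDK model of data-oriented function execution; not the Geode API.
class FunctionRoutingSketch {
    // Each element stands in for one member's share of a partitioned region.
    private final List<Map<String, Integer>> members = new ArrayList<>();

    FunctionRoutingSketch(int memberCount) {
        for (int i = 0; i < memberCount; i++) {
            members.add(new HashMap<>());
        }
    }

    // Hypothetical routing rule: the hash of the key picks the owning member.
    int ownerOf(String key) {
        return Math.floorMod(key.hashCode(), members.size());
    }

    void put(String key, int value) {
        members.get(ownerOf(key)).put(key, value);
    }

    // Runs fn only on members that hold at least one filter key, passing each
    // such member its local values for those keys, and collects one result
    // per member.
    <R> List<R> execute(Set<String> filter, Function<List<Integer>, R> fn) {
        Map<Integer, List<Integer>> localValues = new HashMap<>();
        for (String key : new TreeSet<>(filter)) {
            Integer v = members.get(ownerOf(key)).get(key);
            if (v != null) {
                localValues.computeIfAbsent(ownerOf(key), m -> new ArrayList<>()).add(v);
            }
        }
        List<R> results = new ArrayList<>();
        for (List<Integer> values : localValues.values()) {
            results.add(fn.apply(values));
        }
        return results;
    }
}
```

The caller then combines the partial results, for example by summing them, which mirrors the map and reduce split mentioned in the bullets above.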

Concepts

• Functions

[Diagram: data-aware function execution on a partitioned region within a server distributed system. A client region calls FunctionService.onRegion(...).withFilter(...).execute(...) with filter = Keys X, Y. The call passes through partitioned region data accessors to the servers acting as data stores (data stores X, Y and Z are shown), the function executes there, and the results are returned and read with ResultCollector.getResult(). The steps are numbered 1 to 6.]

Concepts

• Listeners
  • CacheWriter / CacheListener
  • AsyncEventListener (queue / batch)
  • Parallel or Serial
  • Conflation
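Conflation, as listed for queued listeners, means collapsing several queued updates to the same key into the most recent one before the batch is delivered. Here is a small plain-JDK sketch of that step. The Event type and the conflate method are hypothetical stand-ins rather than Geode classes, and placing the surviving event at the position of the newest update is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-JDK illustration of conflating a queued batch of events; not Geode code.
class ConflationSketch {
    // Minimal stand-in for a queued update event.
    static final class Event {
        final String key;
        final String value;
        Event(String key, String value) { this.key = key; this.value = value; }
    }

    // Keeps only the most recent event per key, ordered by each key's newest update.
    static List<Event> conflate(List<Event> batch) {
        Map<String, Event> latest = new LinkedHashMap<>();
        for (Event e : batch) {
            latest.remove(e.key);   // re-insert so the key moves to its newest position
            latest.put(e.key, e);
        }
        return new ArrayList<>(latest.values());
    }
}
```

A batch of k1->v1, k2->v2, k1->v5 conflates to k2->v2, k1->v5, so a slow downstream consumer handles two events instead of three.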

Hands on

Hands-on: Build & run

• Clone & Build

    git clone https://github.com/apache/incubator-geode
    cd incubator-geode
    ./gradlew build -Dskip.tests=true

• Start a server

    cd gemfire-assembly/build/install/apache-geode
    ./bin/gfsh
    gfsh> start locator --name=locator
    gfsh> start server --name=server
    gfsh> create region --name=myRegion --type=REPLICATE

Hands-on: Docker

• Containers
  • FreeBSD Jails (2000)
  • Solaris Zones (2004)
  • Docker (2013)
• Operating system level virtualization
• Isolated user space instances

* https://linuxcontainers.org/

Container vs VM

“...while the hypervisor abstracts the entire device, containers just abstract the operating system kernel”

Hands-on: Docker & Compose

• Single instance

    docker run -it apachegeode/geode:nightly gfsh

• Cluster

    docker-compose up

• Scale

    docker-compose scale server=3

Hands-on: Application

• Teeny URL
  • Fast response time
  • Statistics
    • Hits
    • User agent?
    • IPs?
  • URL will last for 5 minutes
  • Distribute data & load
  • Highly scalable

Operations: createURL, getURLstats
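A single-JVM sketch of the lab application can help fix the requirements before wiring it to a cluster. It assumes a ConcurrentHashMap in place of a region, a base-36 counter for short keys and an injectable clock so the 5-minute lifetime can be exercised in tests. createURL and getURLstats come from the slide; the resolve method and everything else are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Single-JVM sketch of the Teeny URL lab app; a ConcurrentHashMap stands in
// for the region that would hold the data in a cluster.
class TeenyUrlSketch {
    static final long TTL_MILLIS = 5 * 60 * 1000L;   // "URL will last for 5 minutes"

    private static final class Entry {
        final String longUrl;
        final long expiresAt;
        final AtomicLong hits = new AtomicLong();
        Entry(String longUrl, long expiresAt) {
            this.longUrl = longUrl;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> urls = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();
    private final LongSupplier clock;   // injectable so expiry can be tested

    TeenyUrlSketch(LongSupplier clock) {
        this.clock = clock;
    }

    // Returns a short key for the given URL (hypothetical base-36 counter scheme).
    String createURL(String longUrl) {
        String key = Long.toString(nextId.incrementAndGet(), 36);
        urls.put(key, new Entry(longUrl, clock.getAsLong() + TTL_MILLIS));
        return key;
    }

    // Resolves a short key and counts the hit; returns null if unknown or expired.
    String resolve(String key) {
        Entry e = live(key);
        if (e == null) {
            return null;
        }
        e.hits.incrementAndGet();
        return e.longUrl;
    }

    // Returns the hit count, or -1 if the key is unknown or expired.
    long getURLstats(String key) {
        Entry e = live(key);
        return e == null ? -1 : e.hits.get();
    }

    // Looks up an entry and drops it once its lifetime has passed.
    private Entry live(String key) {
        Entry e = urls.get(key);
        if (e != null && clock.getAsLong() >= e.expiresAt) {
            urls.remove(key, e);
            return null;
        }
        return e;
    }
}
```

In a Geode-backed version, the expiry would more likely be delegated to the region's entry time-to-live setting than checked by hand, and the map would be replaced by a partitioned region to distribute data and load.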

Roadmap

• HDFS Persistence
• Off-heap memory storage
• Lucene Search
• Spark Integration
• Cloud Foundry service

How to Contribute

• Code
  • New features
  • Bug fixes
  • Writing tests
• Documentation
  • Wiki
  • Web site
  • User guide
• Community
  • Join the mailing list
    • Ask or answer
  • Join our HipChat
  • Become a speaker
  • Finding bugs
  • Testing an RC/Beta

Links

• JIRA: https://issues.apache.org/jira/browse/GEODE
• Wiki: cwiki.apache.org/confluence/display/GEODE
• GitHub: https://github.com/apache/incubator-geode
• Mailing lists: mail-archives.apache.org/mod_mbox/incubator-geode-dev/

Thank you

http://geode.incubator.apache.org

https://github.com/Pivotal-Open-Source-Hub