© Copyright 2015 Pivotal. All rights reserved.
Open Sourcing GemFire
An Introduction to Apache Geode (incubating)
The open source distributed in-memory database
Anthony [email protected] @metatype
Topics
GemFire history, architecture, and use cases
Geode, the open source core of GemFire
Example application and demo
Technical deep dive
Our GemFire Journey Over The Years
2004 2008 2014
• Massive increase in data volumes
• Falling margins per transaction
• Increasing cost of IT maintenance
• Need for elasticity in systems
• Financial Services Providers (every major Wall Street bank)
• Department of Defense
• Real-time response needs
• Time to market constraints
• Need for flexible data models across enterprise
• Distributed development
• Persistence + in-memory
• Global data visibility needs
• Fast ingest needs for data
• Need to allow devices to hook into enterprise data
• Always on
• Largest travel portal
• Airlines
• Trade clearing
• Online gambling
• Largest telcos
• Large manufacturers
• Largest payroll processor
• Auto insurance giants
• Largest rail systems on earth
Hybrid Transactional/Analytics grids
Apps at Scale Have Unique Needs
Pivotal GemFire is the distributed, in-memory database for apps that need:
1. Elastic scale-out performance
2. High performance database capabilities in distributed systems
3. Mission critical availability and resiliency
4. Flexibility for developers to create unique applications
5. Easy administration of distributed data grids
1. Elastic Scale-Out Performance
• Linear scalability (ops/sec vs. nodes)
• Elastic capacity +/-
• Latency-minimizing data distribution
China Railway Corporation
“The system is operating with solid performance and uptime. Now, we have a reliable, economically sound production system that supports record volumes and has room to grow”
Dr. Jiansheng Zhu, Vice Director of China Academy of Railway Sciences
• 4.5 million ticket purchases & 20 million users per day.
• Spikes of 15,000 tickets sold per minute, 40,000 visits per second.
2. High Performance Database Capabilities
Performance-optimized persistence
Configurable consistency Partitioned Replicated Disabled
Distributed & continuous queries
Distributed transactions, indexing, archival
Newedge
“Our global deployment of Pivotal GemFire’s distributed cache gives me a single version of the trade – resolving hard-to-test-for synchronization issues that exist within any globally distributed business application architecture”
Michael Benillouche, Global Head of Data Management
3. Mission Critical Availability and Resiliency
Cluster to cluster WAN connectivity
Cluster resilience & failover
Gire
“We can track and collect money at our 4,000+ kiosks and branches – even without a reliable Internet connection. GemFire provides the core data grid and a significant amount of related functionality to help us handle this unreliable network problem”
Gustavo Valdez, Chief of Architecture and Development
• 19 million payment transactions per month
• 4000+ points of sale with intermittent Internet connectivity
4. Flexibility for Developers to Create Unique Apps
• Data structures:
– User-defined classes
– Documents (JSON)
• Native language support:
– Java, C++, C#
• APIs:
– Java: HashMap
– Memcached
– Spring Data GemFire
– C++ serialization APIs
• Embedded query authoring:
– Object Query Language (OQL)
• Publish & subscribe framework for continuous query & reliable asynchronous event queues
• App-server embedded functionality:
– Web app session state caching
– L2 Hibernate
5. Easy Administration of Distributed Data Grids
• Auto-tuning of distributed computing resources to optimize performance
• Cluster monitoring dashboard
– Cluster and node status & performance
• Offline performance statistics analysis tool
– View historical logs and events to diagnose performance and resource bottlenecks
• Command-line tool for taking action on clusters and nodes
What makes it fast?
Design center is RAM not HDD
Really demanding customers
Avoid and minimize, particularly on the critical read/write paths:
– Network hops, copying data, contention, distributed locking, disk seeks, garbage
Lots and lots of testing
– Establish and monitor performance baselines
– Distributed systems testing is difficult!
Horizontal Scaling for GemFire (Geode) Reads With Consistent Latency and CPU
• Scaled from 256 clients and 2 servers to 1280 clients and 10 servers
• Partitioned region with redundancy and 1K data size
[Chart: speedup, latency (ms), and CPU % plotted against the number of server hosts (2, 4, 6, 8, 10)]
GemFire (Geode) 3.5-4.5X Faster Than Cassandra for YCSB
Roadmap
HDFS persistence
Off-heap storage
Lucene indexes
Spark integration
Cloud Foundry service
…and other ideas from the Geode community!
Use cases and patterns
“Low touch” Usage Patterns
HTTP Session management
– Simple template for TCServer, TC, App servers
– Shared nothing persistence, global session state
Hibernate L2 Cache plugin
– Set Cache in hibernate.cfg.xml
– Support for query and entity caching
Memcached protocol
– Servers understand the memcached wire protocol
– Use any memcached client
Spring Cache Abstraction
– <bean id="cacheManager" class="org.springframework.data.gemfire.support.GemfireCacheManager"
Application Patterns
Caching for speed and scale
– Read-through, write-through, write-behind
Geode as the OLTP system of record
– Data in-memory for low latency, on disk for durability
Parallel compute engine
Real-time analytics
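The read-through pattern named above can be sketched in plain Java. This is a minimal illustration, not Geode code: the class name ReadThroughCache and the loader function are hypothetical stand-ins for a region and its backing store (in Geode the corresponding hook is a CacheLoader).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical read-through cache; a teaching sketch, not a Geode class. */
final class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;                 // stands in for the backing store
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, consulting the backing store only on a miss. */
    V get(K key) {
        return entries.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    /** Number of times the backing store was consulted. */
    int loadCount() {
        return loads.get();
    }
}
```

A second get() for the same key is served from memory, so the load count stays at one. Write-through and write-behind apply the same idea to the write path, synchronously or via a queue.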
Development Patterns
Configure the cluster programmatically or declaratively using cache.xml, gfsh, or Spring Data GemFire
<cache>
  <region name="TurbineSensorData" refid="PARTITION_PERSISTENT">
    <region-attributes>
      <partition-attributes redundant-copies="1" total-num-buckets="43"/>
    </region-attributes>
  </region>
</cache>
Development Patterns
Write key-value data into a Region using { create | put | putAll | remove }
– The value can be flat data, nested objects, JSON, …
Region<SensorKey, TurbineReading> sensorData = cache.getRegion("TurbineSensorData");
SensorKey key = new SensorKey(31415926, "2013-05-19T19:22Z"); // turbineId, timestamp
sensorData.put(key, new TurbineReading()
    .setAmbientTemp(75)
    .setOperatingTemp(80)
    .setWindDirection(0)
    .setWindSpeed(30)
    .setPowerOutput(5000)
    .setRPM(5));
Development Patterns
Read values from a Region by key using { get | getAll }
TurbineReading data = sensorData.get(new SensorKey(31415926,"2013-05-19T19:22Z"));
Query values using OQL
// finds all sensor readings for the given turbine
SELECT * from /TurbineSensorData.entrySet WHERE key.turbineId = 31415926
// finds all sensor readings where the operating temp exceeds a threshold
SELECT * from /TurbineSensorData.entrySet WHERE value.operatingTemp > 120
Apply indexes to optimize queries
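To show why an index helps the first query, here is a small plain-Java model. It is only a sketch under assumptions: SensorRegionSketch and its methods are hypothetical, the region is a HashMap, and the index is a second map from turbineId to keys. Geode's own index structures are not shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical model of a region plus an index on key.turbineId. */
final class SensorRegionSketch {
    static final class SensorKey {
        final int turbineId;
        final String timestamp;

        SensorKey(int turbineId, String timestamp) {
            this.turbineId = turbineId;
            this.timestamp = timestamp;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof SensorKey)) return false;
            SensorKey other = (SensorKey) o;
            return turbineId == other.turbineId && timestamp.equals(other.timestamp);
        }

        @Override public int hashCode() {
            return Objects.hash(turbineId, timestamp);
        }
    }

    private final Map<SensorKey, Integer> operatingTemps = new HashMap<>();
    private final Map<Integer, List<SensorKey>> turbineIndex = new HashMap<>();

    /** Stores a reading and records new keys in the index. */
    void put(SensorKey key, int operatingTemp) {
        if (operatingTemps.put(key, operatingTemp) == null) {
            turbineIndex.computeIfAbsent(key.turbineId, id -> new ArrayList<>()).add(key);
        }
    }

    /** Unindexed query: examines every entry in the region. */
    List<SensorKey> scanByTurbine(int turbineId) {
        List<SensorKey> result = new ArrayList<>();
        for (SensorKey key : operatingTemps.keySet()) {
            if (key.turbineId == turbineId) result.add(key);
        }
        return result;
    }

    /** Indexed query: goes straight to the matching keys. */
    List<SensorKey> lookupByTurbine(int turbineId) {
        return turbineIndex.getOrDefault(turbineId, Collections.emptyList());
    }
}
```

Both methods return the same keys; the scan's cost grows with the size of the region, while the lookup's cost grows with the number of matches.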
Development Patterns
Execute functions to operate on local data in parallel
Respond to updates using CacheListeners and Events
Automatic redundancy, partitioning, distribution, consistency, network partition detection & recovery, load balancing, …
Apache Geode (incubating)
The open source core of GemFire
Why OSS? Why Now? Why Apache?
Open Source Software is fundamentally changing buying patterns
– Developers have to endorse product selection (no longer a CIO handshake)
– Community endorsement is key to product visibility
– Open source credentials attract the best developers
– Vendor credibility directly tied to street credibility of product
Align with the tides of history
– Customers increasingly asking to participate in product development
– Allow product development to happen with full transparency
Apache is where you go to build Open Source street cred
– Transparent meritocracy which puts developers in charge
– Roman keeps shouting “Apache!” every few hours
Geode Will Be A Significant Apache Project
Over 1,000 person-years invested in cutting-edge R&D
Thousands of production customers in very demanding verticals
Cutting edge use cases that have shaped product thinking
Tens of thousands of distributed, scaled-up tests that can randomize every aspect of the product
A core technology team that has stayed together since founding
Performance differentiators that are baked into every aspect of the product
Geode or GemFire?
Geode is a project, GemFire is a product
We donated everything but the kitchen sink*
More code drops imminent; going forward all development happens OSS-style (“The Apache Way”)
* Multi-site WAN replication, continuous queries, and native (C/C++) client driver
Geode Demo
Social Network
People Region (partitioned)
Person
Name: String
Description: String
Post Region (partitioned)
Post
Id: PostId(name, date)
Text: String
Partition put
[Diagram: the client puts "LOL"; #(LOL) = 1 selects Bucket 1. Servers 1 to 3 host two copies each of Bucket 1 and Bucket 2.]
Partition put
[Diagram: "LOL" is stored in one copy of Bucket 1 and replicated to the secondary copy on another server.]
Replicate to secondary
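The two steps of a partitioned put (pick a bucket from the key, then write the primary and replicate to the secondary) can be modeled as follows. This is a sketch under assumptions: PartitionedPutSketch is hypothetical, and hashing with String.hashCode() modulo the bucket count is an illustrative choice, not necessarily how Geode assigns buckets.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a partitioned region with one redundant copy. */
final class PartitionedPutSketch {
    private final int numBuckets;
    private final List<Map<String, String>> primaries = new ArrayList<>();
    private final List<Map<String, String>> secondaries = new ArrayList<>();

    PartitionedPutSketch(int numBuckets) {
        this.numBuckets = numBuckets;
        for (int i = 0; i < numBuckets; i++) {
            primaries.add(new HashMap<>());
            secondaries.add(new HashMap<>());
        }
    }

    /** Maps a key to a bucket; floorMod keeps the result non-negative. */
    int bucketFor(String key) {
        return Math.floorMod(key.hashCode(), numBuckets);
    }

    /** Writes the primary copy, then replicates to the secondary copy. */
    int put(String key, String value) {
        int bucket = bucketFor(key);
        primaries.get(bucket).put(key, value);
        secondaries.get(bucket).put(key, value);
        return bucket;
    }

    /** Reads the redundant copy, as a reader would after a primary failure. */
    String readFromSecondary(String key) {
        return secondaries.get(bucketFor(key)).get(key);
    }
}
```

Because every member can compute bucketFor(key) on its own, a client that knows the bucket layout can send the put directly to the server hosting the primary.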
“User” Use Case – Save Objects
public interface PersonRepository extends CrudRepository<Person, String> {}
@Autowired
PersonRepository people;
@Autowired
PostRepository posts;
// an instance method, so the injected repositories are in scope
public void savePost(String name, Date date, String text) {
  people.save(new Person(name));
  posts.save(new Post(new PostId(name, date), text));
}
Nested objects, compound keys
Automatically serialized with PDX
Configuration
<gfe:partitioned-region id="people" copies="1"/>
<bean id="pdxSerializer" class="com.gemstone.gemfire.pdx.ReflectionBasedAutoSerializer">
  <constructor-arg value="io.pivotal.happysocial.model.*"/>
</bean>
<gfe:cache pdx-serializer-ref="pdxSerializer"/>
Data Analyst – Determine Sentiment
Find all of the posts for a user
Analyze their content
First try – Just use a Query
public interface PostRepository extends GemfireRepository<Post, PostId> {
  @Query("select * from /posts where id.person=$1")
  public Collection<Post> findPosts(String personName);
}
Collection<Post> posts = postRepository.findPosts(personName);
String sentiment = sentimentAnalyzer.analyze(posts);
Query Nested Objects
Use an Index
<gfe:index id="postAuthor" expression="id.person" from="/posts"/>
Still could be more efficient
[Diagram: the client's query fans out to Servers 1 to 3, because posts by Joe, EJ, Maya, and Jess are scattered across them.]
• Hitting multiple nodes
• Bringing too much data to the client
Colocate the data
[Diagram: with colocation, each person's posts are grouped together on one of Servers 1 to 3.]
<gfe:partitioned-region id="posts" copies="1" colocated-with="people">
  <gfe:partition-resolver ref="partitionResolver"/>
</gfe:partitioned-region>
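The job of the partition resolver in this configuration is to make a post route by its author rather than by its whole compound key. A minimal sketch of that idea, assuming hash-mod bucket selection; ColocationSketch, routingObject, and bucketFor are hypothetical names, and the real hook is the getRoutingObject method of Geode's PartitionResolver interface.

```java
import java.util.Objects;

/** Hypothetical model of key routing for colocated regions. */
final class ColocationSketch {
    /** Compound key of a post: author plus date. */
    static final class PostId {
        final String person;
        final long date;

        PostId(String person, long date) {
            this.person = Objects.requireNonNull(person);
            this.date = date;
        }
    }

    /** A person entry routes by its own key, the name. */
    static Object routingObject(String personName) {
        return personName;
    }

    /** A post routes by its author only, ignoring the date. */
    static Object routingObject(PostId id) {
        return id.person;
    }

    /** Same bucket function for both regions, as colocation requires. */
    static int bucketFor(Object routingObject, int numBuckets) {
        return Math.floorMod(routingObject.hashCode(), numBuckets);
    }
}
```

Since both regions share the bucket function and the routing object is the person's name in both cases, every post by "Joe" lands in the same bucket as the "Joe" person entry, whatever its date.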
Send behavior to data
[Diagram: the client invokes the function getSentiment on Joe and Jess; it executes on the server holding Joe's posts and on the server holding Jess's posts.]
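Sending behavior to data can be modeled in a few lines: the function runs against each filtered person's local posts, and only the small result travels back. This is a single-process sketch under assumptions; FunctionRoutingSketch and its methods are hypothetical, and real function execution runs on the servers in parallel.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical model of data-aware function routing. */
final class FunctionRoutingSketch {
    // One map per server: person -> that person's colocated posts.
    private final List<Map<String, List<String>>> servers = new ArrayList<>();

    FunctionRoutingSketch(int serverCount) {
        for (int i = 0; i < serverCount; i++) {
            servers.add(new HashMap<>());
        }
    }

    private int ownerOf(String person) {
        return Math.floorMod(person.hashCode(), servers.size());
    }

    void addPost(String person, String text) {
        servers.get(ownerOf(person))
               .computeIfAbsent(person, p -> new ArrayList<>())
               .add(text);
    }

    /** Applies fn to each filtered person's local posts; returns only the results. */
    <R> Map<String, R> execute(Set<String> filter, Function<List<String>, R> fn) {
        Map<String, R> results = new HashMap<>();
        for (String person : filter) {
            List<String> localPosts =
                servers.get(ownerOf(person)).getOrDefault(person, new ArrayList<>());
            results.put(person, fn.apply(localPosts));
        }
        return results;
    }
}
```

Servers that own none of the filtered keys are never touched, and the posts themselves never leave the server that stores them.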
Sample Function – Client Side
@Component
@OnRegion(region = "posts")
public interface FunctionClient {
  public List<SentimentResult> getSentiment(@Filter Set<String> people);
}
Sample Function – Server Side
@GemfireFunction(HA=true)
public SentimentResult getSentiment(Region<PostId, Post> localPosts, @Filter Set<String> personNames) throws Exception {
  String personName = personNames.iterator().next();
  // query only the posts stored locally on this server
  Collection<Post> posts = localPosts.query("id.person='" + personName + "'");
  String sentiment = sentimentAnalyzer.analyze(posts);
  return new SentimentResult(sentiment, personName);
}
Demo
Design
• Persistence
• PDX Serialization
Persistence – Design Goals
High availability
Low latency, high throughput
Durability
Minimal impact to in-memory speed
– Minimize IO operations
– Minimize locking
– Minimize deserialization
Persistence – Shared Nothing
[Diagram: Buckets 1 to 3 spread across Servers 1, 2, and 3, with one primary and one secondary copy per bucket on different servers. Server 1 holds Bucket 1 (primary) and Bucket 2 (secondary).]
Persistence – Shared Nothing
[Diagram: the same bucket layout across Servers 1, 2, and 3.]
What if a server crashes?
Persistence – Shared Nothing
[Diagram: Server 1, holding Bucket 1 (primary) and Bucket 2 (secondary), has crashed.]
• Primary failover: the surviving secondary of Bucket 1 becomes primary
• Restore redundancy: new secondary copies of Bucket 1 and Bucket 2 are created on the surviving servers
Persistence – Shared Nothing
[Diagram: the bucket layout after failover and redundancy recovery.]
What if they all crash?
Persistence – Shared Nothing
[Diagram: on restart, Server 1 comes back with its older copies of Bucket 1 and Bucket 2, while Servers 2 and 3 hold the copies created after failover.]
Compare persistent view on restart
Persistence – Shared Nothing
[Diagram: members recover the latest copy of each bucket, and buckets are rebalanced across Servers 1 to 3.]
Recover latest data & rebalance
Persistence - Operation Logs
Member 1 receives: Put k6->v6
Oplog1.crf: Create k1->v1, Create k2->v2, Modify k1->v3, Create k4->v4
Oplog2.crf: Modify k1->v5, Create k6->v6
Append to operation log
Persistence - Compaction
Member 1
Oplog1.crf: Create k1->v1, Modify k1->v3
Oplog2.crf: Modify k1->v5, Create k6->v6
Copy live data forward: Create k2->v2, Create k4->v4
Persistence - Compaction
Member 1 receives: Put k4->v7
Oplog2.crf: Modify k1->v5, Create k6->v6, Create k2->v2, Create k4->v4
Oplog3.crf: Modify k4->v7
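The append-then-compact cycle from the last three slides can be modeled in memory. This is a sketch under assumptions: OplogSketch and LogRecord are hypothetical names, each oplog is a list rather than a file, and a record counts as live when it is the most recent write for its key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of operation logs and compaction. */
final class OplogSketch {
    static final class LogRecord {
        final String key;
        final String value;

        LogRecord(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<LogRecord>> oplogs = new ArrayList<>();  // oldest first
    private final Map<String, LogRecord> live = new HashMap<>();     // latest record per key

    OplogSketch() {
        roll();
    }

    /** Starts a new, empty oplog that receives subsequent appends. */
    void roll() {
        oplogs.add(new ArrayList<>());
    }

    /** Every create or modify is appended; nothing is updated in place. */
    void put(String key, String value) {
        LogRecord record = new LogRecord(key, value);
        oplogs.get(oplogs.size() - 1).add(record);
        live.put(key, record);
    }

    /** Copies still-live records from the oldest oplog forward, then drops it. */
    void compactOldest() {
        if (oplogs.size() < 2) return;   // never compact the active oplog
        List<LogRecord> oldest = oplogs.remove(0);
        List<LogRecord> active = oplogs.get(oplogs.size() - 1);
        for (LogRecord record : oldest) {
            if (live.get(record.key) == record) active.add(record);
        }
    }

    String get(String key) {
        LogRecord record = live.get(key);
        return record == null ? null : record.value;
    }

    int oplogCount() {
        return oplogs.size();
    }

    int recordCount() {
        int total = 0;
        for (List<LogRecord> oplog : oplogs) total += oplog.size();
        return total;
    }
}
```

Replaying the slide's sequence (four writes, a roll, two more writes, then compactOldest()) leaves one oplog holding k1->v5, k6->v6, k2->v2, and k4->v4, matching the Oplog2.crf contents shown above.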
Design
• Persistence
• PDX Serialization
Fixed or flexible schema?
id | name | age | pet_id
OR
{ id : 1, name : "Fred", age : 42, pet : { name : "Barney", type : "dino" } }
But how to serialize data?
http://blog.pivotal.io/pivotal/products/data-serialization-how-to-run-multiple-big-data-apps-at-once-with-gemfire
Portable Data eXchange
C#, C++, Java, JSON
| header | data |
| pdx | length | dsid | typeId | fields | offsets |
Efficient for queries
SELECT p.name from /Person p WHERE p.pet.type = 'dino'
{ id : 1, name : "Fred", age : 42, pet : { name : "Barney", type : "dino" } }
single field deserialization
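Single field deserialization depends on an offset table: a reader can jump to one field without decoding the rest. The sketch below illustrates the idea with a made-up layout; OffsetEncodingSketch is hypothetical, all fields are strings, and the offsets sit before the field data for simplicity. It is not the PDX wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical offset-table encoding; illustrates the idea, not PDX itself. */
final class OffsetEncodingSketch {
    /**
     * Layout: [fieldCount:int][offset_0 .. offset_n:int][field bytes...].
     * offset_i is where field i starts; offset_n marks the end of the data.
     */
    static byte[] encode(String... fields) {
        byte[][] encoded = new byte[fields.length][];
        int dataLength = 0;
        for (int i = 0; i < fields.length; i++) {
            encoded[i] = fields[i].getBytes(StandardCharsets.UTF_8);
            dataLength += encoded[i].length;
        }
        int headerLength = 4 + 4 * (fields.length + 1);
        ByteBuffer buf = ByteBuffer.allocate(headerLength + dataLength);
        buf.putInt(fields.length);
        int offset = headerLength;
        for (byte[] field : encoded) {
            buf.putInt(offset);
            offset += field.length;
        }
        buf.putInt(offset);   // end marker
        for (byte[] field : encoded) {
            buf.put(field);
        }
        return buf.array();
    }

    /** Decodes only the requested field by reading two offsets. */
    static String readField(byte[] record, int index) {
        ByteBuffer buf = ByteBuffer.wrap(record);
        int start = buf.getInt(4 + 4 * index);
        int end = buf.getInt(4 + 4 * (index + 1));
        return new String(record, start, end - start, StandardCharsets.UTF_8);
    }
}
```

A query such as the one above needs only the field named in the WHERE clause, so the other fields of each candidate object can stay in serialized form.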
Easy to use
Access from Java, C#, C++, JSON
Domain objects not required
Automatic type definition
– No IDL compiler or schema required
– No hand-coded read/write methods
Distributed type registry
[Diagram: Members A and B share distributed type definitions.]
Person p1 = …
region.put("Fred", p1);
• automatic definition
• replicate serialized data containing typeId
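A type registry of this kind can be sketched as a two-way mapping between a type definition and a small integer id. The names here (TypeRegistrySketch, typeIdFor, definitionOf) are hypothetical, and the sketch lives in one process; in the slides the definitions are distributed across members.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical single-process model of a type registry. */
final class TypeRegistrySketch {
    private final Map<List<String>, Integer> idsByDefinition = new HashMap<>();
    private final List<List<String>> definitionsById = new ArrayList<>();

    /** Returns the existing typeId for this definition, or registers a new one. */
    synchronized int typeIdFor(String className, String... fieldNames) {
        List<String> definition = new ArrayList<>();
        definition.add(className);
        definition.addAll(Arrays.asList(fieldNames));
        Integer existing = idsByDefinition.get(definition);
        if (existing != null) {
            return existing;
        }
        int id = definitionsById.size();
        definitionsById.add(definition);
        idsByDefinition.put(definition, id);
        return id;
    }

    /** A receiving member resolves the typeId carried in the serialized data. */
    synchronized List<String> definitionOf(int typeId) {
        return definitionsById.get(typeId);
    }
}
```

The first put of a Person defines the type once; later puts carry only the integer id, which keeps each serialized value compact.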
Schema evolution
PDX provides forwards and backwards compatibility, no code required
[Diagram: Application #1 and Application #2, using versions v1 and v2 of a class, connect to Members A and B, which share distributed type definitions.]
• v2 objects preserve data from missing fields
• v1 objects use default values to fill in new fields
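Both compatibility rules can be modeled with plain field maps. This is a sketch under assumptions: PersonV1 and PersonV2 are hypothetical classes, the serialized form is a Map of field names to values, and the hand-written fromFields/toFields methods exist only to make the mechanics visible (the slide states that PDX needs no such code).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of reading and writing across class versions. */
final class SchemaEvolutionSketch {
    /** Version 1 knows only "name" but carries unknown fields along. */
    static final class PersonV1 {
        String name;
        final Map<String, Object> unreadFields = new HashMap<>();

        static PersonV1 fromFields(Map<String, Object> fields) {
            PersonV1 person = new PersonV1();
            Map<String, Object> rest = new HashMap<>(fields);
            person.name = (String) rest.remove("name");
            person.unreadFields.putAll(rest);   // preserved, not interpreted
            return person;
        }

        Map<String, Object> toFields() {
            Map<String, Object> fields = new HashMap<>(unreadFields);
            fields.put("name", name);
            return fields;
        }
    }

    /** Version 2 adds "age"; a missing value falls back to the default 0. */
    static final class PersonV2 {
        String name;
        int age;

        static PersonV2 fromFields(Map<String, Object> fields) {
            PersonV2 person = new PersonV2();
            person.name = (String) fields.get("name");
            Object age = fields.get("age");
            person.age = (age == null) ? 0 : (Integer) age;
            return person;
        }
    }
}
```

An older application can therefore update an object written by a newer one without dropping the fields it does not know about, and a newer application can read older data without special cases.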
Questions?
http://geode.incubator.apache.org
http://github.com/apache/incubator-geode
Bonus Content
Design
• Persistence
• Asynchronous Events
• Client Subscriptions
• PDX Serialization
Asynchronous Events – Design Goals
High availability
Low latency, high throughput
Deliver events to a receiver without impacting the write path
Serial Queues
[Diagram: a put of "LOL!!" on one of Members 1 to 3 is enqueued in the primary queue and replicated to the secondary queue on another member; both queues hold "sup" and "LOL!!".]
Serial Queues
[Diagram: the member holding the primary queue passes "LOL!!" and "sup" to its listener; the secondary queue on another member holds the same events.]
AsyncEventListener { processBatch() }
Dispatch events from primary
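The separation between the write path and the dispatch path can be sketched as follows. This is a single-threaded model under assumptions: SerialQueueSketch is hypothetical, the listener is a Predicate that returns true when a batch was handled, and a real queue would be dispatched by a background thread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical model of a serial event queue with a redundant copy. */
final class SerialQueueSketch {
    private final Deque<String> primary = new ArrayDeque<>();
    private final Deque<String> secondary = new ArrayDeque<>();
    private final int batchSize;

    SerialQueueSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    /** Write path: enqueue and replicate; the listener is not called here. */
    void enqueue(String event) {
        primary.addLast(event);
        secondary.addLast(event);
    }

    /**
     * Dispatch path: offers the next batch to the listener. Events leave both
     * queues only when the listener reports success, so a failed batch is retried.
     */
    boolean dispatch(Predicate<List<String>> listener) {
        List<String> batch = new ArrayList<>();
        for (String event : primary) {
            if (batch.size() == batchSize) break;
            batch.add(event);
        }
        if (batch.isEmpty() || !listener.test(batch)) {
            return false;
        }
        for (int i = 0; i < batch.size(); i++) {
            primary.removeFirst();
            secondary.removeFirst();
        }
        return true;
    }

    int pending() {
        return primary.size();
    }
}
```

Because enqueue() never waits for the listener, a slow or failing receiver delays delivery but not the put itself, which is the design goal stated for asynchronous events.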
Parallel Queues
[Diagram: a put is enqueued in the primary queue of its partition (Partition 1 or Partition 2) and copied to that partition's secondary queue; the queues hold "LOL!!", "sup", and "Word".]
Parallel Queues
[Diagram: two AsyncEventListener { processBatch() } instances, one per queue, each dispatching its own events.]
Parallel dispatch
Design
• Persistence
• Asynchronous Events
• Client Subscriptions
• PDX Serialization
“There are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one errors.”
– the internet
Client driver
Intelligent – understands data distribution for single hop network access
Caching – can be configured to locally cache data for even faster access
Events – registerInterest() in keys to receive push notifications
Client subscriptions
Useful for refreshing client cache, pushing events (“topics”) to multiple clients
Highly available & scalable via in-memory replicated queues
Events are ordered, at-least-once delivery
Durable subscriptions, conflation optional
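Conflation means that when several updates to one key are waiting in a subscription queue, only the latest is delivered. A minimal sketch, assuming a hypothetical ConflatingQueueSketch in which a conflated update keeps the queue position of the event it replaces; how Geode positions conflated events is not stated in the slides.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a conflating subscription queue. */
final class ConflatingQueueSketch {
    // LinkedHashMap keeps first-insertion order; re-putting a key keeps its slot.
    private final Map<String, String> pending = new LinkedHashMap<>();

    /** A newer update for an already-queued key replaces the older one. */
    void offer(String key, String value) {
        pending.put(key, value);
    }

    /** Delivers at most one event per key, then empties the queue. */
    List<String> drain() {
        List<String> events = new ArrayList<>();
        for (Map.Entry<String, String> entry : pending.entrySet()) {
            events.add(entry.getKey() + "=" + entry.getValue());
        }
        pending.clear();
        return events;
    }
}
```

Conflation suits clients that only need the current value, such as a cache refresh; a client that must see every intermediate change would leave it off.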
Client subscriptions
[Diagram: two clients connected to Members A and B. One client registers interest in the key "Fred"; when the other client puts a value for "Fred", the update is pushed to the subscribed client.]
region.registerInterest("Fred");
Person p1 = …
region.put("Fred", p1);