Virtualizing Latency Sensitive Workloads and vFabric GemFire
Post on 01-Nov-2014
© 2011 VMware Inc. All rights reserved
VMware vFabric™ GemFire®
Virtualizing Latency Sensitive Workloads and vFabric GemFire – PEX 2012
Emad Benjamin – Staff Architect
2
Agenda
• The Data Challenge and Latency Sensitive Workloads
• VMware vFabric Cloud Application Platform
• High Performance Data with vFabric GemFire
• Primary GemFire Topologies and Usage
• Design and Sizing
• Best Practices
• Customer Case Study
• Next Steps
3
The Data Challenge and Latency Sensitive Workloads
4
Data Challenges in Modern Application Architectures
Explosive data growth
• 60% year over year
Bridging data supply with data demand
• Indeterminate user load, 24x7 access, new device types driving increased application use
Business challenges
• How to outpace competitors by delivering superior service and experience
IT challenges
• Scalability
• Performance
• Data reliability
• Geographic distribution
5
Latency Sensitive
Do 10ms to 100ms matter?
• Then this is a latency sensitive application
• High chatter between VMs – many small data packets – many updates
6
VMware vFabric Cloud Application Platform
7
VMware Cloud Application Platform
(Platform diagram, top to bottom:)
• Programming Model: Rich Web, Social and Mobile, Data Access, Integration Patterns, Batch Framework
• Tools and deployment: WaveMaker, Spring Tool Suite, Cloud Foundry
• Management: App Monitoring (Spring Insight), Performance Mgmt (Hyperic), Automated App Provisioning (AppDirector)
• Runtime services: Java Optimizations (EM4J, …), Java Runtime (tc Server), Web Runtime (ERS), Messaging (RabbitMQ), Global Data (GemFire), In-mem SQL (SQLFire)
• Foundation: Virtual Datacenter Cloud Infrastructure and Management
8
High Performance Data with vFabric GemFire
9
Your Apps Are Cloud-Friendly… but What About Your Data?
“The big glaring hole [with cloud] is data handling.”
– Adrian Kunzle, MD, Head of Engineering & Architecture, JPMorgan Chase
File Systems Databases Other Systems
10
What’s the Problem?
How do you scale the data tier?
11
What is VMware vFabric GemFire?
Data moves to the middle tier
• Closer to where it is needed
Scalability
• Easily accommodate more application users
High performance
• Dramatic application performance gains – execute from memory
Data reliability
• Data written through or behind to disk
Geographic distribution
• WAN connectivity
12
vFabric GemFire in a Nutshell
File Systems, Databases, Other Data Systems
Conventional Data Storage Systems
vFabric GemFire Data Fabric
High Throughput, Low Latency, High Scalability, Continuous Availability
Reliable Event Notification, Continuous Querying, Parallel Execution
WAN Distribution, Data Durability
Enterprise Data Consuming Applications
13
Enabling Extreme Data Scalability and Elasticity
Application data lives in the fabric; it sleeps in file systems, databases, and mainframes/other systems.
Primary Use Cases
• Web session cache, L2 cache – shopping cart state management
• App data cache, in-memory DB – high performance OLTP
• Grid data fabric: client compute – shared data grid accessed by many clients executing app logic
• Grid data fabric: fabric compute – shared data grid where app logic is executed within the data fabric itself
14
GemFire Features
• Rich objects (Java, C++, .NET)
• Replicated master data
• Partitioned active data
• Co-located active data
• Ultra-fast co-located transactions
• Distributed transactions*
• Server-side event listeners
• Ultra-low latency RAM durability
• Client-side durable subscriptions
• Parallel MapReduce function execution
• Redundancy for instant FT (redundant copies)
• Continuous queries
• Parallel OQL queries
• LRU overflow to disk in native format for fast retrieval
• Parallel, shared nothing persistence to disk with online backup
• Synchronous or asynchronous write-through, read-through
• Uni- or bi-directional cluster synchronization over WAN
• Java O-R Mapper integration
• Elastic growth without pausing
* Available in v7.0
(Diagram: regions such as Customers, Orders, Products, and Promotions hold rich objects – e.g. a Customer with Address, Street, City, and Preferences – with OQL queries, updates, and requests flowing through the regions and their redundant copies.)
15
Primary GemFire Topologiesand Usage
16
Primary GemFire Topologies
Peer-to-peer
• Intercommunicating set of vFabric GemFire servers that do not have clients accessing them
• For example, back office or backend type of processing
(Diagram: Peer GemFire Server 1 and Peer GemFire Server 2 forming a Distributed System.)
17
Primary GemFire Topologies
Client/Server is the most common topology used in practice
(Diagram: a Client Tier of Standalone Client Caches 1–4 connecting to a Server Tier of GemFire Servers 1 and 2 in a Distributed System.)
18
Primary GemFire Topologies – Global Multisite
(Diagram: three sites – New York, Tokyo, and London – each a Distributed System of GemFire servers with a primary Gateway and a Standby Gateway; primary and standby gateway paths connect the sites over the WAN.)
19
Primary GemFire Usage – Hibernate Cache
20
Primary GemFire Usage – HTTP Session Management
21
Design and Sizing
22
Design and Sizing – Three Basic Steps
Step 1: Determine the vFabric GemFire server JVM heap size needed to house region data for both RR and PR regions
Step 2: Benchmark vertical scalability to determine the VM size for the GemFire server needed for the building-block VM
Step 3: Benchmark horizontal scalability to determine how many vFabric GemFire servers are needed in a cluster
23
Design and Sizing – Understanding JVM Memory Segments
(Diagram: VM Memory for GemFire = Guest OS Memory + JVM Memory for GemFire. The JVM memory comprises the max heap (-Xmx, with -Xms initial heap), Perm Gen (-XX:MaxPermSize), Java thread stacks (-Xss per thread), and "other mem" – direct and non-direct native memory in the virtual address space.)
24
Design and Sizing – Understanding JVM Memory Segments
Guest OS memory: approximately 0.5–1GB (depends on OS/other processes)
Perm size is an area additional to the -Xmx (max heap) value and is not GC-ed because it contains class-level information
"Other mem" is additional memory required for NIO buffers, JIT code cache, classloaders, socket buffers (receive/send), JNI, and GC internal info
VM Memory for GemFire = Guest OS Memory + JVM Memory for GemFire
JVM Memory for GemFire = JVM Max Heap (-Xmx value) + JVM Perm Size (-XX:MaxPermSize) + NumberOfConcurrentThreads * (-Xss) + "other mem"
25
Design and Sizing – Step 1: Calculating Region Data
Formula 1
NumberOfGemFireServers = NumberOfVMsInSystem = NumberOfJVMsInSystem = TotalMemoryPerGemFireSystemWithHeadRoom / 32GB
Formula 2
TotalMemoryPerGemFireSystemWithHeadRoom = TotalMemoryPerGemFireSystem * 1.5
Formula 3
TotalMemoryPerGemFireSystem = TotalOfAllMemoryForAllRegions + TotalOfAllMemoryForIndicesInAllRegions + TotalMemoryForSocketsAndThreads
26
Design and Sizing – Step 1: Calculating Region Data (cont.)
Formula 4
ApproxServerMachineRAM = TotalMemoryPerGemFireSystemWithHeadRoom * (DataLossTolerancePercentage / (NumberOfRedundantCopies + 1))
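Formulas 1–4 can be sketched as a small calculation. The input values below are illustrative assumptions, not figures from the deck:

```python
def gemfire_cluster_sizing(region_mem_gb, index_mem_gb, socket_thread_mem_gb,
                           redundant_copies, data_loss_tolerance_pct):
    """Sketch of sizing Formulas 1-4 (32GB per server VM, per Formula 1)."""
    # Formula 3: total memory for the GemFire system
    total_gb = region_mem_gb + index_mem_gb + socket_thread_mem_gb
    # Formula 2: add 50% headroom
    with_headroom_gb = total_gb * 1.5
    # Formula 1: number of GemFire servers = VMs = JVMs
    num_servers = with_headroom_gb / 32
    # Formula 4: approximate RAM per physical server machine
    approx_server_ram_gb = with_headroom_gb * (
        data_loss_tolerance_pct / (redundant_copies + 1))
    return with_headroom_gb, num_servers, approx_server_ram_gb

# Hypothetical system: 100GB of region data, 20GB of indices, 8GB for
# sockets/threads, one redundant copy, 50% data loss tolerance
print(gemfire_cluster_sizing(100, 20, 8, 1, 0.5))  # (192.0, 6.0, 48.0)
```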
27
Design and Sizing – Step 1: Calculating Region Data (cont.)
Worked example (per GemFire server VM):
• -Xmx 29696m, -Xms 29696m (initial heap = max heap)
• -XX:MaxPermSize 256m
• Java stacks: -Xss 192k per thread, ~100 threads
• Other mem = 1484m
• JVM Memory for GemFire = 31455m
• Guest OS memory: 500m used by the OS
• VM Memory for GemFire = 31955m
• Set the VM memory reservation to 31955m, or to the Active memory used by the VM, which could be lower
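The worked example's arithmetic can be reproduced directly from the sizing formula (values taken from the slide):

```python
# Reproduce the slide's worked example for one GemFire server VM (MB).
xmx_mb = 29696                  # -Xmx (and -Xms, since initial = max heap)
perm_mb = 256                   # -XX:MaxPermSize
stacks_mb = 192 * 100 / 1024    # -Xss 192k for ~100 threads
other_mb = 1484                 # NIO buffers, JIT code cache, etc.
guest_os_mb = 500               # memory used by the guest OS

jvm_mem_mb = xmx_mb + perm_mb + stacks_mb + other_mb
vm_mem_mb = jvm_mem_mb + guest_os_mb

print(round(jvm_mem_mb))  # 31455 -- "JVM Memory for GemFire"
print(round(vm_mem_mb))   # 31955 -- memory reservation for the VM
```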
28
What is the practical limit for JVM memory sizing? (not to scale)
• 64-bit Java theoretical limit: 16 exabytes
• Guest OS limit: 1 to 16TB
• ESXi 5 VM limit: 32 vCPU and 1TB RAM
• Physical server limit: ~256GB to <1TB RAM per NUMA node
• The most limiting practical sizing factor is the per-NUMA-node RAM
29
Design and Sizing – NUMA Considerations
NUMA Node Local Memory = Total RAM on Server / Number of NUMA nodes
Example 1
• Assume a 2-socket server with 8 cores (8 pCPU) and a total of 196GB RAM
• This server has 2 NUMA nodes
• Each NUMA node will have 196GB/2 => 98GB RAM
• Hence the largest virtual machine should not exceed 8 vCPU and 98GB RAM
Example 2
• 2 sockets, quad core on each socket (4 pCPU per socket), and a total of 64GB
• Each NUMA node would get 64/2 => 32GB
• Hence the largest GemFire virtual machine should be sized at 4 vCPU and 32GB RAM
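The two examples follow directly from the NUMA local-memory formula:

```python
def numa_node_local_mem_gb(total_ram_gb, numa_nodes):
    # NUMA Node Local Memory = Total RAM on server / number of NUMA nodes
    return total_ram_gb / numa_nodes

print(numa_node_local_mem_gb(196, 2))  # 98.0 -> Example 1: cap VMs at 98GB
print(numa_node_local_mem_gb(64, 2))   # 32.0 -> Example 2: cap VMs at 32GB
```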
(Diagrams: a 2-socket server with 128GB RAM – each NUMA node has 128/2 = 64GB. Panel 1: 2 vCPU VMs, each with less than 32GB RAM, placed by the ESX scheduler within NUMA boundaries. Panel 2: a 4 vCPU VM split by ESX into 2 NUMA clients on ESX 4.1. Panel 3: 4 vCPU VMs, each with less than 32GB RAM.)
33
Step 2 and Step 3: Establish Benchmark Vertical and Horizontal Scalability

ESTABLISH vFabric GemFire BUILDING BLOCK VM (vertical scalability test)
• Size within NUMA boundaries of the ESX host
• Establish the JVM heap size
• Size the building-block VM that houses the vFabric GemFire server

DETERMINE HOW MANY VMs (horizontal scalability / scale-out test)
• How many VMs do you need to meet your response time SLAs without reaching 70%–80% CPU saturation?
• Establish your horizontal scalability factor before bottlenecks appear in your application
• SLA OK? Test complete. If not, investigate the bottlenecked layer: network, storage, application configuration, and vSphere
• If horizontal scaling is bottlenecked, mitigate and iterate the scale-out test
• If it is a building-block app/VM configuration problem, adjust and iterate
34
Design and Sizing – Step 1: Calculating Region Data (cont.)
Formula 6 – for Global Multisite Topology
Maximum Throughput (bits/second) = TCP Window Size in Bits / Round Trip Latency in Seconds
Use WAN accelerators
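Formula 6 is the standard bandwidth-delay relation. A quick sketch, with an assumed 64KB TCP window and 100ms round trip (not figures from the deck), shows why WAN accelerators matter:

```python
def max_wan_throughput_bps(tcp_window_bytes, rtt_seconds):
    # Formula 6: maximum throughput = TCP window size in bits / RTT in seconds
    return tcp_window_bytes * 8 / rtt_seconds

# Assumed example: a 64KB TCP window over a 100ms round trip
print(max_wan_throughput_bps(64 * 1024, 0.100) / 1e6)  # ~5.24 Mbit/s
```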
35
Best Practices
36
vFabric GemFire on VMware – Best Practices
Best Practices paper here:
• http://www.vmware.com/resources/techresources/10231
vFabric GemFire on VMware
• Set an appropriate memory reservation
• Leave HT enabled; size based on vCPU = 1.25 pCPU if needed
• RHEL 6 and SLES 11 SP1 have a tickless kernel that does not rely on a high-frequency interrupt-based timer, and is therefore much friendlier to virtualized latency-sensitive workloads
• Do not overcommit memory
37
vFabric GemFire on VMware – Best Practices
vFabric GemFire on VMware
• Put vSphere Distributed Resource Scheduler (DRS) in manual mode
• Locator processes should not be migrated with VMware vSphere® vMotion®; doing so could otherwise lead to network split-brain problems
• Use vMotion over 10Gbps when doing scheduled maintenance
• Disable VMware HA
• Use affinity and anti-affinity rules to avoid placing redundant copies on the same VMware ESX®/ESXi host
38
vFabric GemFire on VMware – Best Practices
(Diagram: on a 2-socket NUMA server, the GemFire VM runs within its NUMA boundary, and the many enterprise app VMs consuming data from GemFire also run within NUMA boundaries.)
39
vFabric GemFire on VMware – Best Practices
vFabric GemFire on VMware
• Disable NIC interrupt coalescing on the physical and virtual NIC
• Extremely helpful in reducing latency for latency-sensitive virtual machines
• Disable virtual interrupt coalescing for VMXNET3
• This can lead to performance penalties for other virtual machines on the ESXi host, as well as higher CPU utilization to deal with the higher rate of interrupts from the physical NIC
• This implies it is best to use a dedicated ESX cluster for vFabric GemFire workloads
• All hosts are configured the same way for latency sensitivity, which ensures non-GemFire workloads are not negatively impacted
40
vFabric GemFire on VMware – Best Practices
vFabric GemFire on VMware – JVM tuning
• Size with 50% headroom
• Use -XX:+UseCompressedOops
• Use JDK 1.6.0_24 or later
• Set -Xms = -Xmx
• Use the -XX:+UseConcMarkSweepGC low-pause collector with the parallel young-generation collector
41
vFabric GemFire on VMware – Best Practices
vFabric GemFire on VMware – JVM tuning
• -XX:+DisableExplicitGC
• -XX:CMSInitiatingOccupancyFraction=<50-75>
• Set -Xmn to roughly 33% of -Xmx, and ideally no more than 2GB
42
vFabric GemFire on VMware – Best Practices
vFabric GemFire on VMware – General
• All peer-to-peer members of the distributed system must have the same version of vFabric GemFire
• Clients can be up to one major release behind. For example, any 6.x client interoperates with any 6.x or 7.x server, but not with an 8.x server
• Set cache-server max-connections and max-threads
• Use the GFMon and VSD tools for monitoring
• When troubleshooting performance problems, check that you are not impacted by SYN cookies
• SYN cookies are the key element of a technique used to guard against SYN flood attacks. Daniel J. Bernstein, the technique's primary inventor, defines SYN cookies as "particular choices of initial TCP sequence numbers by TCP servers"
43
Customer Case Study
44
Airline Industry
Client/Server topology
Re-architecture of their main Web store
• To speed up the search and checkout/book process
In 2010
• 80+ million passengers carried
• 12B in revenue
(Diagram: clients connecting to a tier of Next Gen Session Servers.)
Sizing per data center:
• Number of servers per data center: 4
• Number of JVMs per server: 1
• Heap size per JVM (-Xms34G and -Xmx34G): 34GB
• Available heap memory per JVM: 34GB
• Available RAM per JVM (includes 50% ratio for churn): 17GB
• Total RAM needed per data center: 136GB
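The case-study numbers are internally consistent; a quick cross-check using the figures from the table:

```python
# Cross-check the sizing table from the airline case study.
servers_per_dc = 4
jvms_per_server = 1
heap_per_jvm_gb = 34      # -Xms34G and -Xmx34G
churn_ratio = 0.5         # 50% headroom reserved for churn

usable_ram_per_jvm_gb = heap_per_jvm_gb * churn_ratio
total_ram_per_dc_gb = servers_per_dc * jvms_per_server * heap_per_jvm_gb

print(usable_ram_per_jvm_gb)  # 17.0
print(total_ram_per_dc_gb)    # 136
```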
45
Getting Started – vmware.com/go/gemfire
46
Thank you! Any Questions?
You can buy my book here: https://www.createspace.com/3632131
47
Backup Slides
48
Consistency Model
• Synchronous consistency within the fabric
• Eventual consistency with the archival database (archival, OLAP, and regulatory RDBMS, reached through a database node and storage device)
• Eventual consistency with other fabric instances
(Diagram: two rows of Data Fabric Nodes in front of the archival RDBMS.)
49
Memory-Based Performance
High Performance
vFabric GemFire uses memory on peer machines to make data updates durable, allowing the updating thread to return 10x to 100x faster than with updates written through to disk, without risking any data loss. Typical latencies are a few hundred microseconds instead of tens to hundreds of milliseconds.
vFabric GemFire can optionally write updates to disk, or to a data warehouse, asynchronously and reliably
50
Cloud Ready
Add or remove data servers dynamically
Elastic
Fabric is elastic so it can grow or shrink dynamically with no interruption of service or data loss
51
Distributed Events
• Targeted, guaranteed delivery, event notification, and continuous queries
Active
52
Partitioning and Co-Location Example
Replicated regions model many-to-many relationships (e.g., Counterparty Descriptions, Settlement Instructions, and Netting Agreements replicated across Data Fabric Nodes)
• Many-to-many, many-to-one, and one-to-many relationships can be modeled
• Co-location of related data eliminates distributed transactions
• All entities within the transaction are located on a single machine
• Targeted procedures have all the data entities they need locally
53
Partitioning and Co-Location Example
Partitioned regions model one-to-many and many-to-one relationships (e.g., Position Data, Trade Data, Market Data, Instrument Data, and Rating Information partitioned across Data Fabric Nodes)
• Many-to-many, many-to-one, and one-to-many relationships can be modeled
• Co-location of related data eliminates distributed transactions
• All entities within the transaction are located on a single machine
• Targeted procedures have all the data entities they need locally
54
Parallel Queries
Batch Controller or Client
Scatter-Gather (Map-Reduce) Queries and Functions
Parallel
55
Fault Tolerant, Data-Aware Function Routing
Targeted
vFabric GemFire provides “data aware function routing”—moving the behavior to the correct data instead of moving the data to the behavior
Batch Controller or Client
Data Aware Function
56
Multisite Capability
Data replication for disaster recovery is done with the fault-tolerant, bi-directional shared-nothing, store-and-forward gateways
Active Everywhere
57
Data Distribution
Distribute
vFabric GemFire can keep clusters that are distributed around the world “eventually consistent” in near real-time and can operate reliably in disconnected, intermittent, and low-bandwidth network environments
58
Design and Sizing – Step 1: Calculating Region Data (cont.)
Formula 5
TotalMemoryForSocketsAndThreads = TotalMemoryForSockets + TotalMemoryForThreadOverhead
TotalMemoryForThreadOverhead = MaxClientThreads * ThreadStackSize
TotalMemoryForSockets = TotalNumberOfSockets * SocketBufferSizeBytes
TotalNumberOfSockets = NumberOfServers * NumberOfThreadsOnServer + AppThreads + MaxClientThreads + MaxClientThreads * 2 * NumberOfServers * IfHostPartitionedRegionAndConserveSocketsIsFalse
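Formula 5 can be sketched as below; every example input is an assumption for illustration, not a figure from the deck:

```python
def sockets_and_threads_mem_bytes(num_servers, threads_per_server, app_threads,
                                  max_client_threads, thread_stack_bytes,
                                  socket_buffer_bytes,
                                  partitioned_and_conserve_sockets_false):
    """Sketch of Formula 5: memory for sockets and thread overhead."""
    thread_overhead = max_client_threads * thread_stack_bytes
    total_sockets = (num_servers * threads_per_server
                     + app_threads
                     + max_client_threads
                     + max_client_threads * 2 * num_servers
                       * (1 if partitioned_and_conserve_sockets_false else 0))
    return total_sockets * socket_buffer_bytes + thread_overhead

# Hypothetical: 2 servers x 10 threads, 5 app threads, 100 client threads,
# 192KB stacks, 32KB socket buffers, partitioned region, conserve-sockets=false
print(sockets_and_threads_mem_bytes(2, 10, 5, 100, 192 * 1024, 32 * 1024, True))
```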
(Diagram: on a 2-socket server with 128GB RAM – each NUMA node has 128/2 = 64GB – a 12 vCPU VM can be scheduled across NUMA nodes through 12 vCPU vSocket/vNUMA in ESX 5.)
60
Primary GemFire Usage – Hibernate Cache
Hibernate configuration (hibernate.cfg.xml)
• Enable the second-level cache:
<property name="hibernate.cache.use_second_level_cache">true</property>
• Set region.factory_class to GemFireRegionFactory (hibernate.cfg.xml, version 3.3+):
<property name="hibernate.cache.region.factory_class">
com.gemstone.gemfire.modules.hibernate.GemFireRegionFactory
</property>
61
Enabling Extreme Data Scalability and Elasticity
Application Data Lives Here
File Systems Databases Mainframes/other
Application Data Sleeps Here
Key Capabilities
Low-latency, linearly-scalable, memory-based data fabric
• Data distribution, replication, partitioning, and co-location
• Pools memory and disk across many nodes
Data-aware execution
• Move functionality to the data for peak performance
Active/continuous querying and event notification
• Changes are propagated to one or more "active" copies
62
GemFire in Mission Critical Wall Street Applications
Reference data (top 3 US-based bank)
• Large amounts of in-memory data, mostly static but with some intraday updates
• 5x–10x performance increase
• Global distribution – consistent global views
• Domain-specific and regional edge caches
Market data (top 3 Japan-based financial firm)
• Ultra-low latency for value-added "derived" market data
• Fault-tolerant store-and-forward global data distribution
• Global consistency
Risk processing system (top 3 US-based bank)
• Credit risk, market risk, trader risk
• Over 1TB of credit risk data processing
• Processing moving from batch toward real time
• Consistent snapshot of data across long-running calculation/analysis