scaling and managing big data apps in the cloud
DESCRIPTION
Title: Scaling and Managing Big Data Apps on Public Clouds
Abstract: The massive computing and storage resources needed to support big data applications make on-demand, elastic cloud environments an ideal fit. However, managing a big data app on the cloud is no walk in the park: configuration, orchestration, high availability, and auto-scaling are all complex, and so is choosing the right cloud for you, whether it’s public, private, or hybrid. This is where Cloudify and Eucalyptus come together. In this session, you'll learn how to deploy, manage, monitor, and scale your big data apps on the open source Eucalyptus cloud platform using Cloudify, as well as how to test-drive your apps locally and then migrate the workload to Amazon Web Services EC2.
TRANSCRIPT
Big Data in the Cloud
@natishalom
About GigaSpaces
Managing Big Data on the Cloud
100s of Enterprise Customers
My Data Out of My Hands..
No Way!
The Reality of Big Data..
2.7 ZB: Global Digital Data
0.5 Petabytes: Two years of tweets
66%: Plan to use Big Data/Cloud
43% think that data analytics could be improved in their organization if it were part of cloud services
Large ISV Case Study
• Application: call center surveillance
• Background: previously, voice data
• Goal for a new system:
  – Monitor data & voice
  – Multiple data sources
  – Advanced correlations
The Challenges...
Ever Growing Data
Deeper Correlation
Tight Performance
A Classic Case for...
A Typical Big Data System…
The Challenge
Cost Business Impact
Lower Margins
Competitiveness
Time to Market
Customer Satisfaction
Infrastructure
Operational
The Solution: Big Data in the Cloud
Big Data in the Cloud - 3 Reasons
• Skills: Do you really need/want this all in-house?
• Huge amounts of external data: Does it make sense to move and manage all this data behind your firewall?
• Focus on the value of your data: Instead of big data management
Holger Kisker
Managing Big Data on the Cloud
• Auto start VMs
• Install and configure app components
• Monitor
• Repair
• (Auto) Scale
• Burst…
Big Data in the Cloud...
Reduce the Infrastructure Cost
Choose the Right Cloud for the Job
Use Eucalyptus for private data, AWS for sporadic workloads.
Big Data in the Cloud...
Reducing the Operational Complexity
• Consistent Management
• Automation Through the Entire Stack
Let’s Take a Closer Look…
Consistent Management
Portability
Automation
Consistent Management
Recipes: a consistent description for running any app:
• What middleware services to run
• Dependencies between services
• How to install services
• Where application and service binaries are
• When to spawn or terminate instances
• How to monitor each of the services
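As a rough illustration of what a recipe captures, the sketch below models the listed items as plain data and derives a start order from the dependencies. It is not Cloudify's recipe format or API: the `ServiceRecipe` class, its field names, the script names, and the `example.invalid` URL are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class ServiceRecipe:
    """Hypothetical stand-in for one service in a recipe (not Cloudify's API)."""
    name: str                    # what middleware service to run
    install_script: str          # how to install the service
    binaries_url: str            # where the service binaries are
    depends_on: list = field(default_factory=list)  # dependencies between services
    min_instances: int = 1       # lower bound when terminating instances
    max_instances: int = 1       # upper bound when spawning instances
    monitor_metric: str = "cpu"  # what to monitor for this service


def start_order(recipes):
    """Return service names so that each service starts after its dependencies."""
    graph = {r.name: set(r.depends_on) for r in recipes}
    return list(TopologicalSorter(graph).static_order())


# Illustrative Storm cluster; all values below are assumptions.
recipes = [
    ServiceRecipe(name="storm-supervisor",
                  install_script="install_supervisor.sh",
                  binaries_url="https://example.invalid/storm.tgz",
                  depends_on=["storm-nimbus"], max_instances=4),
    ServiceRecipe(name="storm-nimbus",
                  install_script="install_nimbus.sh",
                  binaries_url="https://example.invalid/storm.tgz",
                  depends_on=["zookeeper"]),
    ServiceRecipe(name="zookeeper",
                  install_script="install_zookeeper.sh",
                  binaries_url="https://example.invalid/zookeeper.tgz"),
]

print(start_order(recipes))
```

Declaring dependencies as data, rather than hard-coding a startup script, is what lets a tool compute the order itself.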
The Right Cloud for the Job (Cloud Portability)
Choosing the Right Cloud for the Job
// Service recipe: refers to a machine template by name only
compute {
    template "SMALL_LINUX"
}

// Cloud configuration for AWS EC2: defines what SMALL_LINUX means on this cloud
SMALL_LINUX : template{
    imageId "us-east-1/ami-76f0061f"
    remoteDirectory "/home/ec2-user/gs-files"
    machineMemoryMB 1600
    hardwareId "m1.small"
    locationId "us-east-1"
    localDirectory "upload"
    keyFile "myKeyFile.pem"
    options ([
        "securityGroups" : ["default"] as String[],
        "keyPair" : "myKeyFile"
    ])
    overrides ([
        "jclouds.ec2.ami-query" : "",
        "jclouds.ec2.cc-ami-query" : ""
    ])
    privileged true
}
// Cloud configuration for Eucalyptus: the same template name, different settings
SMALL_LINUX : template{
    imageId linuxImageId
    remoteDirectory "/home/user/gs-files"
    machineMemoryMB 1600
    hardwareId "m1.medium"
    locationId "us-west-1"
    localDirectory "upload"
    keyFile "myEucaKeyFile.pem"
    username "user"
    options ([
        "securityGroups" : ["default"] as String[],
        "keyPair" : keyPair
    ])
    overrides (["endpoint" : "http://communitycloud.eucalyptus.com"])
    privileged true
}
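The portability idea in the two SMALL_LINUX definitions is that a service asks for a template by name and each cloud supplies its own meaning for that name. A minimal sketch of that lookup follows, assuming a simple in-memory table; `CLOUD_TEMPLATES`, `resolve_template`, and the cloud keys `aws-ec2` and `eucalyptus` are hypothetical names, not part of Cloudify. The hardware, location, and endpoint values are copied from the slides; the Eucalyptus image ID is left unset because the slide only refers to it through the `linuxImageId` variable.

```python
# Hypothetical per-cloud template table (illustration only, not Cloudify's API).
CLOUD_TEMPLATES = {
    "aws-ec2": {
        "SMALL_LINUX": {
            "imageId": "us-east-1/ami-76f0061f",
            "hardwareId": "m1.small",
            "locationId": "us-east-1",
        },
    },
    "eucalyptus": {
        "SMALL_LINUX": {
            "imageId": None,  # the slide uses the variable linuxImageId; value not given
            "hardwareId": "m1.medium",
            "locationId": "us-west-1",
            "endpoint": "http://communitycloud.eucalyptus.com",
        },
    },
}


def resolve_template(cloud, template_name):
    """Look up the cloud-specific settings behind a logical template name."""
    try:
        return CLOUD_TEMPLATES[cloud][template_name]
    except KeyError:
        raise LookupError(
            f"cloud {cloud!r} does not define template {template_name!r}"
        ) from None


# The service definition stays the same no matter which cloud it targets.
service = {"name": "storm-nimbus", "template": "SMALL_LINUX"}

for cloud in ("aws-ec2", "eucalyptus"):
    settings = resolve_template(cloud, service["template"])
    print(cloud, settings["hardwareId"], settings["locationId"])
```

Because the service only carries the logical name, moving a workload between clouds changes the cloud-side table and leaves the service description untouched.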
Automation across the stack
1 Upload your recipe.
2 Cloudify creates VMs & installs agents
3 Agents install and manage your app
4 Cloudify automates the scaling
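The slides do not say how the scaling decision in step 4 is made. One common approach is a threshold rule, sketched below under that assumption; the function name `desired_instances`, its parameters, and the numbers in the usage lines are hypothetical and should not be read as Cloudify's actual algorithm.

```python
def desired_instances(current, metric, high, low, min_instances, max_instances):
    """Generic threshold rule: add one instance above `high`, drop one below `low`.

    The result is always clamped to the [min_instances, max_instances] range.
    """
    if metric > high:
        target = current + 1
    elif metric < low:
        target = current - 1
    else:
        target = current
    return max(min_instances, min(max_instances, target))


# Illustrative values: scale between 1 and 4 instances on average CPU percentage.
print(desired_instances(current=2, metric=85.0, high=80.0, low=20.0,
                        min_instances=1, max_instances=4))  # 3: above the high mark
print(desired_instances(current=4, metric=95.0, high=80.0, low=20.0,
                        min_instances=1, max_instances=4))  # 4: already at the cap
print(desired_instances(current=1, metric=5.0, high=80.0, low=20.0,
                        min_instances=1, max_instances=4))  # 1: already at the floor
```

Keeping a gap between the low and high thresholds helps avoid flapping, where instances are added and removed in quick succession.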
Big Data Apps, on Any Cloud, Your Way
Open Source (Apache2)
Big Data On Demand with Cloudify
Relational DB Clusters: MySQL, PostgreSQL
NoSQL Clusters: MongoDB, Cassandra, Couchbase, ElasticSearch
Hadoop: Hadoop (Hive, Pig, ..), Storm, ZooKeeper
® Copyright 2011 Gigaspaces Ltd. All Rights Reserved
Demo Time: Storm Cluster
Large ISV Case Study
• Application: call center surveillance system
• Background: previously, voice data
• Goal for a new system:
  – Monitor data & voice
  – Multiple data sources
  – Advanced correlations
Mission Accomplished
Additional Benefits
• True Cloud Economics
• One product -> Any Customer Environment
• Increased Agility
Thank You!
References: http://www.cloudifysource.org http://github.com/CloudifySource