[RightScale Webinar] Architecting Databases in the Cloud: How RightScale Does It
Post on 14-Jun-2015
Description: Your database is the foundation of your application. The cloud brings new advantages and new considerations for architecture and deployment. Find out how RightScale uses SQL and NoSQL databases such as MySQL, MongoDB, and Cassandra to provide a scalable, distributed, and highly available service around the globe.
ARCHITECTING DATABASES FOR SCALABILITY & AVAILABILITY IN THE CLOUD: HOW RIGHTSCALE DOES IT

Your Panel Today
- Josep Blanquer, Chief Architect, RightScale
- Raphael Simon, Senior Systems Architect, RightScale
- Ali Khajeh-Hosseini, Director of Development, RightScale
Q&A
- Ben Ingalls, Sales Development Representative, RightScale
Please use the Questions window to ask questions at any time.

Agenda
- Main Technologies Used
- Data Storage and Design for:
  - Cloud Management
  - Self Service
  - Cloud Analytics
- Conclusions
- Q&A

Intro: Tools and Technologies
RightScale uses a mix of RDBMS and NoSQL technologies: MySQL, Cassandra, MongoDB, Redshift, and S3.
The choice of each is commonly driven by features such as:
- Transactionality
- Availability
- Sharding
- Queryability
- Raw performance
- Etc.

Strong Points: MySQL
- Strong ACID properties
- Availability through async replication (for HA and DR)
- Read scalability through multiple slaves
- Powerful SQL queryability
Examples of data from our Cloud Management product:
- Users, Plans, Settings
- Published marketplace assets
- Local assets like:
  - ServerTemplates, Scripts
  - Deployments and server configurations
  - Alert definitions

Strong Points: Cassandra
- High-availability properties
- Distributed, master-less
- Easy to horizontally scale (automatic data sharding and rebalancing)
- Tunable replication (including multi-DC)
- Tunable consistency
- TTL (Time To Live) in data elements
Examples from our Cloud Management product:
- Events
- Audits
- Across-cloud message routing
- Session data
- Tags

Strong Points: S3
- Mostly offline data retrieval
- Large scale and availability
- Large amounts of data
- When no querying is necessary
Examples from our Cloud Management product:
- Archived audits (encrypted)
- Scraped git repositories
- Archived monitoring data

Strong Points: MongoDB
- Document oriented storage
- Built-in replication support
- Built-in sharding support
- Test and set query
Examples from our Self Service product:
- Cloud Application Templates (CATs)
- Catalog Applications
- Running Applications
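
The "tunable consistency" point on the Cassandra slide can be made concrete with a little arithmetic. The sketch below is plain Python, not a Cassandra client. It assumes a replication factor of 3 (the slides do not state one) and uses the consistency level names ONE, QUORUM, and ALL; the function names are illustrative only. A read is guaranteed to overlap the latest acknowledged write when R + W > N, where N is the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(N / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported level: {level}")


def read_sees_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    n = 3  # assumed replication factor, for illustration only
    for w_level, r_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ALL", "ONE")]:
        print(w_level, r_level, read_sees_latest_write(w_level, r_level, n))
```

Lower levels trade that guarantee for latency and availability, which is roughly the knob the slide refers to.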
Strong Points: Redshift
- Simple to get started and manage
- Scales to handle up to a petabyte of data
- Powerful SQL queryability: we can explore the data easily
Examples from our Cloud Analytics product:
- Storing years of usage, cost, and pricing data, e.g.:
  - Instance-id-1 with x, y, z params, launched on T1 and terminated at T2
  - Price of instance-type-X with x, y, z params at T1 was $0.01

Storage Architecture and Deployment
Let's take a peek at:
- How the data storage architecture is designed
- How some of these technologies are deployed
With examples in each of our three main products:
- Cloud Management
- Self Service
- Cloud Analytics

RightScale Cloud Management: Streamline Operations
- Unify management of compute, storage, and network
- Design portable, multi-cloud service configurations
- Orchestrate large globally distributed systems
- Control access across clouds, data centers, and tenants

Data Accessibility and Scope
- Data required by: Users, Instances
- Scope: for a single account, or global to all accounts

[Diagram: data required by Users and by Instances, scoped as Account or X-Account]

[Diagram: Users and Instances against Account and X-Account, with a "global" store]
Global: MySQL
- ACID semantics
- Master-Slave replication
Custom replication. Why custom?
- More control
- Multiple sources
- Individual columns
- Apply transformations
- Smart re-sync features

[Diagram: global, dash, S3, events, tags, and audit stores]
Dashboard: MySQL
- ACID semantics
- Master-Slave (N) replication
- Slave reads
- Rows tagged by account
Other systems: Cassandra
- Simpler Key-Value access
- Great scalability
- Great replica control
- High write availability
- Time-to-live expiration as cache
- Rows tagged by account
Data archive: S3
- Low read rate
- Globally accessible

[Diagram: global, dash, S3, events, tags, and audit stores, with dash, events, tags, and audit repeated]
So we can horizontally scale our dashboard by partitioning objects based on account groups: Clusters
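
The "Time-to-live expiration as cache" and "Rows tagged by account" points from the Cassandra slide above can be illustrated with a small in-memory sketch. This is not Cassandra code: Cassandra expires data server-side, whereas here a hypothetical `TtlCache` class does it in plain Python with an injectable clock so the behavior is easy to test. All names are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlCache:
    """In-memory key-value store whose rows expire after a time-to-live."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # (account_id, key) -> (value, expiry timestamp); rows are tagged by account
        self._rows: Dict[Tuple[str, str], Tuple[Any, float]] = {}

    def put(self, account_id: str, key: str, value: Any, ttl_seconds: float) -> None:
        """Store a value that stops being readable after ttl_seconds."""
        self._rows[(account_id, key)] = (value, self._clock() + ttl_seconds)

    def get(self, account_id: str, key: str) -> Optional[Any]:
        """Return the value, or None if it is missing or has expired."""
        row = self._rows.get((account_id, key))
        if row is None:
            return None
        value, expires_at = row
        if self._clock() >= expires_at:
            # Lazily drop the expired row on access.
            del self._rows[(account_id, key)]
            return None
        return value


if __name__ == "__main__":
    cache = TtlCache()
    cache.put("account-1", "session", {"user": "alice"}, ttl_seconds=30)
    print(cache.get("account-1", "session"))
```

Because expired rows simply disappear, the store can act as a cache without a separate cleanup job, which is presumably why TTL is listed as a strong point.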
[Diagram: RightScale accounts divided into account sets (Account Set 1, Account Set 2), served by clusters (Cluster 1, Cluster 3, ... Cluster N), each with its own dash, S3, events, tags, and audit stores]
Features:
- 1 cluster: N accounts
- 1 account: 1 home
- Migratable accounts
Benefits:
- Great horizontal growth
- Better failure isolation
- Independent scale
- Load rebalancing
- Versionable code
- Differentiated service

[Diagram: the previous layout plus routing, polling, and monitor services]

[Diagram: the previous layout with routing, polling, and monitor repeated]
And partition our cloud objects based on the cloud the instances of an account run on: Islands

[Diagram: Island 1, Island 2, ... Island N, one per cloud (Cloud 1, Cloud 2, ... Cloud N), each running routing, polling, and monitor services co-located with resources]
Polling Clouds: MySQL
- Master-Slave replication
- Can port to NoSQL easily
  - Mostly a resource cache
  - But cloud partitionable
Monitoring: Custom
- Replicated files
- Backup to S3
- Archive to S3
Routing: Cassandra
- Simpler Key-Value access
- Very high availability
- Great scalability
- Great replica control
- Plus cross DC replication*

[Diagram: Cluster 1, Cluster 3, ... Cluster N and Island 1, Island 2, ... Island N, spread across different geographies and different clouds]
What if the cloud where the cluster is deployed fails?
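
The two partitioning schemes above (accounts homed on clusters, cloud objects handled by islands) can be sketched as two lookups. This is a minimal illustration under stated assumptions, not RightScale's implementation: the `AccountDirectory` class, the `ISLAND_BY_CLOUD` table, and all cluster, island, and cloud names are hypothetical.

```python
from typing import Dict, List


class AccountDirectory:
    """Tracks the single home cluster of every account (1 account: 1 home)."""

    def __init__(self) -> None:
        self._home: Dict[int, str] = {}

    def assign(self, account_id: int, cluster: str) -> None:
        """Give an account its home cluster."""
        self._home[account_id] = cluster

    def home_of(self, account_id: int) -> str:
        """Cluster that serves dashboard requests for this account."""
        return self._home[account_id]

    def migrate(self, account_id: int, new_cluster: str) -> None:
        """Re-home an existing account, e.g. to rebalance load."""
        if account_id not in self._home:
            raise KeyError(f"unknown account: {account_id}")
        self._home[account_id] = new_cluster

    def accounts_in(self, cluster: str) -> List[int]:
        """All accounts homed on a cluster (1 cluster: N accounts)."""
        return sorted(a for a, c in self._home.items() if c == cluster)


# Hypothetical mapping: each cloud is handled by the island co-located with it.
ISLAND_BY_CLOUD: Dict[str, str] = {"cloud-1": "island-1", "cloud-2": "island-2"}


def island_for(cloud: str) -> str:
    """Island whose routing, polling, and monitor services cover this cloud."""
    return ISLAND_BY_CLOUD[cloud]


if __name__ == "__main__":
    directory = AccountDirectory()
    directory.assign(42, "cluster-1")
    directory.migrate(42, "cluster-3")
    print(directory.home_of(42), island_for("cloud-2"))
```

Keeping the account-to-cluster mapping in one place is what makes accounts migratable: moving an account changes one entry rather than any routing logic.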
Sister Clusters
[Diagram: the clusters and islands as before, with sister clusters holding a full replica]
Features:
- Each master has an extra remote slave
- Each cluster in a pair is a DC replica of the other's local ring
At Disaster Recovery time:
- Apps are told to start serving an extra shard
- No need to provision more infrastructure to recover (try to avoid, since everybody is in the same boat)
- New resources can be allocated over time to help offload existing ones

RightScale Self-Service: Increase Innovation
- Reduce development cycles and increase agility
- Eliminate manual work with automation and orchestration
- Drive down spend with built-in cost controls
- Reduce risks with policy-based governance

Why MongoDB?
- Self-Service deals with documents (CATs)
- AngularJS application built on top of a REST API
- JSON compatibility
- High availability and good scalability with test and set building block query
- No built-in join, but not needed
  - Use case allows for heavy use of denormalization
  - praxis-mapper for efficient client side joins

Self-Service HA (today)
- 3-node MongoDB replica set per shard
- Each replica in its own AZ
- Security groups for access control
- Write concern of 2
- Apps read from master (need consistency)
- BI, internal tools read from slaves

Self-Service DR (EOY)
- Hidden replica in a different region (the application does not send requests to hidden replicas)
- Deployments in VPC
- VPN between regions

RightScale Cloud Analytics: Optimize Cloud Spend
- Visualize all your cloud costs
- Forecast, budget, and optimize cloud costs
- Optimize your spend and reduce waste
- Implement chargeback and showback with automated reports
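
The sister-cluster recovery step described above ("apps are told to start serving an extra shard") can be sketched as simple bookkeeping. This is an illustrative model only: the `SisterClusters` class and the cluster names are hypothetical, and it assumes each cluster normally serves one shard named after itself and already holds a replica of its sister's data.

```python
from typing import Dict, List, Set


class SisterClusters:
    """Tracks which shards each cluster serves, with paired sister clusters."""

    def __init__(self, sister_of: Dict[str, str]) -> None:
        # Pairing must be symmetric: if A's sister is B, B's sister is A.
        for cluster, sister in sister_of.items():
            if sister == cluster or sister_of.get(sister) != cluster:
                raise ValueError(f"{cluster} and {sister} are not a valid pair")
        self._sister_of = dict(sister_of)
        # Normally every cluster serves only its own shard.
        self._serving: Dict[str, Set[str]] = {c: {c} for c in sister_of}
        self._failed: Set[str] = set()

    def fail_over(self, failed: str) -> str:
        """Mark a cluster as failed; its sister starts serving the extra shard."""
        sister = self._sister_of[failed]
        if sister in self._failed:
            raise RuntimeError(f"both {failed} and {sister} are down")
        self._failed.add(failed)
        # No new infrastructure: the sister serves from its existing replica.
        self._serving[sister] |= self._serving[failed]
        self._serving[failed] = set()
        return sister

    def serving(self, cluster: str) -> List[str]:
        """Shards currently served by a cluster."""
        return sorted(self._serving[cluster])


if __name__ == "__main__":
    dr = SisterClusters({"cluster-1": "cluster-2", "cluster-2": "cluster-1"})
    dr.fail_over("cluster-1")
    print(dr.serving("cluster-2"))
```

The point of the design is that recovery is a configuration change on capacity that already exists; extra resources can be added later to offload the surviving cluster.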
Cloud Analytics and Redshift
[Diagram: data flow]
- Data sources -> data fetching jobs -> CSV files on S3
- Data load jobs write to all clusters (Redshift cluster 1, Redshift cluster 2, ... Redshift cluster N)
- Servers that read and process data randomly pick one cluster and read from it

Redshift HA
- Each Redshift cluster is deployed in one availability zone. What if that AZ has issues, or if the cluster goes offline?
- Our architecture makes it easy to have replicas, as there is a single data stream of changes, which can be written to all clusters
- Sacrificed consistency across clusters for increased availability and scalability
- If one AZ has issues:
  - Writes to clusters get delayed until the AZ is online or we take the affected cluster offline
  - Reads from clusters continue to work, as servers can connect to another cluster
- We run a "create replica" rake task that stops all the writes, takes a snapshot from a working cluster, and creates a new cluster in a different AZ

Redshift DR
- Redshift supports copying a snapshot to a different region
- A new cluster can be created from a snapshot
- Cluster configs are not stored in the snapshot and need to be configured
- EC2 instances connect to Redshift using security groups, but the instances and the cluster must be in the same region for the security groups to work
- We use Cloud Management's monitoring system to monitor health and other metrics of clusters, and alert on them

Conclusions
We have shown how RightScale uses several database technologies:
- For well-known relational data: MySQL (with high replication)
- For archiving and blob storage we use S3
- For very high availability and geo-replication we use Cassandra
- For TTL support and fast writes we also use Cassandra
- For JSON documents we use Mongo (with sharding and replica sets)
- For large data analytics we use AWS Redshift
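
The Redshift pattern described above (write every change to all clusters, read from one cluster picked at random, keep reading when one cluster is unavailable) can be sketched in memory. This is not Redshift code. The `ReplicatedWarehouse` class, the cluster names, and the per-cluster queue that models "delayed" writes are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional, Set


class ReplicatedWarehouse:
    """Writes go to all clusters; reads go to one randomly chosen online cluster."""

    def __init__(self, cluster_names: List[str],
                 rng: Optional[random.Random] = None) -> None:
        self._rows: Dict[str, List[str]] = {name: [] for name in cluster_names}
        # Writes that could not be applied while a cluster was unreachable.
        self._pending: Dict[str, List[str]] = {name: [] for name in cluster_names}
        self._online: Set[str] = set(cluster_names)
        self._rng = rng or random.Random()

    def set_online(self, name: str, online: bool) -> None:
        """Mark a cluster reachable or not; replay delayed writes on return."""
        if online:
            self._online.add(name)
            self._rows[name].extend(self._pending[name])
            self._pending[name].clear()
        else:
            self._online.discard(name)

    def write(self, row: str) -> None:
        """Apply the change to every cluster, delaying it for offline ones."""
        for name in self._rows:
            if name in self._online:
                self._rows[name].append(row)
            else:
                self._pending[name].append(row)

    def read(self) -> List[str]:
        """Read all rows from one randomly picked online cluster."""
        candidates = sorted(self._online)
        if not candidates:
            raise RuntimeError("no cluster is online")
        return list(self._rows[self._rng.choice(candidates)])


if __name__ == "__main__":
    warehouse = ReplicatedWarehouse(["redshift-1", "redshift-2"])
    warehouse.write("usage-row-1")
    warehouse.set_online("redshift-1", False)
    warehouse.write("usage-row-2")
    print(warehouse.read())
```

Between a write and its delayed replay, the clusters can disagree, which is the consistency that the slides say was traded for availability and scalability.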
Thank You and Q&A
- Start a Free Trial of RightScale Today: https://www.rightscale.com/free-trial
- Get the White Paper "Designing Private & Hybrid Clouds": http://www.rightscale.com/lp/designing-private-hybrid-clouds-white-paper

THANK YOU.